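Long Audio Transcription Service

Introduction to Long Audio Transcription

Overview

Welcome to the Youdao Zhiyun (有道智云) long audio transcription API service.

This document is intended for developers who need to integrate the HTTP API. It describes the technical details of the long audio transcription capability.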

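For business cooperation inquiries, contact us at:

Business email: AIcloud_Business@corp.youdao.com

If you have any questions about this document, you can reach us through the following channels:

Customer service QQ: 1906538062

Zhiyun Translation technical exchange QQ group 1: 652880659

Zhiyun Translation technical exchange QQ group 2: 669384425

Zhiyun Translation technical exchange QQ group 3: 807539209

Zhiyun Translation technical exchange QQ group 4: 936752411

Contact email: zhiyun@corp.youdao.com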

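Notes:

  • This document is aimed at developers. Before integration testing you need an application ID and an application secret, bound to a service instance. If you do not have them yet, follow the Getting Started guide (新手指南) to obtain them.
  • The platform grants each account a trial credit of 50 yuan for testing before integration.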

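Interface Description

The long audio transcription service converts audio of up to one hour into text, providing a basis for information processing and data mining. You only need to call the long audio transcription API to convert audio data into text.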

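Protocol Requirements

Follow the rules below when integrating the long audio transcription API.

| Rule | Description |
| --- | --- |
| Transport | HTTPS |
| Request method | POST |
| Character encoding | UTF-8 throughout |
| Request format | Form |
| Response format | JSON |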

Interface Call Parameters

Production endpoint for long audio transcription:

https://openapi.youdao.com

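Audio Formats

Supported formats: wav (uncompressed, PCM encoded, 16 kHz sample rate recommended, 16-bit mono), aac, mp3

| Format | Code |
| --- | --- |
| wav | wav |
| mp3 | mp3 |
| aac | aac |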
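
Before uploading a wav file, it can help to check it against the recommended parameters above. The following is a minimal sketch using Python's standard-library `wave` module. The helper names (`describe_wav`, `matches_recommended_wav`, `make_silent_wav`) are illustrative assumptions and not part of the service API, and the one-hour check only mirrors the limit stated in the interface description.

```python
import io
import wave

MAX_DURATION_SECONDS = 3600  # the service description mentions audio of up to one hour


def describe_wav(data: bytes) -> dict:
    """Read basic properties from in-memory WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bits": wav.getsampwidth() * 8,
            "sample_rate": rate,
            "duration_seconds": wav.getnframes() / rate if rate else 0.0,
        }


def matches_recommended_wav(info: dict) -> bool:
    """True when the file is 16 kHz, 16-bit, mono and no longer than one hour."""
    return (
        info["channels"] == 1
        and info["sample_width_bits"] == 16
        and info["sample_rate"] == 16000
        and info["duration_seconds"] <= MAX_DURATION_SECONDS
    )


def make_silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Generate a silent 16-bit mono WAV, used here only as sample input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


if __name__ == "__main__":
    print(describe_wav(make_silent_wav(0.5)))
```

For a real file, read its bytes with `open(path, "rb").read()` and pass them to `describe_wav`. The check applies to wav only; the `wave` module does not parse mp3 or aac.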

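File Upload Interface Definitions

Preprocessing Interface

  • Interface URI: /api/audio/prepare

Interface parameters:

| Parameter | Type | Meaning | Required | Example |
| --- | --- | --- | --- | --- |
| salt | text | UUID (universally unique identifier) | True | A UUID |
| type | text | Upload type; only base64 upload is supported, use the fixed value 1 | True | 1 |
| appKey | text | Application ID | True | Available under Application Management (应用管理) |
| sliceNum | string | Number of file slices (recommended slice size is 10M; if the file is smaller than 10M, sliceNum=1) | True | 1 |
| name | text | File name | True | test |
| fileSize | text | File size (in bytes, as computed in the demos) | True | 15234 |
| curtime | text | Timestamp | True | Seconds |
| langType | text | Source language | True | See the supported languages list |
| sign | text | Signature | True | sha256(application ID + salt + curtime + application secret) |
| signType | text | Signature version | True | v4 |
| format | text | Audio file format, e.g. wav | True | mp3/wav/aac/m4a/amr |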
  • Response:
{
    "errorCode": "0", // error code
    "msg": "success", // message
    "result": "383e72a47557490aa05a344074117a9d" // returned task ID
}
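
The signature and slice count above can be sketched as follows. This is a minimal illustration that only builds the form fields for `/api/audio/prepare` and sends nothing. The function names `make_sign` and `build_prepare_params` are assumptions made for this example, and the 10M slice size is read as 10 × 1024 × 1024 bytes, matching the demos in this document.

```python
import hashlib
import math
import time
import uuid

SLICE_SIZE = 10 * 1024 * 1024  # recommended slice size (10M), as used in the demos


def make_sign(app_key, salt, curtime, app_secret):
    """Return sha256(appKey + salt + curtime + appSecret) as a hex string."""
    raw = app_key + salt + curtime + app_secret
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def build_prepare_params(app_key, app_secret, name, file_size, lang_type,
                         audio_format, salt=None, curtime=None):
    """Build the form fields for the prepare call (no request is sent)."""
    salt = salt or str(uuid.uuid4())
    curtime = curtime or str(int(time.time()))  # timestamp in seconds
    slice_num = max(1, math.ceil(file_size / SLICE_SIZE))
    return {
        "appKey": app_key,
        "salt": salt,
        "curtime": curtime,
        "sign": make_sign(app_key, salt, curtime, app_secret),
        "signType": "v4",
        "type": "1",  # fixed value from the parameter table
        "name": name,
        "format": audio_format,
        "langType": lang_type,
        "fileSize": str(file_size),
        "sliceNum": str(slice_num),
    }


if __name__ == "__main__":
    # Placeholder credentials; replace with your own application ID and secret.
    params = build_prepare_params("your-app-key", "your-app-secret",
                                  "test.wav", 15234, "en", "wav")
    print(params["sliceNum"], len(params["sign"]))
```

The same `salt` and `curtime` must be sent in the request as were used to compute `sign`, which is why the helper generates them once and reuses them.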

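File Slice Upload Interface

  • Interface URI: /api/audio/upload
  • Interface description:

After preprocessing succeeds, call the file upload interface and upload the audio slices in order, according to the slice count (sliceNum) set during preprocessing. Read the file as binary and upload it as multipart. Continue until all slices have been uploaded. For example, if sliceNum=2 was set during preprocessing, split the audio into two parts with sliceId=1 and sliceId=2, and call this interface for each part in order.

A slice may only be uploaded after the previous slice has been uploaded successfully. If an error occurs during a call, you may retry several times.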

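  • Interface parameters:

| Parameter | Type | Meaning | Required | Example |
| --- | --- | --- | --- | --- |
| q | text | Task ID | True | The task ID |
| appKey | text | Application ID | True | Available under Application Management (应用管理) |
| salt | text | UUID (universally unique identifier) | True | A UUID |
| curtime | text | Timestamp (seconds) | True | |
| sign | text | Signature | True | sha256(application ID + salt + curtime + application secret) |
| signType | text | Signature version | True | v4 |
| sliceId | text | ID of the audio file slice | True | 1 |
| file | text | File | True | The multipart field name is file |
| type | text | Upload type | False | 1 |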
  • Response:
{
    "errorCode": "0", // error code
    "msg": "success", // message
    "result": null // result field (null in this example)
}
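
The ordering and retry rules described for this interface can be sketched as below. This is an illustration under stated assumptions: `send` is a hypothetical callable supplied by the caller that performs the actual HTTP upload of one slice and returns the parsed JSON response; the retry count of 3 is an arbitrary choice, since the document only says to retry "several times".

```python
import io


def iter_slices(stream, slice_size):
    """Yield (slice_id, chunk) pairs, with slice IDs starting at 1."""
    slice_id = 1
    while True:
        chunk = stream.read(slice_size)
        if not chunk:
            break
        yield slice_id, chunk
        slice_id += 1


def upload_with_retry(send, slice_id, chunk, attempts=3):
    """Call send() until it reports errorCode "0" or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            response = send(slice_id, chunk)
        except OSError as exc:  # I/O or network failure raised by send()
            last_error = exc
            continue
        if response.get("errorCode") == "0":
            return response
        last_error = RuntimeError("slice %d rejected: %s" % (slice_id, response))
    raise last_error


def upload_all(stream, send, slice_size=10 * 1024 * 1024):
    """Upload slices strictly in order; return the number of slices sent."""
    count = 0
    for slice_id, chunk in iter_slices(stream, slice_size):
        upload_with_retry(send, slice_id, chunk)  # next slice only after success
        count += 1
    return count


if __name__ == "__main__":
    # An in-memory stream stands in for an audio file here.
    demo = [(i, len(c)) for i, c in iter_slices(io.BytesIO(b"a" * 25), 10)]
    print(demo)  # [(1, 10), (2, 10), (3, 5)]
```

Because `upload_all` raises as soon as one slice exhausts its retries, a later slice is never sent after a failed one, which matches the ordering requirement above.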

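Merge Files

  • Interface URI: /api/audio/merge
  • Interface description:

After all file slices have been uploaded successfully, call this interface to notify the server to merge the file and start transcription. This interface does not return the transcription result; it only tells the server to schedule the task for transcription. Retrieve the result through the get_result interface.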

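  • Interface parameters:

| Parameter | Type | Meaning | Required | Example |
| --- | --- | --- | --- | --- |
| q | text | Task ID | True | The task ID |
| appKey | text | Application ID | True | Available under Application Management (应用管理) |
| salt | text | UUID (universally unique identifier) | True | A UUID |
| curtime | text | Timestamp (seconds) | True | |
| sign | text | Signature | True | sha256(application ID + salt + curtime + application secret) |
| signType | text | Signature version | True | v4 |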
  • Response:
{
    "errorCode": "0", // error code
    "msg": "success", // message
    "result": null // result field (null in this example)
}

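Query Transcription Progress

  • Interface URI: /api/audio/get_progress
  • Interface description:

Once the caller has sent the merge request, the server schedules the task. Before fetching the result, the caller must poll this interface for the task's current status. The result interface may be called only when the task status is 9 (transcription result upload complete). The polling strategy is up to the caller; polling every 10 minutes is recommended. See the appendix for the status codes.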

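  • Interface parameters:

| Parameter | Type | Meaning | Required | Example |
| --- | --- | --- | --- | --- |
| q | text | Task ID | True | The task ID |
| appKey | text | Application ID | True | Available under Application Management (应用管理) |
| salt | text | UUID (universally unique identifier) | True | A UUID |
| curtime | text | Timestamp (seconds) | True | |
| sign | text | Signature | True | sha256(application ID + salt + curtime + application secret) |
| signType | text | Signature version | True | v4 |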
  • Response:
{
    "errorCode": "0", // error code
    "msg": "success", // message
    "result": [{"status":"9","taskId":"xxxx"}] // processing status; see the task status codes
}
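
The polling rule can be sketched as a small loop. This is an illustration, not the service's prescribed client: `fetch_progress` is a hypothetical callable that performs the signed get_progress request and returns the parsed JSON. Treating statuses 6 and 12 as terminal failures is an assumption based on their descriptions in the task status table, and the cap of 36 polls is an arbitrary safety limit.

```python
import time

STATUS_DONE = "9"             # transcription result upload complete
STATUS_FAILED = {"6", "12"}   # assumption: treated as terminal failures


def read_status(progress):
    """Extract the status of the first task from a get_progress response."""
    if progress.get("errorCode") != "0":
        raise RuntimeError("get_progress failed: %s" % progress)
    # Normalise to str so both "9" and 9 are handled the same way.
    return str(progress["result"][0]["status"])


def wait_until_done(fetch_progress, interval_seconds=600, max_polls=36,
                    sleep=time.sleep):
    """Poll until status 9; raise on a failure status or after max_polls."""
    for attempt in range(max_polls):
        status = read_status(fetch_progress())
        if status == STATUS_DONE:
            return status
        if status in STATUS_FAILED:
            raise RuntimeError("task failed with status " + status)
        if attempt < max_polls - 1:
            sleep(interval_seconds)  # 600 s = the recommended 10 minutes
    raise TimeoutError("task not finished after %d polls" % max_polls)


if __name__ == "__main__":
    # Canned responses stand in for real get_progress calls.
    canned = iter([
        {"errorCode": "0", "result": [{"status": "3", "taskId": "xxxx"}]},
        {"errorCode": "0", "result": [{"status": "9", "taskId": "xxxx"}]},
    ])
    print(wait_until_done(lambda: next(canned), sleep=lambda s: None))
```

The `sleep` parameter is injectable so the loop can be exercised without waiting; in production the default `time.sleep` applies.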

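Query Transcription Result

  • Interface URI: /api/audio/get_result
  • Interface description:

When the task's progress status is complete (see the query progress interface), call this interface to fetch the transcription result. This is the last step of the transcription flow. For details of each field in the result, see the transcription result documentation. The server also supports active callbacks: once transcription finishes, it pushes the result to the callback URL configured by the user.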

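  • Interface parameters:

| Parameter | Type | Meaning | Required | Example |
| --- | --- | --- | --- | --- |
| q | text | Task ID | True | The task ID |
| appKey | text | Application ID | True | Available under Application Management (应用管理) |
| salt | text | UUID (universally unique identifier) | True | A UUID |
| curtime | text | Timestamp (seconds) | True | |
| sign | text | Signature | True | sha256(application ID + salt + curtime + application secret) |
| signType | text | Signature version | True | v4 |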
  • Response:
{
    "result": [{
        "sentence": "Have a good day.",
        "vad_id": 1,
        "word_timestamps": [70, 270, 390, 710],
        "word_timestamps_eds": [190, 390, 510, 830],
        "words": ["Have", "a", "good", "day."],
        "partial": false}],
    "errorCode": "0"
}
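
A response shaped like the sample above can be unpacked as follows. This is a minimal sketch: the helper names are illustrative, and reading `word_timestamps` as per-word start times and `word_timestamps_eds` as per-word end times is an assumption inferred from the field names and sample values, since the field reference lives in a separate document.

```python
import json

SAMPLE = """
{
    "result": [{
        "sentence": "Have a good day.",
        "vad_id": 1,
        "word_timestamps": [70, 270, 390, 710],
        "word_timestamps_eds": [190, 390, 510, 830],
        "words": ["Have", "a", "good", "day."],
        "partial": false}],
    "errorCode": "0"
}
"""


def sentences(response):
    """Return the sentence text of every segment in a get_result response."""
    if response.get("errorCode") != "0":
        raise RuntimeError("get_result failed: %s" % response)
    return [segment["sentence"] for segment in response["result"]]


def word_timings(segment):
    """Pair each word with its assumed (start, end) timestamps."""
    return list(zip(segment["words"],
                    segment["word_timestamps"],
                    segment["word_timestamps_eds"]))


if __name__ == "__main__":
    response = json.loads(SAMPLE)
    print(sentences(response))
    for word, start, end in word_timings(response["result"][0]):
        print(word, start, end)
```

Sentences are returned as a list rather than joined, because the right separator depends on the language (a space suits English but not Chinese).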

Demos in Common Languages

Java example


import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.google.common.base.Preconditions;
import org.apache.http.HttpEntity;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.mime.HttpMultipartMode;
import org.apache.http.entity.mime.MultipartEntityBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.*;

public class Test {

    private static final int SLICE_SICE = 10485760; // 10M
    private static String hostUrl = "https://openapi.youdao.com";
    static String appKey = "your appKey";
    static String appSecret = "your app secret";
    static String langType = "source language code";
    public static void main(String[] args) throws Exception {
        File audioFile = new File("path of audio");
        try (FileInputStream fis = new FileInputStream(audioFile)) {
            // Preprocess
            String taskId = prepare(audioFile, langType);

            // Upload the file slice by slice
            int len = 0;
            byte[] slice = new byte[SLICE_SICE];
            int sliceId = 1;
            while ((len = fis.read(slice)) > 0) {
                // Upload only the bytes actually read, so a short slice carries no stale buffer data
                upload(taskId, sliceId++, Arrays.copyOfRange(slice, 0, len));
            }

            // Merge the file
            merge(taskId);

            // Poll for the task status
            while (true) {
                try {
                    System.out.println("sleep a while ..." );
                    Thread.sleep(20000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                String processResult = getProcess(taskId);
                JSONObject taskProgress = JSONObject.parseObject(processResult);
                if ("0".equals(taskProgress.getString("errorCode"))) {
                    JSONArray resultArray = taskProgress.getJSONArray("result");
                    Integer taskStatus = resultArray.getJSONObject(0).getInteger("status");

                    if (taskStatus == 9) {
                        System.out.println("Task finished!");
                        break;
                    }

                    System.out.println("Task in progress: " + taskStatus);
                } else {
                    System.out.println("Failed to get task progress!");
                }
            }
            // Fetch the result
            System.out.println("\r\n\r\nTranscription result: " + getResult(taskId));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static String prepare(File audioFile, String langType) throws Exception {

        long fileSize = audioFile.length();
        int sliceNum = (int) Math.ceil(fileSize * 1.0 / SLICE_SICE);
        String fileName = audioFile.getName();
        // Take the text after the last dot, so names such as "meeting.v2.wav" yield "wav"
        String suffix = fileName.substring(fileName.lastIndexOf(".") + 1);
        String curtime = String.valueOf(System.currentTimeMillis()/1000);
        String salt = UUID.randomUUID().toString();
        String signStr = appKey + salt + curtime +appSecret;
        String sign = encrypt(signStr, null);
        Map<String, String> map = new HashMap<>();
        String url = hostUrl + "/api/audio/prepare";
        map.put("appKey", appKey);
        map.put("name", fileName);
        map.put("format", suffix);
        map.put("type", "1");
        map.put("langType", langType);
        map.put("curtime", curtime);
        map.put("salt", salt);
        map.put("sign", sign);
        map.put("signType", "v4");
        map.put("sliceNum", String.valueOf(sliceNum));
        map.put("fileSize", String.valueOf(fileSize));
        String result = requestForHttp(url, map);
        JSONObject resultObject = JSONObject.parseObject(result);
        if (!"0".equals(resultObject.getString("errorCode"))) {
            throw new RuntimeException("Preprocessing failed! " + result);
        }
        String taskId = resultObject.getString("result");
        System.out.println("Preprocessing succeeded, taskId: " + taskId);
        return taskId;
    }

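    /**
     * Computes the hex-encoded MessageDigest hash of a string.
     * @param strSrc  the string to hash
     * @param encName the digest algorithm name; defaults to SHA-256 when null or empty
     * @return the digest as a lowercase hex string
     * @throws NoSuchAlgorithmException if the algorithm is not available
     */
    public static String encrypt(String strSrc, String encName) throws NoSuchAlgorithmException {
        Preconditions.checkNotNull(strSrc);
        // Use an explicit charset so the signature does not depend on the platform default
        byte[] bt = strSrc.getBytes(java.nio.charset.StandardCharsets.UTF_8);
        if (encName == null || "".equals(encName)) {
            encName = "SHA-256";
        }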
        MessageDigest md = MessageDigest.getInstance(encName);
        md.update(bt);
        return bytes2Hex(md.digest());
    }

    public static String bytes2Hex(byte[] bts) {
        String des = "";
        String tmp = null;
        for (int i = 0; i < bts.length; i++) {
            tmp = (Integer.toHexString(bts[i] & 0xFF));
            if (tmp.length() == 1) {
                des += "0";
            }
            des += tmp;
        }
        return des;
    }

    public static String merge(String task) throws Exception {

        String curtime = String.valueOf(System.currentTimeMillis()/1000);
        String salt = UUID.randomUUID().toString();
        String signStr = appKey + salt + curtime +appSecret;
        String sign = encrypt(signStr, null);
        Map<String, String> map = new HashMap<>();
        String url = hostUrl + "/api/audio/merge";
        map.put("appKey", appKey);
        map.put("q", task);
        map.put("curtime", curtime);
        map.put("salt", salt);
        map.put("sign", sign);
        map.put("signType", "v4");
        String result = requestForHttp(url, map);
        JSONObject resultObject = JSONObject.parseObject(result);
        if (!"0".equals(resultObject.getString("errorCode"))) {
            throw new RuntimeException("Failed to merge audio! " + result);
        }
        return result;
    }

    public static String getProcess(String task) throws Exception {
        String curtime = String.valueOf(System.currentTimeMillis()/1000);
        String salt = UUID.randomUUID().toString();
        String signStr = appKey + salt + curtime +appSecret;
        String sign = encrypt(signStr, null);
        Map<String, String> map = new HashMap<>();
        String url = hostUrl + "/api/audio/get_progress";
        map.put("appKey", appKey);
        map.put("q", task);
        map.put("curtime", curtime);
        map.put("salt", salt);
        map.put("sign", sign);
        map.put("signType", "v4");
        String result = requestForHttp(url, map);
        JSONObject resultObject = JSONObject.parseObject(result);
        if (!"0".equals(resultObject.getString("errorCode"))) {
            throw new RuntimeException("Failed to get task status! " + result);
        }
        return result;
    }

    public static String getResult(String task) throws Exception {
        String curtime = String.valueOf(System.currentTimeMillis()/1000);
        String salt = UUID.randomUUID().toString();
        String signStr = appKey + salt + curtime +appSecret;
        String sign = encrypt(signStr, null);
        Map<String, String> map = new HashMap<>();
        String url = hostUrl + "/api/audio/get_result";
        map.put("appKey", appKey);
        map.put("q", task);
        map.put("curtime", curtime);
        map.put("salt", salt);
        map.put("sign", sign);
        map.put("signType", "v4");
        String result = requestForHttp(url, map);
        JSONObject resultObject = JSONObject.parseObject(result);
        if (!"0".equals(resultObject.getString("errorCode"))) {
            throw new RuntimeException("Failed to get result! " + result);
        }
        return result;
    }

    public static String upload(String task, Integer sliceId, byte[] slice) throws Exception {
        String curtime = String.valueOf(System.currentTimeMillis()/1000);
        String salt = UUID.randomUUID().toString();
        String signStr = appKey + salt + curtime +appSecret;
        String sign = encrypt(signStr, null);
        Map<String, String> map = new HashMap<>();
        String url = hostUrl + "/api/audio/upload";
        map.put("appKey", appKey);
        map.put("type", "1");
        map.put("curtime", curtime);
        map.put("salt", salt);
        map.put("sign", sign);
        map.put("signType", "v4");
        map.put("q", task);
        map.put("sliceId", String.valueOf(sliceId));
        String result = requestForHttp(url, map, slice);
        JSONObject resultObject = JSONObject.parseObject(result);
        if (!"0".equals(resultObject.getString("errorCode"))) {
            throw new RuntimeException("Failed to upload audio! " + result);
        }
        return result;
    }
    public static String requestForHttp(String url, Map<String,String> requestParams) throws Exception{
        String result = null;
        CloseableHttpClient httpClient = HttpClients.createDefault();
        /**HttpPost*/
        HttpPost httpPost = new HttpPost(url);
        List<NameValuePair> params = new ArrayList<NameValuePair>();
        Iterator<Map.Entry<String, String>> it = requestParams.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> en = it.next();
            String key = en.getKey();
            String value = en.getValue();
            if (value != null) {
                params.add(new BasicNameValuePair(key, value));
            }
        }
        httpPost.setEntity(new UrlEncodedFormEntity(params,"UTF-8"));
        /**HttpResponse*/
        CloseableHttpResponse httpResponse = httpClient.execute(httpPost);
        try{
            HttpEntity httpEntity = httpResponse.getEntity();
            result = EntityUtils.toString(httpEntity, "utf-8");
            EntityUtils.consume(httpEntity);
        }finally{
            try{
                if (httpResponse!=null) {
                    httpResponse.close();
                }
            } catch (IOException e) {
                System.out.println("## release resource error ##" + e);
            }
        }
        return result;
    }



    public static String requestForHttp(String url, Map<String,String> requestParams, byte[] slice) throws Exception{
        String result = null;
        CloseableHttpClient httpClient = HttpClients.createDefault();
        url = getUrlWithQueryString(url, requestParams);
        /**HttpPost*/
        HttpPost httpPost = new HttpPost(url);
        HttpEntity data = MultipartEntityBuilder.create()
                .setMode(HttpMultipartMode.BROWSER_COMPATIBLE)
                .addBinaryBody("file", slice, ContentType.APPLICATION_OCTET_STREAM, "audio")
                .build();
        httpPost.setEntity(data);

        /**HttpResponse*/
        CloseableHttpResponse httpResponse = httpClient.execute(httpPost);
        try {
            HttpEntity httpEntity = httpResponse.getEntity();
            result = EntityUtils.toString(httpEntity, "utf-8");
            EntityUtils.consume(httpEntity);
        } finally {
            try {
                if (httpResponse!=null) {
                    httpResponse.close();
                }
            } catch (IOException e) {
                System.out.println("## release resource error ##" + e);
            }
        }
        return result;
    }

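    /**
     * Builds the request URL from the API address and parameters.
     *
     * @param url    the API address
     * @param params the query parameters to append
     * @return the URL with the encoded query string
     */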
    public static String getUrlWithQueryString(String url, Map<String, String> params) {
        if (params == null) {
            return url;
        }

        StringBuilder builder = new StringBuilder(url);
        if (url.contains("?")) {
            builder.append("&");
        } else {
            builder.append("?");
        }

        int i = 0;
        for (String key: params.keySet()) {
            String value = params.get(key);
            if (value == null) { // skip parameters with null values
                continue;
            }

            if (i != 0) {
                builder.append('&');
            }

            builder.append(key);
            builder.append('=');
            builder.append(encode(value));

            i++;
        }

        return builder.toString();
    }

    /**
     * URL-encodes a string.
     *
     * @param input the string to encode
     * @return the encoded string
     */
    public static String encode(String input) {
        if (input == null) {
            return "";
        }

        try {
            return URLEncoder.encode(input, "utf-8");
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }

        return input;
    }

}

Python 3 example

# -*- coding: utf-8 -*-
#
# Demo for calling the non-real-time transcription API

import hashlib
import json
import os
import time
import uuid

import requests

asr_host = 'https://openapi.youdao.com/api/audio'

# API endpoint paths
api_prepare = '/prepare'
api_upload = '/upload'
api_merge = '/merge'
api_get_progress = '/get_progress'
api_get_result = '/get_result'
# File slice size: 10M
file_piece_sice = 10485760

class RequestApi(object):
    def __init__(self, app_key, app_secret, upload_file_path, lang):
        self.app_key = app_key
        self.app_secret = app_secret
        self.upload_file_path = upload_file_path
        self.lang = lang

    @staticmethod
    def encrypt(signStr):
        """Return the hex-encoded SHA-256 digest of signStr."""
        hash = hashlib.sha256()
        hash.update(signStr.encode('utf-8'))
        return hash.hexdigest()

    def gene_params(self, apiname, taskid=None, slice_id=None):
        app_key = self.app_key
        app_secret = self.app_secret
        upload_file_path = self.upload_file_path
        nonce = str(uuid.uuid1())
        curtime = str(int(time.time()))
        signStr = app_key + nonce + curtime + app_secret
        hash = hashlib.sha256()
        hash.update(signStr.encode('utf-8'))
        sign = hash.hexdigest()
        file_len = os.path.getsize(upload_file_path)
        file_name = os.path.basename(upload_file_path)
        format = os.path.splitext(upload_file_path)[-1][1:]
        print(file_name)

        param_dict = {}

        if apiname == api_prepare:
            # slice_num is the number of slices; if all your audio files are short you can skip slicing and set slice_num to 1
            slice_num = int(file_len / file_piece_sice) + (0 if (file_len % file_piece_sice == 0) else 1)
            param_dict['appKey'] = app_key
            param_dict['sign'] = sign
            param_dict['curtime'] = curtime
            param_dict['salt'] = nonce
            param_dict['signType'] = "v4"
            param_dict['langType'] = self.lang
            param_dict['fileSize'] = str(file_len)
            param_dict['name'] = file_name
            param_dict['format'] = format
            param_dict['sliceNum'] = str(slice_num)
            # Upload type: the parameter table marks this as required, with the fixed value 1
            param_dict['type'] = "1"
        elif apiname == api_upload:
            param_dict['appKey'] = app_key
            param_dict['sign'] = sign
            param_dict['curtime'] = curtime
            param_dict['salt'] = nonce
            param_dict['signType'] = "v4"
            param_dict['q'] = taskid
            param_dict['sliceId'] = slice_id
        elif apiname == api_merge:
            param_dict['appKey'] = app_key
            param_dict['sign'] = sign
            param_dict['curtime'] = curtime
            param_dict['salt'] = nonce
            param_dict['signType'] = "v4"
            param_dict['q'] = taskid
        elif apiname == api_get_progress or apiname == api_get_result:
            param_dict['appKey'] = app_key
            param_dict['sign'] = sign
            param_dict['curtime'] = curtime
            param_dict['salt'] = nonce
            param_dict['signType'] = "v4"
            param_dict['q'] = taskid
        return param_dict

    def gene_request(self, apiname, data, files=None, headers=None):
        response = requests.post(asr_host + apiname, data=data, files=files, headers=headers)
        result = json.loads(response.text)
        print(result)
        if result["errorCode"] == "0":
            print("{} success:".format(apiname) + str(result))
            return result
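        else:
            print("{} error:".format(apiname) + str(result))
            # Abort the workflow: later steps depend on this call succeeding
            raise RuntimeError("{} failed: {}".format(apiname, result))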

    # Preprocess
    def prepare_request(self):
        return self.gene_request(apiname=api_prepare,
                                 data=self.gene_params(api_prepare))

    # Upload
    def upload_request(self, taskid, upload_file_path):
        file_object = open(upload_file_path, 'rb')
        try:
            index = 1
            while True:
                content = file_object.read(file_piece_sice)
                if not content or len(content) == 0:
                    break
                files = {
                    "file": content
                }
                response = self.gene_request(api_upload,
                                             data=self.gene_params(api_upload, taskid=taskid,
                                                                   slice_id=index),
                                             files=files)
                if response.get('errorCode') != "0":
                    # Slice upload failed
                    print('upload slice fail, response: ' + str(response))
                    return False
                print('upload slice ' + str(index) + ' success')
                index += 1
        finally:
            # Report how far into the file the upload got, then release the handle
            print('file index:' + str(file_object.tell()))
            file_object.close()
        return True

    # Merge
    def merge_request(self, taskid):
        return self.gene_request(api_merge, data=self.gene_params(api_merge, taskid=taskid))

    # Get progress
    def get_progress_request(self, taskid):
        return self.gene_request(api_get_progress, data=self.gene_params(api_get_progress, taskid=taskid))

    # Get result
    def get_result_request(self, taskid):
        return self.gene_request(api_get_result, data=self.gene_params(api_get_result, taskid=taskid))

    def all_api_request(self):
        # 1. Preprocess
        pre_result = self.prepare_request()
        taskid = pre_result["result"]
        print(taskid)
        # 2. Upload slices
        self.upload_request(taskid=taskid, upload_file_path=self.upload_file_path)
        # 3. Merge the file
        self.merge_request(taskid=taskid)
        # 4. Poll the task progress
        while True:
            # Check the task progress every 20 seconds
            progress = self.get_progress_request(taskid)
            progress_dic = progress
            if progress_dic['errorCode'] != "0":
                print('task error: ' + str(progress_dic.get('msg')))
                return
            else:
                result = progress_dic['result']
                print(result[0])
                # Compare as strings so both "9" and 9 are accepted
                if str(result[0]['status']) == '9':
                    print('task ' + taskid + ' finished')
                    break
                print('The task ' + taskid + ' is in processing, task status: ' + str(result))

            # Wait 20 seconds between progress checks
            time.sleep(20)
        # 5. Fetch the result
        self.get_result_request(taskid=taskid)


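# Note: if the requests module raises "NoneType" object has no attribute 'read', try upgrading requests to 2.20.0 or later (this demo was tested with 2.20.0)
# Fill in your Youdao Zhiyun open platform application ID (appKey), application secret, and the path of the file to transcribe
if __name__ == '__main__':
    api = RequestApi(app_key="your application ID", app_secret="your application secret", upload_file_path=r"D:/havea.wav", lang="en")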
    api.all_api_request()

Supported Languages

| English name | Chinese name | Code |
| --- | --- | --- |
| English (US) | 英语(美国) | en |
| Mandarin (China) | 普通话(中国) | zh-CHS |

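Task Status Codes

| Status ID | Description |
| --- | --- |
| 0 | Task created successfully |
| 1 | Audio upload complete |
| 2 | Audio merge complete |
| 3 | Audio transcription in progress |
| 4 | Transcription result being processed |
| 5 | Transcription complete |
| 6 | Audio transcription failed; retry required |
| 9 | Transcription result upload complete |
| 12 | Audio transcoding failed |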

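Error Codes

| Code | Meaning |
| --- | --- |
| 4000000 | q is incorrect |
| 4000001 | File size mismatch |
| 4000002 | File is empty |
| 4000003 | File name is empty |
| 4000004 | Unsupported format |
| 4000005 | sliceNum is incorrect |
| 4000006 | sliceId is incorrect |
| 4000007 | Unsupported file size |
| 4000008 | Unsupported language |
| 4000009 | Task does not exist |
| 4000010 | File upload failed |
| 4000011 | File size verification error |
| 4035001 | Failed to download transcription result |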
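
For logging or user-facing messages, the codes above can be mapped to their descriptions. The sketch below is illustrative only: `explain_error` is a hypothetical helper, the mapping covers just the codes listed in this table, and any other value returned in `errorCode` is reported as unknown.

```python
ERROR_MESSAGES = {
    "4000000": "q is incorrect",
    "4000001": "file size mismatch",
    "4000002": "file is empty",
    "4000003": "file name is empty",
    "4000004": "unsupported format",
    "4000005": "sliceNum is incorrect",
    "4000006": "sliceId is incorrect",
    "4000007": "unsupported file size",
    "4000008": "unsupported language",
    "4000009": "task does not exist",
    "4000010": "file upload failed",
    "4000011": "file size verification error",
    "4035001": "failed to download transcription result",
}


def explain_error(response):
    """Return None on success, otherwise a readable description of errorCode."""
    code = str(response.get("errorCode"))
    if code == "0":
        return None
    return ERROR_MESSAGES.get(code, "unknown error code " + code)


if __name__ == "__main__":
    print(explain_error({"errorCode": "4000009"}))  # task does not exist
```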