TTS API
Overview
Converts text into natural-sounding speech in real time, providing high-quality audio output for virtual humans and other intelligent terminal applications.
Ⅰ API Description
1 Authentication
1.1 X-TOKEN Calculation
Input Parameters
1. Request body: data = {"xxxx": "xxxx"}, or {} when the body is empty
2. Secret key assigned to the caller: secret = "iamsecret"
3. Request path, excluding the host: api_path = "/xxx/xxx?xx=xxx"
Calculation Steps:
1. Convert api_path to lowercase: lower_api_path
2. Convert the request method to lowercase: lower_method (e.g., "delete"/"post"/...)
3. Serialize data to a JSON string with keys sorted and all spaces removed: sort_json_str. In Python: json.dumps(dict(data), sort_keys=True).replace(' ', '')
4. Concatenate the strings in the following order to form sign: lower_api_path + lower_method + sort_json_str + secret + X-TIMESTAMP. X-TIMESTAMP is the request timestamp in seconds and must be within 60 s of the current time. Example sign for a POST request: /xxx/xxx?xx=xxxpost{"xxxx":"xxxx"}iamsecret1489133053
5. Encode sign as UTF-8 and compute its MD5 hex digest to get X-TOKEN, a 32-character hexadecimal string such as ddc6457fd0b373475ac65912b797ef05
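The five steps can be traced with a small worked sketch. The function name compute_x_token and the fixed timestamp are illustrative assumptions; the path, body, and secret are the placeholders used above, and the request method is assumed to be POST.

```python
import hashlib
import json


def compute_x_token(api_path, method, data, secret, timestamp):
    """Return (sign, token) following the five calculation steps."""
    lower_api_path = api_path.lower()   # step 1
    lower_method = method.lower()       # step 2
    # Step 3: sort keys, then strip every space (including spaces inside values).
    sort_json_str = json.dumps(dict(data), sort_keys=True).replace(' ', '')
    # Step 4: concatenate in the documented order.
    sign = f"{lower_api_path}{lower_method}{sort_json_str}{secret}{timestamp}"
    # Step 5: MD5 of the UTF-8 bytes, as a lowercase hex string.
    token = hashlib.md5(sign.encode('utf-8')).hexdigest()
    return sign, token


sign, token = compute_x_token(
    "/xxx/xxx?xx=xxx", "POST", {"xxxx": "xxxx"}, "iamsecret", 1489133053
)
print(sign)   # the string that gets hashed
print(token)  # the value to send as X-TOKEN
```

Because step 3 removes every space, spaces inside string values are stripped as well: a text of "hello world" contributes "helloworld" to the sign, while the request body itself is sent unchanged.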
1.2 Interface Call
Add the following headers to each request:
X-APP-ID: Application AK
X-TIMESTAMP: Second-level timestamp
X-TOKEN: Signature calculation result
1.3 Demo Code
import time
import json
import hashlib
import requests
from urllib.parse import urljoin


def encode_with_md5(s):
    """Return the MD5 hex digest of a UTF-8 encoded string."""
    m = hashlib.md5()
    m.update(s.encode('utf-8'))
    return m.hexdigest()


def headers_need_sign(ak, secret, method, url, data):
    """Build the X-APP-ID, X-TOKEN and X-TIMESTAMP headers for one request."""
    headers = {}
    t = int(time.time())
    # Ensure data is sorted and spaces are removed
    data = json.dumps(dict(data), sort_keys=True).replace(' ', '')
    # Order: path + method + body + secret + timestamp
    ori_sign = '{0}{1}{2}{3}{4}'.format(url.lower(), method.lower(), data, secret, t)
    sign = encode_with_md5(ori_sign)
    headers["X-APP-ID"] = ak
    headers["X-TOKEN"] = sign
    headers["X-TIMESTAMP"] = str(t)
    return headers


if __name__ == '__main__':
    ak = '37514ac0-3fce-4f4c-bc3f-86eba37da7dd'
    secret = 'bb81b706-ef1f-443e-9e86-9df8399f796b'
    method = 'POST'
    host = 'https://nebula-agent.xingyun3d.com'
    url = '/xxx/xxx?x=xx&z=zz'
    req_data = {
        "data1": "data1",
        "data2": "data2",
    }
    # Calculate and get request headers
    req_headers = headers_need_sign(ak, secret, method, url, req_data)
    # Request interface
    req_url = urljoin(host, url)
    resp = requests.request(method, req_url, json=req_data, headers=req_headers)
    print(resp.status_code, resp.text)
2 Call Description
2.1 Websocket Voice Trial
Host: https://nebula-agent.xingyun3d.com
Request Path: wss://{host}/user/v1/ws/tts
Request Parameters (Params)
Parameter Name | Type | Name | Required | Remarks |
tts_vcn | string | Voice Name | Yes | Meaning: Name of the voice used for synthesis |
text | text | Corpus | Yes | Meaning: Corpus content used for synthesizing audio |
Message
Parameter Name | Type | Name | Required | Remarks |
text | text | Corpus | Yes | Meaning: Corpus content used for synthesizing audio |
Return Parameters
Level 1 Parameter | Type | Name | Remarks |
error_code | int | Error Code | |
error_reason | string | Error Reason | |
data | string | Audio Data | |
start_time | float | Speech Start Seconds | |
end_time | float | Speech End Seconds | |
inference_end | bool | Is Finished | |
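As a rough illustration of the request side, the sketch below builds the WebSocket URL and one text message. It assumes that tts_vcn is passed as a URL query parameter and that the message is a JSON object with a text field; the helper names are hypothetical, and opening the connection is left to whichever WebSocket client is in use.

```python
import json
from urllib.parse import urlencode

# Host name taken from the Host line above, without the scheme.
HOST = "nebula-agent.xingyun3d.com"
WS_PATH = "/user/v1/ws/tts"


def build_tts_ws_url(tts_vcn, host=HOST):
    """Build the wss:// URL with tts_vcn as a query parameter (assumed placement)."""
    query = urlencode({"tts_vcn": tts_vcn})
    return f"wss://{host}{WS_PATH}?{query}"


def build_text_message(text):
    """Serialize one message carrying the text to synthesize."""
    return json.dumps({"text": text}, ensure_ascii=False)


print(build_tts_ws_url("XMOV_LV_TTS__13"))
print(build_text_message("This is a test data"))
```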
2.2 Create Speech Synthesis Task
Host: https://nebula-agent.xingyun3d.com
Request Path: POST: /user/v1/tts_task/create_tts_task
Request Parameters
Parameter Name | Type | Name | Required | Remarks |
audio_name | string | Audio Name | No | |
tts_vcn | string | Voice Name | Yes | Meaning: Name of the voice used for synthesis |
text | text | Corpus | Yes | Meaning: Corpus content used for synthesizing audio |
Return Parameters
Level 1 Parameter | Level 2 Parameter | Type | Name | Remarks |
error_code | | int | Error Code | |
error_reason | | string | Error Reason | |
data | | dict | | |
| task_id | int | Task ID | |
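A create request could be assembled as in the sketch below, which applies the signing rules from section 1.1 to this endpoint. The function name and the timestamp argument are illustrative assumptions; the function only prepares the URL, headers, and body and does not send anything.

```python
import hashlib
import json
import time

HOST = "https://nebula-agent.xingyun3d.com"
CREATE_PATH = "/user/v1/tts_task/create_tts_task"


def build_create_task_request(ak, secret, tts_vcn, text, audio_name=None, timestamp=None):
    """Return (url, headers, body) for a create_tts_task call without sending it."""
    body = {"tts_vcn": tts_vcn, "text": text}
    if audio_name is not None:  # audio_name is optional
        body["audio_name"] = audio_name
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # Signed form of the body: sorted keys, all spaces removed.
    body_str = json.dumps(body, sort_keys=True).replace(' ', '')
    sign = f"{CREATE_PATH.lower()}post{body_str}{secret}{ts}"
    headers = {
        "X-APP-ID": ak,
        "X-TIMESTAMP": str(ts),
        "X-TOKEN": hashlib.md5(sign.encode("utf-8")).hexdigest(),
    }
    return HOST + CREATE_PATH, headers, body
```

With the requests library from the demo code, the result could then be sent as requests.post(url, json=body, headers=headers), and the task ID read from data.task_id in the response.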
2.3 Query Task Result
Host: https://nebula-agent.xingyun3d.com
Request Path: GET: /user/v1/tts_task/get_tts_task
Request Parameters
Parameter Name | Type | Name | Required | Remarks |
task_id | int | Task ID | Yes | |
Return Parameters
Level 1 Parameter | Level 2 Parameter | Type | Name | Remarks |
error_code | | int | Error Code | |
error_reason | | string | Error Reason | |
data | | dict | | |
| task_id | int | Task ID | |
| synth_status | string | Task Status | |
| file_oss | string | File Path | This field has a value only when the task status is "finished" |
| synth_start_time | datetime | Synthesis Start Time | |
| synth_finish_time | datetime | Synthesis Finish Time | This field has a value only when the task status is "finished" |
| error_reason | string | Error Log | |
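Since file_oss is only filled in once the task status is "finished", a client typically polls this endpoint. The sketch below shows one possible polling loop. The function name, the interval and timeout defaults, and the rule that a non-empty task-level error_reason means failure are all assumptions; fetch_task stands for any caller-supplied function that performs the signed GET and returns the parsed JSON.

```python
import time


def wait_for_task(fetch_task, task_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until synth_status is "finished" and return the task's data dict.

    fetch_task(task_id) must return the parsed JSON response as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        resp = fetch_task(task_id)
        if resp.get("error_code") != 0:
            raise RuntimeError(
                f"query failed: {resp.get('error_code')} {resp.get('error_reason')}"
            )
        data = resp.get("data") or {}
        if data.get("synth_status") == "finished":
            return data
        if data.get("error_reason"):
            # Assumption: a non-empty error log means the task will not finish.
            raise RuntimeError(f"task {task_id} failed: {data['error_reason']}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout} s")
        sleep(interval)
```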
2.4 Cancel Task
Host: https://nebula-agent.xingyun3d.com
Request Path: POST: /user/v1/tts_task/cancel_tts_task
Request Parameters
Parameter Name | Type | Name | Required | Remarks |
task_id | int | Task ID | Yes | |
Return Parameters
Level 1 Parameter | Level 2 Parameter | Type | Name | Remarks |
error_code | | int | Error Code | |
error_reason | | string | Error Reason | |
Error Codes
Error Code | Error Description |
20001 | Application does not exist or cannot be used |
40001 | Trial listening error, please contact customer service |
40002 | Failed to create task, please restart or contact customer service for handling |
40003 | Speech synthesis task not found |
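One way to use this table on the client side is to map codes to descriptions and raise an exception for any non-zero error_code. The sketch below assumes that 0 means success, as in the demo responses; the class and function names are hypothetical.

```python
ERROR_DESCRIPTIONS = {
    20001: "Application does not exist or cannot be used",
    40001: "Trial listening error, please contact customer service",
    40002: "Failed to create task, please restart or contact customer service for handling",
    40003: "Speech synthesis task not found",
}


class TTSApiError(Exception):
    """Raised when a response carries a non-zero error_code."""

    def __init__(self, code, reason=""):
        self.code = code
        self.reason = reason
        description = ERROR_DESCRIPTIONS.get(code, "Unknown error code")
        message = f"[{code}] {description}"
        if reason:
            message += f": {reason}"
        super().__init__(message)


def check_response(resp):
    """Return resp unchanged when error_code is 0, otherwise raise TTSApiError."""
    code = resp.get("error_code")
    if code != 0:
        raise TTSApiError(code, resp.get("error_reason", ""))
    return resp
```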
DEMO Examples
Websocket Voice Trial
wss://{host}/user/v1/ws/tts
Params: tts_vcn: XMOV_LV_TTS__13
Message:
{
"text": "This is a test data"
}
Response messages:
{
  "data_type": "CHAR_TIME_MAP",
  "data": "[[\"This\", 0.0, 0.1349], [\"is\", 0.1349, 0.2383], [\"a\", 0.2383, 0.3133], [\"test\", 0.3133, 0.6611], [\"data\", 0.6611, 1.2096], [\"[PUNC]\", 1.2096, 1.725]]",
  "start_time": 0.0,
  "end_time": 0.0,
  "sentence_index": -1,
  "char_index": -1,
  "inference_end": false,
  "flush_buffer": false,
  "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
  "data_type": "AUDIO",
  "data": "AAAA",
  "start_time": 0.0,
  "end_time": 1.725000023841858,
  "sentence_index": -1,
  "char_index": -1,
  "inference_end": false,
  "flush_buffer": false,
  "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
  "data_type": "CHAR_TIME_MAP",
  "data": "",
  "start_time": 0.0,
  "end_time": 0.0,
  "sentence_index": -1,
  "char_index": -1,
  "inference_end": false,
  "flush_buffer": true,
  "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
  "data_type": "AUDIO",
  "data": "",
  "start_time": 0.0,
  "end_time": 0.0,
  "sentence_index": -1,
  "char_index": -1,
  "inference_end": true,
  "flush_buffer": false,
  "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
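A client consuming this stream might separate the two message types as sketched below. The sketch assumes that AUDIO payloads are base64-encoded and that a CHAR_TIME_MAP payload is a JSON string of [token, start, end] triples, as the sample messages suggest; the audio format itself is not specified here, and the function name is hypothetical.

```python
import base64
import json


def collect_stream(raw_messages):
    """Consume messages until inference_end; return (audio_bytes, timings)."""
    audio = bytearray()
    timings = []
    for raw in raw_messages:
        msg = json.loads(raw)
        payload = msg.get("data", "")
        if msg.get("data_type") == "AUDIO" and payload:
            # Assumption: AUDIO data is base64-encoded.
            audio.extend(base64.b64decode(payload))
        elif msg.get("data_type") == "CHAR_TIME_MAP" and payload:
            # data is itself a JSON string: [[token, start, end], ...]
            timings.extend(tuple(item) for item in json.loads(payload))
        if msg.get("inference_end"):
            break  # the last message of a request has inference_end set to true
    return bytes(audio), timings
```

Here raw_messages can be any iterable of JSON strings, for example the frames yielded by a WebSocket client.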
Create Speech Synthesis Task
POST: /user/v1/tts_task/create_tts_task
Body:
{
  "text": "I would like to remind everyone here to go to bed early and get up late after winter comes, and ensure adequate sleep. You can exercise appropriately, just until your body feels slightly warm and sweats a little; do not exercise excessively. May you be warm at heart, healthy in body, and happy in family during this cold season.",
  "tts_vcn": "XMOV_LV_TTS__13"
}
Response:
{
  "error_code": 0,
  "error_reason": "",
  "data": {
    "task_id": 10
  }
}
Get Task Result
GET: /user/v1/tts_task/get_tts_task
Params: task_id: 10 # Task ID
Response:
{
  "error_code": 0,
  "error_reason": "",
  "data": {
    "id": 10,
    "synth_status": "waiting",
    "file_oss": "",
    "synth_start_time": null,
    "synth_finish_time": null,
    "error_reason": ""
  }
}
Cancel Task
POST: /user/v1/tts_task/cancel_tts_task
Body:
{
  "task_id": 10
}
Response:
{
  "error_code": 0,
  "error_reason": ""
}