TTS API

Overview

Converts text into natural-sounding speech in real time, providing high-quality audio output for virtual humans and other intelligent terminal applications.

Ⅰ API Description

1 Authentication

1.1 X-TOKEN Calculation

Input Parameters:

1. Data Body: data = {"xxxx": "xxxx"} or {}

2. Key assigned to external users: secret = "iamsecret"

3. Interface Method Path (excluding host): api_path = "/xxx/xxx?xx=xxx"

Calculation Steps:

1. Convert api_path to lowercase: lower_api_path

2. Convert the request method to lowercase: lower_method (e.g., "delete"/"post"/...)

3. Serialize data to a JSON string with keys sorted and all spaces removed: sort_json_str. In Python, for example: json.dumps(dict(data), sort_keys=True).replace(' ', '')

4. Concatenate the strings in the following order to form the sign: lower_api_path + lower_method + sort_json_str + secret + X-TIMESTAMP. X-TIMESTAMP is a Unix timestamp in seconds and must be within 60s of the current time. Resulting sign example: /xxx/xxx?xx=xxx{"xxxx":"xxxx"}iamsecret1489133053

5. Encode the sign as UTF-8 and compute its MD5 hex digest to get X-TOKEN, for example: ddc6457fd0b373475ac65912b797ef05
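The steps above can be traced with a small sketch that uses the placeholder values from this section and a fixed timestamp. The helper names build_sign_string and compute_x_token are illustrative, not part of the API, and the choice of POST and GET as methods is an assumption made only for the demonstration. Note that removing spaces after serialization also removes spaces inside string values, so a text such as "a b" is signed as "ab".

```python
import hashlib
import json


def build_sign_string(api_path, method, data, secret, timestamp):
    """Concatenate the signing inputs in the documented order (steps 1-4)."""
    lower_api_path = api_path.lower()
    lower_method = method.lower()
    # Sorted keys, then every space removed (this also strips spaces in values).
    sort_json_str = json.dumps(dict(data), sort_keys=True).replace(' ', '')
    return f"{lower_api_path}{lower_method}{sort_json_str}{secret}{timestamp}"


def compute_x_token(sign):
    """Step 5: MD5 hex digest of the UTF-8 encoded sign string."""
    return hashlib.md5(sign.encode('utf-8')).hexdigest()


# Placeholder inputs from this section; the timestamp is fixed for repeatability.
post_sign = build_sign_string("/xxx/xxx?xx=xxx", "POST", {"xxxx": "xxxx"},
                              "iamsecret", 1489133053)
# An empty data body serializes to "{}".
get_sign = build_sign_string("/xxx/xxx?xx=xxx", "GET", {},
                             "iamsecret", 1489133053)

print(post_sign)
print(get_sign)
print(compute_x_token(post_sign))
```

Printing the intermediate sign string before hashing is a convenient way to debug a rejected token, since any difference in ordering, casing, or whitespace changes the digest completely.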

1.2 Interface Call

The following header information should be added when requesting the interface:

X-APP-ID: Application AK

X-TIMESTAMP: Second-level timestamp

X-TOKEN: Signature calculation result

1.3 Demo Code

import time
import json
import hashlib
import requests
from urllib.parse import urljoin


def encode_with_md5(s):
    m = hashlib.md5()
    m.update(s.encode('utf-8'))
    return m.hexdigest()


def headers_need_sign(ak, secret, method, url, data):
    headers = {}
    t = int(time.time())

    # Ensure data is sorted and spaces are removed
    data = json.dumps(dict(data), sort_keys=True).replace(' ', '')
    ori_sign = '{0}{1}{2}{3}{4}'.format(url.lower(), method.lower(), data, secret, t)
    sign = encode_with_md5(ori_sign)
    headers["X-APP-ID"] = ak
    headers["X-TOKEN"] = sign
    headers["X-TIMESTAMP"] = str(t)
    return headers


if __name__ == '__main__':
    ak = '37514ac0-3fce-4f4c-bc3f-86eba37da7dd'
    secret = 'bb81b706-ef1f-443e-9e86-9df8399f796b'
    method = 'POST'
    host = 'https://nebula-agent.xingyun3d.com'
    url = '/xxx/xxx?x=xx&z=zz'
    req_data = {
        "data1": "data1",
        "data2": "data2",
    }

    # Calculate and get request headers
    req_headers = headers_need_sign(ak, secret, method, url, req_data)

    # Request interface
    req_url = urljoin(host, url)
    resp = requests.request(method, req_url, json=req_data, headers=req_headers)

2 Call Description

2.1 WebSocket Voice Trial

Host: https://nebula-agent.xingyun3d.com

Request Path: wss://{host}/user/v1/ws/tts

Request Parameters (Params)

Parameter Name	Type	Name	Required	Remarks
tts_vcn	string	Voice Name	Yes	Meaning: Name of the voice used for synthesis
text	text	Corpus	Yes	Meaning: Corpus content used for synthesizing audio

Message

Parameter Name	Type	Name	Required	Remarks
text	text	Corpus	Yes	Meaning: Corpus content used for synthesizing audio

Return Parameters

Level 1 Parameter	Type	Name	Remarks
error_code	int	Error Code	0: Success; Others: Error
error_reason	string	Error Reason
data	string	Audio Data
start_time	float	Speech Start Seconds
end_time	float	Speech End Seconds
inference_end	bool	Is Finished
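As a minimal sketch of how the connection URL might be assembled, the snippet below assumes that the Params are sent as URL query parameters, which this section does not state explicitly. The function name build_tts_ws_url is hypothetical. The text parameter is left optional here because the demo at the end of this document passes only tts_vcn as a parameter and sends the text in a message.

```python
from urllib.parse import urlencode, urlsplit


def build_tts_ws_url(host, tts_vcn, text=None):
    """Build wss://{host}/user/v1/ws/tts with the Params as a query string."""
    # Accept either a bare hostname or the https:// form shown under "Host".
    netloc = urlsplit(host).netloc or host
    params = {"tts_vcn": tts_vcn}
    if text is not None:
        params["text"] = text
    # urlencode percent-encodes values and encodes spaces as "+".
    return f"wss://{netloc}/user/v1/ws/tts?{urlencode(params)}"


url_only_voice = build_tts_ws_url("https://nebula-agent.xingyun3d.com",
                                  "XMOV_LV_TTS__13")
url_with_text = build_tts_ws_url("nebula-agent.xingyun3d.com",
                                 "XMOV_LV_TTS__13", "This is a test data")
print(url_only_voice)
print(url_with_text)
```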

2.2 Create Speech Synthesis Task

Host: https://nebula-agent.xingyun3d.com

Request Path: POST: /user/v1/tts_task/create_tts_task

Request Parameters

Parameter Name	Type	Name	Required	Remarks
audio_name	string	Audio Name	No	Meaning: Name of the output audio, used as the download filename. Optional; if omitted, a name is generated based on the time
tts_vcn	string	Voice Name	Yes	Meaning: Name of the voice used for synthesis
text	text	Corpus	Yes	Meaning: Corpus content used for synthesizing audio

Return Parameters

Level 1 Parameter	Level 2 Parameter	Type	Name	Remarks
error_code		int	Error Code	0: Success; Others: Error
error_reason		string	Error Reason
data		dict
	task_id	int	Task ID
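The following sketch shows one way to prepare a signed request for this endpoint by combining the parameters above with the X-TOKEN steps from section 1.1. It only builds the headers and body and does not send anything. The function name build_create_task_request, the credentials "my-ak" and "my-secret", and the Content-Type header are assumptions for illustration; the Content-Type matches what the demo code's json= argument would send.

```python
import hashlib
import json
import time

CREATE_PATH = "/user/v1/tts_task/create_tts_task"


def build_create_task_request(ak, secret, tts_vcn, text,
                              audio_name=None, timestamp=None):
    """Return (headers, body) for a signed POST to create_tts_task."""
    payload = {"tts_vcn": tts_vcn, "text": text}
    if audio_name is not None:
        payload["audio_name"] = audio_name  # optional per the table above
    t = int(time.time()) if timestamp is None else int(timestamp)
    # Signature input: sorted keys, all spaces removed (section 1.1, step 3).
    sort_json_str = json.dumps(payload, sort_keys=True).replace(' ', '')
    sign = f"{CREATE_PATH.lower()}post{sort_json_str}{secret}{t}"
    headers = {
        "X-APP-ID": ak,
        "X-TIMESTAMP": str(t),
        "X-TOKEN": hashlib.md5(sign.encode('utf-8')).hexdigest(),
        "Content-Type": "application/json",  # assumption, see lead-in
    }
    # The body that is sent keeps its spaces; only the signature strips them.
    body = json.dumps(payload).encode('utf-8')
    return headers, body


# Placeholder credentials and a fixed timestamp, for illustration only.
headers, body = build_create_task_request(
    "my-ak", "my-secret", "XMOV_LV_TTS__13", "This is a test data",
    timestamp=1489133053)
print(headers)
print(body.decode('utf-8'))
```

The returned pair can be passed to any HTTP client, for example requests.post(url, data=body, headers=headers).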

2.3 Query Task Result

Host: https://nebula-agent.xingyun3d.com

Request Path: GET: /user/v1/tts_task/get_tts_task

Request Parameters

Parameter Name	Type	Name	Required	Remarks
task_id	int	Task ID	Yes

Return Parameters

Level 1 Parameter	Level 2 Parameter	Type	Name	Remarks
error_code		int	Error Code	0: Success; Others: Error
error_reason		string	Error Reason
data		dict
	task_id	int	Task ID
	synth_status	string	Task Status	not_send: Queuing; processing: Synthesizing; finished: Completed; error: Failed; canceled: Canceled
	file_oss	string	File Path	This field has a value only when the task status is "finished"
	synth_start_time	datetime	Synthesis Start Time
	synth_finish_time	datetime	Synthesis Finish Time	This field has a value only when the task status is "finished"
	error_reason	string	Error Log
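Because a task moves through several states before file_oss is populated, a client will typically poll this endpoint. The sketch below is one possible polling loop, assuming that finished, error, and canceled are the only terminal states. The function wait_for_task and its fetch_task callback are hypothetical: fetch_task stands for whatever code performs the signed GET and returns the decoded JSON response. The default interval and timeout are arbitrary choices, not values from the API.

```python
import time

# Statuses after which the task will not change again, per the table above.
TERMINAL_STATUSES = {"finished", "error", "canceled"}


def wait_for_task(fetch_task, task_id, interval=2.0, timeout=300.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_task(task_id) until the task reaches a terminal status."""
    deadline = clock() + timeout
    while True:
        resp = fetch_task(task_id)
        if resp.get("error_code") != 0:
            raise RuntimeError(
                f"query failed: {resp.get('error_code')} {resp.get('error_reason')}")
        data = resp.get("data") or {}
        if data.get("synth_status") in TERMINAL_STATUSES:
            return data
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        sleep(interval)


# Offline demonstration with canned responses instead of real HTTP calls.
# The file_oss value is a made-up placeholder.
_canned = iter([
    {"error_code": 0, "error_reason": "",
     "data": {"task_id": 10, "synth_status": "not_send", "file_oss": ""}},
    {"error_code": 0, "error_reason": "",
     "data": {"task_id": 10, "synth_status": "processing", "file_oss": ""}},
    {"error_code": 0, "error_reason": "",
     "data": {"task_id": 10, "synth_status": "finished",
              "file_oss": "placeholder/audio.wav"}},
])
result = wait_for_task(lambda task_id: next(_canned), 10,
                       interval=0, sleep=lambda seconds: None)
print(result["synth_status"], result["file_oss"])
```

Passing sleep and clock as arguments keeps the loop testable without real delays.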

2.4 Cancel Task

Host: https://nebula-agent.xingyun3d.com

Request Path: POST: /user/v1/tts_task/cancel_tts_task

Request Parameters

Parameter Name	Type	Name	Required	Remarks
task_id	int	Task ID	Yes

Return Parameters

Level 1 Parameter	Level 2 Parameter	Type	Name	Remarks
error_code		int	Error Code	0: Success; Others: Error
error_reason		string	Error Reason

Error Codes

Error Code	Error Description
20001	Application does not exist or cannot be used
40001	Trial listening error, please contact customer service
40002	Failed to create task, please restart or contact customer service for handling
40003	Speech synthesis task not found

DEMO Examples

WebSocket Voice Trial

wss://{host}/user/v1/ws/tts
Params: tts_vcn: XMOV_LV_TTS__13 
Message: 
{
 "text": "This is a test data"
}
 
Response Messages:
{
 "data_type": "CHAR_TIME_MAP",
 "data": "[[\"This\", 0.0, 0.1349], [\"is\", 0.1349, 0.2383], [\"a\", 0.2383, 0.3133], [\"test\", 0.3133, 0.6611], [\"data\", 0.6611, 1.2096], [\"[PUNC]\", 1.2096, 1.725]]",
 "start_time": 0.0,
 "end_time": 0.0,
 "sentence_index": -1,
 "char_index": -1,
 "inference_end": false,
 "flush_buffer": false,
 "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
 "data_type": "AUDIO",
 "data": "AAAA",
 "start_time": 0.0,
 "end_time": 1.725000023841858,
 "sentence_index": -1,
 "char_index": -1,
 "inference_end": false,
 "flush_buffer": false,
 "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
 "data_type": "CHAR_TIME_MAP",
 "data": "",
 "start_time": 0.0,
 "end_time": 0.0,
 "sentence_index": -1,
 "char_index": -1,
 "inference_end": false,
 "flush_buffer": true,
 "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
 "data_type": "AUDIO",
 "data": "",
 "start_time": 0.0,
 "end_time": 0.0,
 "sentence_index": -1,
 "char_index": -1,
 "inference_end": true,
 "flush_buffer": false,
 "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
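The message sequence above can be folded into audio bytes and a timing list with a sketch like the one below. It assumes that the AUDIO data field is base64-encoded, which the sample value suggests but this document does not state, and that the CHAR_TIME_MAP data field is a JSON string of [token, start, end] triples, as in the sample. The function name collect_tts_messages is hypothetical, and it takes already-received message strings so that it stays independent of any particular WebSocket library.

```python
import base64
import json


def collect_tts_messages(raw_messages):
    """Fold JSON text messages into (audio_bytes, char_times)."""
    audio = bytearray()
    char_times = []
    for raw in raw_messages:
        msg = json.loads(raw)
        if msg.get("error_code", 0) != 0:
            raise RuntimeError(
                f"TTS error {msg['error_code']}: {msg.get('error_reason', '')}")
        data = msg.get("data")
        if msg.get("data_type") == "AUDIO" and data:
            # Assumption: the audio payload is base64-encoded.
            audio.extend(base64.b64decode(data))
        elif msg.get("data_type") == "CHAR_TIME_MAP" and data:
            # "data" is itself a JSON string of [token, start, end] triples.
            char_times.extend(tuple(item) for item in json.loads(data))
        if msg.get("inference_end"):
            break  # the server has signalled the end of synthesis
    return bytes(audio), char_times


# A shortened version of the sample messages, rebuilt here as JSON strings.
sample = [
    json.dumps({"data_type": "CHAR_TIME_MAP", "inference_end": False,
                "data": json.dumps([["This", 0.0, 0.1349], ["is", 0.1349, 0.2383]])}),
    json.dumps({"data_type": "AUDIO", "data": "AAAA", "inference_end": False}),
    json.dumps({"data_type": "CHAR_TIME_MAP", "data": "", "inference_end": False}),
    json.dumps({"data_type": "AUDIO", "data": "", "inference_end": True}),
]
audio_bytes, timings = collect_tts_messages(sample)
print(len(audio_bytes), timings)
```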

Create Speech Synthesis Task

POST: /user/v1/tts_task/create_tts_task
Body: 
{
 "text": "I would like to remind everyone here to go to bed early and get up late after winter comes, and ensure adequate sleep. You can exercise appropriately, just until your body feels slightly warm and sweats a little; do not exercise excessively. May you be warm at heart, healthy in body, and happy in family during this cold season.",
 "tts_vcn": "XMOV_LV_TTS__13"
}
Response:
{
 "error_code": 0,
 "error_reason": "",
 "data": {
  "task_id": 10
 }
}

Get Task Result

GET: /user/v1/tts_task/get_tts_task
Params: task_id: 10 # Task ID
Response:
{
 "error_code": 0,
 "error_reason": "",
 "data": {
  "id": 10,
  "synth_status": "waiting",
  "file_oss": "",
  "synth_start_time": null,
  "synth_finish_time": null,
  "error_reason": ""
 }
}

Cancel Task

POST: /user/v1/tts_task/cancel_tts_task
Body:
{
 "task_id": 10
}
Response:
{
 "error_code": 0,
 "error_reason": ""
}

Embodia AI — Beyond Digital Humans， Empowering AI to think, express, and truly engage.
