TTS API

Overview

Converts text into natural-sounding speech in real time, providing high-quality audio output for virtual humans and other intelligent terminal applications.

Ⅰ API Description

1 Authentication

1.1 X-TOKEN Calculation

Input Parameters

1. Request body: data = {"xxxx": "xxxx"}, or {} when the request has no body

2. Secret key assigned to the external user: secret = "iamsecret"

3. Request path, including the query string but excluding the host: api_path = "/xxx/xxx?xx=xxx"

Calculation Steps:

1. Convert api_path to lowercase: lower_api_path

2. Convert the request method to lowercase: lower_method (e.g., "delete"/"post"/...)

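3. Convert data to a JSON string with sorted keys and all spaces removed: sort_json_str. In Python, for example: json.dumps(dict(data), sort_keys=True).replace(' ', ''). Note that replace(' ', '') also removes spaces inside string values.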

4. Concatenate the strings in the following order to form the sign: lower_api_path + lower_method + sort_json_str + secret + X-TIMESTAMP. X-TIMESTAMP is the request timestamp in seconds and must be within 60s of the current time. Example sign for a POST request: /xxx/xxx?xx=xxxpost{"xxxx":"xxxx"}iamsecret1489133053

5. Encode the sign as UTF-8 and compute its MD5 hex digest to get X-TOKEN, a 32-character hexadecimal string such as ddc6457fd0b373475ac65912b797ef05
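The steps above can be traced with a short worked example. This is only a sketch using the placeholder path, secret, and timestamp from this section; build_sign_string and compute_token are hypothetical helper names chosen for illustration, not part of the API.

```python
import hashlib
import json


def build_sign_string(api_path, method, data, secret, timestamp):
    """Concatenate the signature inputs in the documented order."""
    lower_api_path = api_path.lower()   # step 1
    lower_method = method.lower()       # step 2
    # Step 3: sorted keys, every space removed (including spaces inside values).
    sort_json_str = json.dumps(dict(data), sort_keys=True).replace(' ', '')
    # Step 4: path + method + JSON + secret + timestamp.
    return f"{lower_api_path}{lower_method}{sort_json_str}{secret}{timestamp}"


def compute_token(sign):
    """Step 5: MD5 over the UTF-8 bytes, returned as lowercase hex."""
    return hashlib.md5(sign.encode('utf-8')).hexdigest()


sign = build_sign_string("/xxx/xxx?xx=xxx", "POST", {"xxxx": "xxxx"},
                         "iamsecret", 1489133053)
print(sign)                       # /xxx/xxx?xx=xxxpost{"xxxx":"xxxx"}iamsecret1489133053
print(len(compute_token(sign)))   # 32
```

Printing the intermediate sign string before hashing is a convenient way to compare a client implementation against the server's expectation when a token is rejected.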

1.2 Interface Call

Add the following headers to every request:

X-APP-ID: Application access key (AK)

X-TIMESTAMP: Timestamp in seconds, the same value used in the signature

X-TOKEN: Result of the signature calculation

1.3 Demo Code

import time
import json
import hashlib
import requests
from urllib.parse import urljoin


def encode_with_md5(s):
    m = hashlib.md5()
    m.update(s.encode('utf-8'))
    return m.hexdigest()


def headers_need_sign(ak, secret, method, url, data):
    headers = {}
    t = int(time.time())

    # Ensure data is sorted and spaces are removed
    data = json.dumps(dict(data), sort_keys=True).replace(' ', '')
    ori_sign = '{0}{1}{2}{3}{4}'.format(url.lower(), method.lower(), data, secret, t)
    sign = encode_with_md5(ori_sign)
    headers["X-APP-ID"] = ak
    headers["X-TOKEN"] = sign
    headers["X-TIMESTAMP"] = str(t)
    return headers


if __name__ == '__main__':
    ak = '37514ac0-3fce-4f4c-bc3f-86eba37da7dd'
    secret = 'bb81b706-ef1f-443e-9e86-9df8399f796b'
    method = 'POST'
    host = 'https://nebula-agent.xingyun3d.com'
    url = '/xxx/xxx?x=xx&z=zz'
    req_data = {
        "data1": "data1",
        "data2": "data2",
    }

    # Calculate and get request headers
    req_headers = headers_need_sign(ak, secret, method, url, req_data)

    # Request interface
    req_url = urljoin(host, url)
    resp = requests.request(method, req_url, json=req_data, headers=req_headers)

2 Call Description

2.1 Websocket Voice Trial

Host: nebula-agent.xingyun3d.com

Request Path: wss://{host}/user/v1/ws/tts

Request Parameters (Params)

| Parameter Name | Type | Name | Required | Remarks |
| --- | --- | --- | --- | --- |
| tts_vcn | string | Voice Name | Yes | Name of the voice used for synthesis |
| text | text | Corpus | Yes | Text content to be synthesized into audio |

Message

| Parameter Name | Type | Name | Required | Remarks |
| --- | --- | --- | --- | --- |
| text | text | Corpus | Yes | Text content to be synthesized into audio |
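As an illustration of how the connection URL and the message payload might be assembled, here is a minimal sketch. It follows the demo at the end of this document, where only tts_vcn is placed in the query string and text is sent as a message; whether text must also appear in the query string, and whether the authentication headers are required on the handshake, is not stated here. build_ws_url and build_text_message are hypothetical helper names.

```python
import json
from urllib.parse import urlencode

WS_HOST = "nebula-agent.xingyun3d.com"   # host name only, without a scheme
WS_PATH = "/user/v1/ws/tts"


def build_ws_url(tts_vcn, host=WS_HOST):
    """Return the wss:// URL with tts_vcn passed as a query parameter."""
    query = urlencode({"tts_vcn": tts_vcn})
    return f"wss://{host}{WS_PATH}?{query}"


def build_text_message(text):
    """Serialize the JSON message sent over the open connection."""
    return json.dumps({"text": text}, ensure_ascii=False)


print(build_ws_url("XMOV_LV_TTS__13"))
print(build_text_message("This is a test data"))
```

The resulting URL and message string can be handed to any WebSocket client library; the sketch deliberately stops short of opening a connection.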

Return Parameters

| Level 1 Parameter | Type | Name | Remarks |
| --- | --- | --- | --- |
| error_code | int | Error Code | 0: Success; Others: Error |
| error_reason | string | Error Reason | |
| data | string | Audio Data | |
| start_time | float | Speech start time (seconds) | |
| end_time | float | Speech end time (seconds) | |
| inference_end | bool | Is Finished | |


2.2 Create Speech Synthesis Task

Host: https://nebula-agent.xingyun3d.com

Request Path: POST: /user/v1/tts_task/create_tts_task

Request Parameters

| Parameter Name | Type | Name | Required | Remarks |
| --- | --- | --- | --- | --- |
| audio_name | string | Audio Name | No | Name of the output audio, also used as the download filename. Optional; when omitted, the name is generated from the time. |
| tts_vcn | string | Voice Name | Yes | Name of the voice used for synthesis |
| text | text | Corpus | Yes | Text content to be synthesized into audio |
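The parameters above map onto a JSON body. The sketch below shows one way to build it and what the body looks like in the canonical form used for signing (section 1.1). build_create_task_body and canonical_json are hypothetical helper names; leaving audio_name out of the body when it is not supplied is an assumption based on the field being optional.

```python
import json


def build_create_task_body(text, tts_vcn, audio_name=None):
    """Build the create_tts_task body; audio_name is omitted when not given."""
    if not text or not tts_vcn:
        raise ValueError("text and tts_vcn are required")
    body = {"text": text, "tts_vcn": tts_vcn}
    if audio_name is not None:
        body["audio_name"] = audio_name
    return body


def canonical_json(body):
    """JSON form used in the signature: sorted keys, all spaces removed."""
    return json.dumps(dict(body), sort_keys=True).replace(' ', '')


body = build_create_task_body("Hello world", "XMOV_LV_TTS__13")
print(canonical_json(body))  # {"text":"Helloworld","tts_vcn":"XMOV_LV_TTS__13"}
```

Note that the space inside "Hello world" disappears in the signed string, while the body actually sent keeps it. With Python's default json.dumps settings, non-ASCII characters are written as \uXXXX escapes in the signed string.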

Return Parameters

| Level 1 Parameter | Level 2 Parameter | Type | Name | Remarks |
| --- | --- | --- | --- | --- |
| error_code | | int | Error Code | 0: Success; Others: Error |
| error_reason | | string | Error Reason | |
| data | | dict | | |
| | task_id | int | Task ID | |


2.3 Query Task Result

Host: https://nebula-agent.xingyun3d.com

Request Path: GET: /user/v1/tts_task/get_tts_task

Request Parameters

| Parameter Name | Type | Name | Required | Remarks |
| --- | --- | --- | --- | --- |
| task_id | int | Task ID | Yes | |


Return Parameters

| Level 1 Parameter | Level 2 Parameter | Type | Name | Remarks |
| --- | --- | --- | --- | --- |
| error_code | | int | Error Code | 0: Success; Others: Error |
| error_reason | | string | Error Reason | |
| data | | dict | | |
| | task_id | int | Task ID | |
| | synth_status | string | Task Status | not_send: Queuing; processing: Synthesizing; finished: Completed; error: Failed; canceled: Canceled |
| | file_oss | string | File Path | This field has a value only when the task status is "finished" |
| | synth_start_time | datetime | Synthesis Start Time | |
| | synth_finish_time | datetime | Synthesis Finish Time | This field has a value only when the task status is "finished" |
| | error_reason | string | Error Log | |
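Because synthesis is asynchronous, a client typically queries the task repeatedly until it stops changing. The following is a minimal polling sketch, not a prescribed client: it assumes that finished, error, and canceled are terminal statuses, and it takes a caller-supplied fetch_task function (hypothetical) that performs the signed GET request and returns the decoded JSON response. The interval and attempt limit are arbitrary illustration values.

```python
import time

# Assumed terminal statuses, taken from the Task Status list above.
TERMINAL_STATUSES = {"finished", "error", "canceled"}


def poll_task(fetch_task, task_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_task(task_id) until the task reaches a terminal status."""
    for attempt in range(max_attempts):
        resp = fetch_task(task_id)
        if resp.get("error_code") != 0:
            raise RuntimeError(f"query failed: {resp.get('error_reason')}")
        data = resp.get("data") or {}
        if data.get("synth_status") in TERMINAL_STATUSES:
            return data
        if attempt < max_attempts - 1:
            sleep(interval)   # wait before the next query
    raise TimeoutError(f"task {task_id} not done after {max_attempts} polls")


# Stubbed usage: two canned responses stand in for real HTTP calls.
canned = iter([
    {"error_code": 0, "error_reason": "", "data": {"synth_status": "processing"}},
    {"error_code": 0, "error_reason": "", "data": {"synth_status": "finished"}},
])
result = poll_task(lambda task_id: next(canned), task_id=10, sleep=lambda s: None)
print(result["synth_status"])  # finished
```

Passing the fetch and sleep functions in as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.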


2.4 Cancel Task

Host: https://nebula-agent.xingyun3d.com

Request Path: POST: /user/v1/tts_task/cancel_tts_task

Request Parameters

| Parameter Name | Type | Name | Required | Remarks |
| --- | --- | --- | --- | --- |
| task_id | int | Task ID | Yes | |


Return Parameters

| Level 1 Parameter | Level 2 Parameter | Type | Name | Remarks |
| --- | --- | --- | --- | --- |
| error_code | | int | Error Code | 0: Success; Others: Error |
| error_reason | | string | Error Reason | |


Error Codes

| Error Code | Error Description |
| --- | --- |
| 20001 | Application does not exist or cannot be used |
| 40001 | Voice trial error, please contact customer service |
| 40002 | Failed to create task, please try again or contact customer service |
| 40003 | Speech synthesis task not found |
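Every HTTP response in this document carries error_code and error_reason, so a client can check them in one place. The sketch below is one possible way to do that; TtsApiError and unwrap are hypothetical names, and the descriptions are copied from the error code list above.

```python
ERROR_DESCRIPTIONS = {
    20001: "Application does not exist or cannot be used",
    40001: "Voice trial error, please contact customer service",
    40002: "Failed to create task, please try again or contact customer service",
    40003: "Speech synthesis task not found",
}


class TtsApiError(Exception):
    """Raised when a response has a non-zero error_code."""

    def __init__(self, code, reason):
        self.code = code
        self.reason = reason
        description = ERROR_DESCRIPTIONS.get(code, "Unknown error")
        super().__init__(f"[{code}] {description}: {reason}")


def unwrap(response):
    """Return response['data'] on success, raise TtsApiError otherwise."""
    code = response.get("error_code")
    if code != 0:
        raise TtsApiError(code, response.get("error_reason", ""))
    return response.get("data")


print(unwrap({"error_code": 0, "error_reason": "", "data": {"task_id": 10}}))
```

Codes that are not in the list still raise, with an "Unknown error" description, since any non-zero value is documented as an error.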

DEMO Examples



Websocket Voice Trial

wss://{host}/user/v1/ws/tts
Params: tts_vcn: XMOV_LV_TTS__13
Message (sent by the client):
{
    "text": "This is a test data"
}

Messages (returned by the server):
{
    "data_type": "CHAR_TIME_MAP",
    "data": "[[\"This\", 0.0, 0.1349], [\"is\", 0.1349, 0.2383], [\"a\", 0.2383, 0.3133], [\"test\", 0.3133, 0.6611], [\"data\", 0.6611, 1.2096], [\"[PUNC]\", 1.2096, 1.725]]",
    "start_time": 0.0,
    "end_time": 0.0,
    "sentence_index": -1,
    "char_index": -1,
    "inference_end": false,
    "flush_buffer": false,
    "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
    "data_type": "AUDIO",
    "data": "AAAA",
    "start_time": 0.0,
    "end_time": 1.725000023841858,
    "sentence_index": -1,
    "char_index": -1,
    "inference_end": false,
    "flush_buffer": false,
    "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
    "data_type": "CHAR_TIME_MAP",
    "data": "",
    "start_time": 0.0,
    "end_time": 0.0,
    "sentence_index": -1,
    "char_index": -1,
    "inference_end": false,
    "flush_buffer": true,
    "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
{
    "data_type": "AUDIO",
    "data": "",
    "start_time": 0.0,
    "end_time": 0.0,
    "sentence_index": -1,
    "char_index": -1,
    "inference_end": true,
    "flush_buffer": false,
    "req_id": "ws-tts-564c57d3-6c50-4c01-90a4-4016547f6625-0"
}
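A client has to separate the returned messages by data_type and stop once inference_end is true. The sketch below shows one way to do this under two assumptions that this document does not confirm: that AUDIO payloads are Base64-encoded bytes, and that CHAR_TIME_MAP payloads are a JSON-encoded list of [token, start, end] entries, as the sample above suggests. parse_stream is a hypothetical helper name.

```python
import base64
import json


def parse_stream(raw_messages):
    """Split received messages into audio bytes and token timings."""
    audio = bytearray()
    timings = []
    for raw in raw_messages:
        msg = json.loads(raw)
        payload = msg.get("data", "")
        if msg.get("data_type") == "AUDIO" and payload:
            # Assumption: audio chunks are Base64 text.
            audio.extend(base64.b64decode(payload))
        elif msg.get("data_type") == "CHAR_TIME_MAP" and payload:
            # Assumption: the payload is itself a JSON string.
            timings.extend(tuple(item) for item in json.loads(payload))
        if msg.get("inference_end"):
            break   # last message of the synthesis
    return bytes(audio), timings


sample = [
    json.dumps({"data_type": "CHAR_TIME_MAP",
                "data": "[[\"This\", 0.0, 0.1349]]", "inference_end": False}),
    json.dumps({"data_type": "AUDIO", "data": "AAAA", "inference_end": False}),
    json.dumps({"data_type": "AUDIO", "data": "", "inference_end": True}),
]
audio_bytes, token_timings = parse_stream(sample)
print(len(audio_bytes), token_timings)
```

The function accepts any iterable of message strings, so it can be fed directly from a WebSocket receive loop or from recorded messages.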

Create Speech Synthesis Task

POST: /user/v1/tts_task/create_tts_task
Body:
{
    "text": "I would like to remind everyone here to go to bed early and get up late after winter comes, and ensure adequate sleep. You can exercise appropriately, just until your body feels slightly warm and sweats a little; do not exercise excessively. May you be warm at heart, healthy in body, and happy in family during this cold season.",
    "tts_vcn": "XMOV_LV_TTS__13"
}
Response:
{
    "error_code": 0,
    "error_reason": "",
    "data": {
        "task_id": 10
    }
}

Get Task Result

GET: /user/v1/tts_task/get_tts_task
Params: task_id: 10 # Task ID
Response:
{
    "error_code": 0,
    "error_reason": "",
    "data": {
        "id": 10,
        "synth_status": "waiting",
        "file_oss": "",
        "synth_start_time": null,
        "synth_finish_time": null,
        "error_reason": ""
    }
}

Cancel Task

POST: /user/v1/tts_task/cancel_tts_task
Body:
{
    "task_id": 10
}
Response:
{
    "error_code": 0,
    "error_reason": ""
}