API

Overview

The Xmov Embodia virtual human video generation API converts text and PPT files into high-quality virtual human videos, so developers can quickly add professional-grade video generation to their applications.

Ⅰ API Documentation

1. Authentication

1.1 X-TOKEN Calculation

Inputs to the calculation:

a. Request body: data={"xxx":"xxx"} or {}

b. Secret key issued to the caller: secret="iamsecret"

c. Interface path (excluding host): api_path="/xxx/xxxxxx"

Calculation steps:

1. Convert api_path to all lowercase: lower_api_path

2. Convert the request method to lowercase: lower_method (e.g., "delete"/"post"/...)

3. Serialize data to a JSON string with its keys sorted: sort_json_str

For example, in Python: json.dumps(dict(data), sort_keys=True).replace("'", "\"")

4. Concatenate strings in the following order: lower_api_path + lower_method + sort_json_str + secret + X-TIMESTAMP

a. X-TIMESTAMP: Unix timestamp in seconds; it must be within 60 seconds of the current time to be valid

b. Resulting string to sign (sign): "/xxx/xxxxxx" + "post" + "{\"xxx\":\"xxx\"}" + "iamsecret" + "1489133053"

5. Encode the sign in UTF-8 and calculate MD5 to get X-TOKEN: ddc6457fd0b373475ac65912b797ef05
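
The steps above can be traced with fixed inputs. The following is a minimal sketch, not the official implementation: the function names are illustrative, and the timestamp is pinned to the example value so the result is repeatable. Note that Python's json.dumps inserts a space after each colon and comma by default, so the serialized body here is {"xxx": "xxx"} rather than the compact form shown in step 4b; this document does not state which form the server expects.

```python
import hashlib
import json


def build_sign_string(api_path, method, data, secret, timestamp):
    """Concatenate the signing inputs in the order given in step 4."""
    lower_api_path = api_path.lower()          # step 1
    lower_method = method.lower()              # step 2
    # Step 3: same serialization call as shown in this document.
    sort_json_str = json.dumps(dict(data), sort_keys=True).replace("'", "\"")
    return f"{lower_api_path}{lower_method}{sort_json_str}{secret}{timestamp}"


def x_token(sign_string):
    """Step 5: MD5 hex digest of the UTF-8 encoded string."""
    return hashlib.md5(sign_string.encode("utf-8")).hexdigest()


sign = build_sign_string("/XXX/xxxxxx", "POST", {"xxx": "xxx"}, "iamsecret", 1489133053)
token = x_token(sign)
print(sign)   # the string that gets hashed
print(token)  # 32-character hexadecimal X-TOKEN
```

Printing the intermediate string is a convenient way to debug a rejected signature, because differences in case, key order, or whitespace are visible before hashing.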

1.2 Interface Call

Add the following headers when calling an interface:

X-APP-ID: Application AK
X-TIMESTAMP: Unix timestamp in seconds
X-TOKEN: Signature calculated as described in 1.1

1.3 Demo Code

import time
import json
import hashlib
import requests
from urllib.parse import urljoin

def encode_with_md5(s):
    m = hashlib.md5()
    m.update(s.encode('utf-8'))
    return m.hexdigest()

def headers_need_sign(ak, secret, method, url, data):
    headers = {}
    t = int(time.time())
    data_str = json.dumps(dict(data), sort_keys=True).replace("'", "\"")
    ori_sign = "{}{}{}{}{}".format(url.lower(), method.lower(), data_str, secret, t)
    sign = encode_with_md5(ori_sign)
    headers["X-APP-ID"] = ak
    headers["X-TOKEN"] = sign
    headers["X-TIMESTAMP"] = str(t)
    return headers

if __name__ == '__main__':
    ak = "37514ac-3fce-4f4c-bc3f-86eba37da7dd"
    secret = 'bb81b786-ef1f-443e-9e86-9df8399f796b'
    method = 'POST'
    host = 'https://nebula-agent.xingyun3d.com'
    url = '/xxx/xxx?x=xx&z=22'
    req_data = {
        "data1": "data1",
        "data2": "data2"
    }
    # Calculate and get request headers
    req_headers = headers_need_sign(ak, secret, method, url, req_data)
    # Request the interface
    req_url = urljoin(host, url)
    resp = requests.request(method, req_url, json=req_data, headers=req_headers)




2. API Call

2.1 Initiate Rendering

2.1.1 Method 1: Initiate rendering via segment

Host: https://nebula-agent.xingyun3d.com

Request Path: POST: /user/v1/video_synthesis_task/create_render_task

Request Parameters

| Parameter Name | Type | Name | Mandatory | Remarks |
|---|---|---|---|---|
| video_name | string | Video Name | No | Output video name, also used as the download file name. Length limit: 24 Chinese characters or 50 English characters. Defaults to a timestamp-generated name |
| look_name | string | Avatar ID | Yes | ID of the avatar used in the video |
| tts_vcn_name | string | Voice ID | Yes | ID of the voice used in the video |
| studio_name | string | Studio ID | Yes | ID of the studio used in the video |
| sub_title | string | Enable Subtitles | No | Enumerated values: on, off. Default: on |
| segment | JSON Array | SSML Script | No | - |
| output_resolution | string | Video Resolution | No | Enumerated values: 540P, 720P, 1080P, 2K, 4K. Default: 720P |
| if_aigc_mark | bool | AI Generation Mark | No | true: display "Xmov Embodia · AI Generated" in the lower right corner of the video; false: remove the AI mark from the video. Default: true |
| video_format | string | Video Format | No | Supported formats: mp4, mov, webm, mkv. Default: mp4. Format support by resolution: 540P: mp4; 720P: mp4; 1080P and above: mp4, mov, webm, mkv |

Return Parameters

| First-level Parameter Name | Second-level Parameter Name | Type | Name | Remarks |
|---|---|---|---|---|
| error_code | - | int | Error Code | 0: Success; Others: Error |
| error_reason | - | string | Error Reason | - |
| data | - | dict | Data | - |
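
The format-by-resolution rule in the request parameters can be checked on the client before a task is submitted. The sketch below is illustrative only: the function name is hypothetical, and "1080P and above" is assumed to mean 1080P, 2K, and 4K.

```python
# Formats allowed for each output_resolution, taken from the video_format remarks.
# Assumption: "1080P and above" covers 1080P, 2K and 4K.
ALL_FORMATS = {"mp4", "mov", "webm", "mkv"}
FORMATS_BY_RESOLUTION = {
    "540P": {"mp4"},
    "720P": {"mp4"},
    "1080P": ALL_FORMATS,
    "2K": ALL_FORMATS,
    "4K": ALL_FORMATS,
}


def check_output_options(output_resolution="720P", video_format="mp4"):
    """Return the two options as a dict, or raise ValueError if they conflict."""
    allowed = FORMATS_BY_RESOLUTION.get(output_resolution)
    if allowed is None:
        raise ValueError(f"unsupported output_resolution: {output_resolution!r}")
    if video_format not in allowed:
        raise ValueError(
            f"video_format {video_format!r} is not available at {output_resolution}"
        )
    return {"output_resolution": output_resolution, "video_format": video_format}


print(check_output_options("1080P", "webm"))
```

The defaults mirror the documented defaults (720P and mp4), so calling the function with no arguments yields a valid combination.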


Use the segment Tag for Phonetic Notation and Pauses

Phonetic Notation Examples

Single-character Phonetic Notation Illustration

<phoneme contenteditable="false" data-text="认" py="ʐʅn4">认</phoneme>

Multi-character Phonetic Notation Illustration

<phoneme contenteditable="false" data-text="你们" py="nɪn3 haʊ3">你们</phoneme>

Pause Example

<break time="1000ms"></break>

Complete Example

[
    {
        "text": "Dear audience friends, hello everyone.<phoneme contenteditable=\"false\" data-text=\"认\" py=\"ʐʅn4\">认</phoneme>真收看以下内容哦。欢迎收看大众电视台社会民生栏目,我是主播小朱,现在为您带来最新的民生资讯。",
        "media": "https://media.yoyan.yyz/yoyan/user_upload.prod/9957_5c339a8b1914973a542c.png"
    },
    {
        "text": "The above is today's livelihood news report.<break time=\"1000ms\"></break> We will continue to follow social livelihood dynamics and bring you the latest information. Thank you for watching. See you next time!",
        "media": "https://media.yoyan.yyz/yoyan/user_upload.prod/9957_5c339a8b1914973a542c.png"
    }
]
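
Writing these tags by hand inside JSON strings is error-prone because every quote must be escaped. A small helper can generate them instead. This is a sketch under stated assumptions: the helper names are hypothetical, the media URL is a placeholder, and the media key defaults to "media" as in the complete example above (the demo request later in this document uses "media_url").

```python
import json


def phoneme(text, py):
    """Wrap text in a phoneme tag carrying the given pronunciation."""
    return (
        f'<phoneme contenteditable="false" data-text="{text}" py="{py}">'
        f"{text}</phoneme>"
    )


def pause(milliseconds):
    """Build a break tag for a pause of the given length."""
    return f'<break time="{int(milliseconds)}ms"></break>'


def make_segment(text, media, media_key="media"):
    """Build one segment entry; media_key is configurable (see note above)."""
    return {"text": text, media_key: media}


segments = [
    make_segment(
        "Hello everyone." + pause(1000) + phoneme("认", "ʐʅn4") + "真收看以下内容哦。",
        "https://example.com/slide-1.png",  # placeholder URL
    )
]
# json.dumps handles the quote escaping; ensure_ascii=False keeps Chinese readable.
print(json.dumps(segments, ensure_ascii=False, indent=4))
```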

2.1.2 Method 2: Initiate Rendering via PPT

First call the PPT parsing interface, then call the create rendering task interface.

(1) PPT Parsing Interface

Host: https://nebula-agent.xingyun3d.com

Request Path: POST: /user/v1/video_synthesis_task/parse_ppt_file

Request Parameters

| Parameter Name | Type | Name | Mandatory | Remarks |
|---|---|---|---|---|
| ppt_file | binary | PPT File | Mandatory | This parameter is not included in the X-TOKEN calculation |

Return Parameters

| First-level Parameter Name | Second-level Parameter Name | Type | Name | Remarks |
|---|---|---|---|---|
| error_code | - | int | Error Code | 0: Success; Others: Error |
| error_reason | - | string | Error Reason | - |
| data | - | dict | - | - |
| data | parse_ppt_file_name | string | PPT File Parsing Name | - |
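
Because every response wraps its payload in error_code, error_reason, and data, it can help to check the envelope before reading a nested field such as parse_ppt_file_name. The following is a minimal sketch; ApiError, unwrap, and parsed_ppt_name are hypothetical names, not part of the API.

```python
class ApiError(Exception):
    """Raised when a response carries a non-zero error_code."""

    def __init__(self, code, reason):
        super().__init__(f"error_code={code}: {reason}")
        self.code = code
        self.reason = reason


def unwrap(response):
    """Return the data dict of a decoded response, or raise ApiError."""
    code = response.get("error_code")
    if code != 0:
        raise ApiError(code, response.get("error_reason", ""))
    return response.get("data") or {}


def parsed_ppt_name(response):
    """Extract parse_ppt_file_name from a PPT parsing response."""
    name = unwrap(response).get("parse_ppt_file_name")
    if not name:
        raise ValueError("response has no parse_ppt_file_name")
    return name


# Sample responses shaped like the return parameters above (values are made up).
ok = {"error_code": 0, "error_reason": "", "data": {"parse_ppt_file_name": "demo_name"}}
print(parsed_ppt_name(ok))
```

Failing early here avoids the AttributeError that a chained .get() call raises when data is missing from an error response.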

(2) Create Rendering Task Interface

Host: https://nebula-agent.xingyun3d.com

Request Path: POST: /user/v1/video_synthesis_task/create_render_task

Request Parameters

| Parameter Name | Type | Name | Mandatory | Remarks |
|---|---|---|---|---|
| video_name | string | Video Name | Optional | Output video name, also used as the download file name. Default: timestamp-generated name |
| look_name | string | Avatar ID | Mandatory | Avatar ID used in the video |
| tts_vcn_name | string | Voice ID | Mandatory | Voice ID used in the video |
| studio_name | string | Studio ID | Mandatory | Studio ID used in the video |
| sub_title | string | Enable Subtitles | Optional | Whether to enable subtitles. Enumerated values: on, off. Default: on |
| parse_ppt_file_name | string | PPT File Parsing Name | Mandatory | Obtained via the PPT parsing interface |
| output_resolution | string | Video Resolution | Optional | Video resolution. Enumerated values: 540P, 720P, 1080P, 2K, 4K. Default: 720P |
| if_aigc_mark | bool | AI Generation Mark | Optional | Whether to add the AI generation mark. true: display "Xmov Embodia · AI Generated" in the lower right corner of the video; false: remove the AI mark. Default: true |

Return Parameters

| First-level Parameter Name | Second-level Parameter Name | Type | Name | Remarks |
|---|---|---|---|---|
| error_code | - | int | Error Code | 0: Success; Others: Error |
| error_reason | - | string | Error Reason | - |
| data | - | dict | - | - |
| data | task_id | int | Video Task ID | - |
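
Both rendering methods post to the same create_render_task path and differ only in whether the body carries segment or parse_ppt_file_name. A request body for either method could be assembled as in the sketch below. The function name is hypothetical, and the check that at least one content source is present follows the note in the demo request ("Either segment or PPT file is required"); how the server treats a body containing both is not stated in this document.

```python
# Optional keys listed in the request parameter tables
# (video_format appears only in the Method 1 table).
OPTIONAL_KEYS = {
    "video_name", "sub_title", "output_resolution", "if_aigc_mark", "video_format",
}


def build_render_payload(look_name, tts_vcn_name, studio_name,
                         segment=None, parse_ppt_file_name=None, **optional):
    """Assemble a create_render_task body from the mandatory and optional fields."""
    if not segment and not parse_ppt_file_name:
        raise ValueError("either segment or parse_ppt_file_name is required")
    unknown = set(optional) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {
        "look_name": look_name,
        "tts_vcn_name": tts_vcn_name,
        "studio_name": studio_name,
    }
    if segment:
        payload["segment"] = segment
    if parse_ppt_file_name:
        payload["parse_ppt_file_name"] = parse_ppt_file_name
    payload.update(optional)
    return payload


# IDs below are placeholders, not real avatar, voice, or studio IDs.
body = build_render_payload(
    "my_avatar", "my_voice", "my_studio",
    parse_ppt_file_name="parsed_name_from_step_1",
    sub_title="on",
)
print(body)
```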



2.2 Query Rendering Result

Host: https://nebula-agent.xingyun3d.com

Request Path: GET: /user/v1/video_synthesis_task/get_render_task

Request Parameters

| Parameter Name | Type | Name | Mandatory | Remarks |
|---|---|---|---|---|
| task_id | int | Video Task ID | Mandatory | - |

Return Parameters

| First-level Parameter Name | Second-level Parameter Name | Type | Name | Remarks |
|---|---|---|---|---|
| error_code | - | int | Error Code | 0: Success; Others: Error |
| error_reason | - | string | Error Reason | - |
| data | - | dict | - | - |
| data | task_id | int | Video Task ID | - |
| data | synth_state | string | Task Status | Enumerated values: not_send: Pending; waiting: Processing; processing: In Progress; finished: Completed; error: Synthesis Failed; cancel: Synthesis Cancelled |
| data | render_image_oss | string | Rendered Image OSS | OSS link of the rendered image. Valid only when the cloud task status is successful |
| data | render_video_oss | string | Rendered Video OSS | OSS link of the rendered video. Valid only when the cloud task status is successful |
| data | amount | float | Points Consumed | - |
| data | synth_start_time | datetime | Synthesis Start Time | Video synthesis start time |
| data | synth_finish_time | datetime | Synthesis Completion Time | Video synthesis completion time. Valid only when the cloud task status is successful |
| data | error_reason | string | Error Log | Log info recorded after cloud task creation, e.g.: a. PPT file parsing stuck; b. Synthesis task creation failed; c. Limit check failed; d. Voice task creation failed |
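
Since rendering is asynchronous, a client typically polls this interface until synth_state reaches a final value. The sketch below shows one way to do that. It assumes finished, error, and cancel are the final states; the polling interval and attempt limit are arbitrary choices, not documented values; and fetch_task stands for any caller-supplied function that performs the signed GET request and returns the decoded response.

```python
import time

# Assumption: these three synth_state values mean the task will not change again.
FINAL_STATES = {"finished", "error", "cancel"}


def wait_for_render(fetch_task, task_id, interval=5.0, max_attempts=120,
                    sleep=time.sleep):
    """Poll fetch_task(task_id) until the task reaches a final state.

    Returns the data dict of the last response. Raises RuntimeError on a
    non-zero error_code and TimeoutError if no final state is reached.
    """
    for _ in range(max_attempts):
        response = fetch_task(task_id)
        if response.get("error_code") != 0:
            raise RuntimeError(
                f"query failed: {response.get('error_code')} "
                f"{response.get('error_reason', '')}"
            )
        data = response.get("data") or {}
        if data.get("synth_state") in FINAL_STATES:
            return data
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")


# Demonstration with a stand-in for the real request function.
_states = iter(["not_send", "waiting", "processing", "finished"])


def fake_fetch(task_id):
    return {"error_code": 0, "error_reason": "",
            "data": {"task_id": task_id, "synth_state": next(_states)}}


result = wait_for_render(fake_fetch, 155, interval=0, sleep=lambda seconds: None)
print(result["synth_state"])
```

Passing the fetch and sleep functions as arguments keeps the loop independent of the HTTP client and easy to exercise without network access.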



2.3 Cancel Rendering Task

Host: https://nebula-agent.xingyun3d.com

Request Path: POST: /user/v1/video_synthesis_task/cancel_render_task

Request Parameters

| Parameter Name | Type | Name | Mandatory | Remarks |
|---|---|---|---|---|
| task_id | int | Video Task ID | Mandatory | - |

Return Parameters

| First-level Parameter Name | Second-level Parameter Name | Type | Name | Remarks |
|---|---|---|---|---|
| error_code | - | int | Error Code | 0: Success; Others: Error |
| error_reason | - | string | Error Reason | - |



2.4 Preview

Host: https://nebula-agent.xingyun3d.com

Request Path: GET: /user/v1/video_synthesis_task/get_render_task_preview_url

Request Parameters

| Parameter Name | Type | Name | Mandatory | Remarks |
|---|---|---|---|---|
| task_id | int | Video Task ID | Mandatory | - |

Return Parameters

| First-level Parameter Name | Second-level Parameter Name | Type | Name | Remarks |
|---|---|---|---|---|
| error_code | - | int | Error Code | 0: Success; Others: Error |
| error_reason | - | string | Error Reason | - |
| data | - | dict | - | - |
| data | preview_url | string | Preview URL | - |




3 Error Codes

| Error Code | Description |
|---|---|
| 20001 | Application does not exist or is unavailable |
| 30002 | PPT file does not exist |
| 30003 | PPT file parsing error |
| 30004 | Video task not found |
| 30005 | Video task creation error |
| 30006 | Video task cancellation error |
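
The codes above can be kept in a lookup table so that client logs show a readable message alongside the number. This is an illustrative sketch; the names are hypothetical, and codes not listed in this document fall back to the error_reason returned by the server.

```python
# Descriptions copied from the error code table above.
ERROR_DESCRIPTIONS = {
    20001: "Application does not exist or is unavailable",
    30002: "PPT file does not exist",
    30003: "PPT file parsing error",
    30004: "Video task not found",
    30005: "Video task creation error",
    30006: "Video task cancellation error",
}


def describe_error(error_code, error_reason=""):
    """Return a readable message for an error_code from any response."""
    if error_code == 0:
        return "Success"
    description = ERROR_DESCRIPTIONS.get(error_code)
    if description is None:
        # Undocumented code: report whatever the server said.
        return f"Error {error_code}: {error_reason or 'no reason given'}"
    return f"Error {error_code}: {description}"


print(describe_error(30004))
```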

Ⅱ DEMO Example

1. Initiate Rendering

1.1 Method Ⅰ: Initiate Rendering via segment


POST: /user/v1/video_synthesis_task/create_render_task
Body:
{
    "look_name": "WWJS_4p_9021_new",
    "tts_vcn_name": "XMOV_HN_TTS__236",
    "studio_name": "youling_2d_v",
    "segment": [
        {
            "text": "Promote high-quality development of regional industries, advance proper application of digital intelligence technology, and enhance friendly exchanges and cooperation among enterprises.",
            "media_url": "https://media.xmov.ai/youyan/user_upload_qa/74275_4b733edeca68484e8940da.png"
        },
        {
            "text": "The 14th Five-Year Plan focuses on high-quality development, integrates multiple industrial resources, gathers development momentum, and will inject new impetus into regional industrial development in 2024.",
            "media_url": "https://media.xmov.ai/youyan/user_upload_qa/74275_1ca7a4a5407c43acb0da8b.mp4"
        },
        {
            "text": "Meanwhile, we will conduct multi-dimensional, multi-level, multi-field discussions, inviting entrepreneurs, well-known enterprises, and industry leaders to share the latest development trends and opportunities of the industry.",
            "media_url": "https://media.yoyan.yyz/yoyan/user_upload/prod/74275_1ca7a5647d3acb.mp4"
        }
    ]
    // Either segment or a PPT file is required:
    // "ppt_file": "example_ppt.pptx" (binary)
}

Response:
{
    "error_code": 0,
    "error_reason": "",
    "data": {
        "task_id": 135
    }
}


1.2 Method Ⅱ: Initiate Rendering via PPT

# coding=utf-8
import hashlib
import json
import time
import requests

def generate_x_token(data, secret, api_path, method, timestamp=None):
    # Convert api_path to lowercase
    lower_api_path = api_path.lower()
    # Convert request method to lowercase
    lower_method = method.lower()
    # Convert data to a sorted JSON string
    sort_json_str = json.dumps(dict(data), sort_keys=True).replace("'", "\"")
    # Use the passed timestamp or get current timestamp
    x_timestamp = str(int(timestamp) if timestamp else int(time.time()))
    # Concatenate strings for signature calculation
    sign_str = f"{lower_api_path}{lower_method}{sort_json_str}{secret}{x_timestamp}"
    # Generate MD5 signature and return
    return hashlib.md5(sign_str.encode('utf-8')).hexdigest()

# Replace with your application's App ID
app_id = 'xxxxx'
# Replace with your application's App Secret
secret = 'xxxxx'
host = 'https://nebula-agent.xingyun3d.com'

# PPT Parsing Interface
files = [
    ('ppt_file', ('PPT模板.pptx', open('PPT模板.pptx', 'rb'), 'application/vnd.openxmlformats-officedocument.presentationml.presentation'))
]
# The file is not part of the X-TOKEN calculation, so the signed data is empty
data = {}
api_path = '/user/v1/video_synthesis_task/parse_ppt_file'
method = 'POST'
timestamp = int(time.time())

headers = {}
headers["X-APP-ID"] = app_id
# Generate X-TOKEN
headers["X-TOKEN"] = generate_x_token(data, secret, api_path, method, timestamp)
headers["X-TIMESTAMP"] = str(timestamp)

# Send PPT parsing request
resp = requests.request(method, f"{host}{api_path}", data=data, headers=headers, files=files, timeout=30)
res = resp.json()
print(res)
# Get parsed PPT file name from response
parse_ppt_file_name = res.get('data').get('parse_ppt_file_name')

# Create Render Task Interface
data = {
    "look_name": "caishengwei_3663_new",
    "tts_vcn_name": "XMW_FM_TTS_13",
    "if_aigc_mark": True,
    "studio_name": "telestudio_simple_red_01",
    "sub_title": "on",
    "video_name": "Test Video Generated by API",
    "parse_ppt_file_name": parse_ppt_file_name
}
api_path = '/user/v1/video_synthesis_task/create_render_task'
method = 'POST'
timestamp = int(time.time())

headers["Content-Type"] = "application/json"
headers["X-APP-ID"] = app_id
headers["X-TOKEN"] = generate_x_token(data, secret, api_path, method, timestamp)
headers["X-TIMESTAMP"] = str(timestamp)

# Send create render task request
resp = requests.request(method, f"{host}{api_path}", json=data, headers=headers, timeout=30)
res = resp.json()
print(res)
# Get task ID from response
task_id = res.get('data').get('task_id')
data = {
    "task_id": task_id
}




2. Get Video Result

GET: /user/v1/video_synthesis_task/get_render_task
Params: task_id: 155 # Task ID
Response:
{
    "error_code": 0,
    "error_reason": "",
    "data": {
        "task_id": 155,
        "error_reason": "Synthesis task failed, please re-initiate or contact customer service for handling",
        "create_time": "2025-07-17T11:30:08.637104800",
        "update_time": "2025-07-17T11:30:54.131848080",
        "enable": true,
        "name": "b5b7f20c0c64e72f9fda59a4a392667",
        "video_name": "20250717_11_30_07.833",
        "output_resolution": "540P",
        "look_name": "AM058_19518_new",
        "tts_vcn_name": "XMW_HM_TTS_6",
        "studio_name": "bust_chic_an_museum_01",
        "sub_title": "on",
        "synth_start_time": null,
        "synth_finish_time": null,
        "synth_state": "error",
        "segment": [
            {
                "text": "This is a piece of test data.",
                "media_id": 12338,
                "media_url": "https://media.xmov.ai/yoyuan/user_upload/qa/171998_b37eb4358c64bcf5fd4be.png"
            },
            {
                "text": "Test segment 1",
                "media_id": 12339,
                "media_url": "https://media.xmov.ai/yoyuan/user_upload/qa/171998_b3195995e6d4cfa380.png"
            },
            {
                "text": "Test segment 2",
                "media_id": 12340,
                "media_url": "https://media.xmov.ai/yoyuan/user_upload/qa/171998_3d1838496914b831ce6.png"
            },
            {
                "text": "Test segment 3",
                "media_id": 12341,
                "media_url": "https://media.xmov.ai/yoyuan/user_upload/qa/171998_0cefd3514347e4d7ad18b1.png"
            },
            {
                "text": "Test segment 4",
                "media_id": 12342,
                "media_url": "https://media.xmov.ai/yoyuan/user_upload/qa/171998_97acf1318a0489f5b6ae.png"
            }
        ]
    }
}

# Task ID
task_id: 155
Response:
{
    "error_code": 0,
    "error_reason": ""
}




3. Preview

  • Request Path: GET: /user/v1/video_synthesis_task/get_render_task_preview_url
