AI电影解说视频自动生成技能(AI解说大师 CLI Skill)。当用户需要创建电影解说视频、短剧解说、影视二创、AI配音旁白视频、film commentary、video narration、drama dubbing、movie narration时触发。内置93部电影素材、146首BGM、63种配音音色(11种语言)、90+解说模板。通过narrator-ai-cli命令行工具实现:搜片选片→选择模板→选BGM→选配音→生成文案→合成视频的全流程自动化。CLI client for Narrator AI (AI解说大师) video narration API. Use when user needs to create AI narration videos, manage narration tasks, browse dubbing/BGM/material resources, or automate video production.
CLI client for Narrator AI video narration API. Designed for AI Agents and developers.
CLI Repo: https://github.com/GridLtd-ProductDev/narrator-ai-cli Resources Preview: https://ceex7z9m67.feishu.cn/wiki/WLPnwBysairenFkZDbicZOfKnbc
# From GitHub release (recommended — pinned to a specific version)
pip install "narrator-ai-cli @ https://github.com/GridLtd-ProductDev/narrator-ai-cli/archive/refs/tags/v1.0.0.zip"
# Or from GitHub latest (tracks main branch)
pip install "narrator-ai-cli @ git+https://github.com/GridLtd-ProductDev/narrator-ai-cli.git"
# Or clone + editable install
git clone https://github.com/GridLtd-ProductDev/narrator-ai-cli.git
cd narrator-ai-cli && pip install -e .
Requires Python 3.10+. Dependencies: typer, httpx[socks], httpx-sse, pyyaml, rich.
# Interactive setup (server URL + API key)
narrator-ai-cli config init
# Or set directly
narrator-ai-cli config set app_key <your_app_key>
# No API key yet? Contact support: WeChat `gezimufeng` or email [email protected]
# Verify
narrator-ai-cli config show
narrator-ai-cli user balance
Config stored at ~/.narrator-ai/config.yaml (permissions 0600).
Server defaults to https://openapi.jieshuo.cn.
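A typical config file might look like the following. This is a hypothetical sketch — the key names are assumptions inferred from `config set app_key` and the environment-variable table below; run `narrator-ai-cli config show` to see the actual schema.

```yaml
# ~/.narrator-ai/config.yaml — illustrative only; key names are assumptions
server: https://openapi.jieshuo.cn
app_key: <your_app_key>
timeout: 30
```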
Environment variable overrides (take precedence over config file):
| Variable | Description | Default |
|---|---|---|
NARRATOR_SERVER | API server URL | https://openapi.jieshuo.cn |
NARRATOR_APP_KEY | API key | (from config) |
NARRATOR_TIMEOUT | Request timeout in seconds | 30 |
src/narrator_ai/
├── cli.py # Typer main entry point, 7 sub-command groups
├── client.py # httpx client: GET/POST/DELETE/SSE/upload, auto auth via app-key header
├── config.py # YAML config (~/.narrator-ai/config.yaml), env var override
├── output.py # Rich table + JSON dual output (--json flag)
├── commands/
│ ├── config_cmd.py # config init/show/set
│ ├── user.py # balance/login/keys/create-key
│ ├── task.py # 9 task types, create/query/list/budget/verify/search-movie/narration-styles/templates/get-writing/save-writing/save-clip
│ ├── file.py # 3-step upload (presigned URL → OSS PUT → callback), download/list/info/storage/delete
│ ├── materials.py # 100+ pre-built movies (--page/--size pagination; no --genre/--search, filter locally)
│ ├── bgm.py # 146 BGM tracks (--search filter)
│ └── dubbing.py # 63 voices, 11 languages (--lang, --tag, --search filters)
└── models/
└── responses.py # API response codes (SUCCESS=10000, FAILED=10001, etc.) + task status constants
Key design choices:
- `--json` for machine-readable output (always use when parsing programmatically)
- Request bodies as JSON via `-d '{"key": "value"}'` or `-d @file.json`
- Authentication via `app-key` header (not Bearer token)
- SSE streaming via `--stream`

| Concept | Description |
|---|---|
| file_id | UUID for uploaded files. Via file upload or task results |
| task_id | UUID returned on task creation. Poll with task query |
| task_order_num | Assigned after task creation. Used as order_num for downstream tasks |
| file_ids | Output file IDs in completed task results. Input for next steps |
| learning_model_id | Narration style model. From popular-learning OR pre-built template (90+) |
| learning_srt | Reference SRT file_id. Only needed when NOT using learning_model_id |
Standard path: material list (local search) → [file upload if not in materials] → popular-learning → generate-writing → clip-data → video-composing → magic-video (optional)
Fast path: material list (local search) → [search-movie if not in materials] → fast-writing → fast-clip-data → video-composing → magic-video (optional)
⚠️ Agent behavior: Before starting, always ask the user which path to use — Standard (二创文案, adapted narration) or Fast (原创文案, original narration; recommended). Do not auto-select a path.
| Mode | Name | Required Input |
|---|---|---|
"1" | 热门影视 (纯解说) | confirmed_movie_json (from material data or search-movie); no episodes_data |
"2" | 原声混剪 (Original Mix) | confirmed_movie_json + episodes_data[{srt_oss_key, num}] required |
"3" | 冷门/新剧 (New Drama) | episodes_data[{srt_oss_key, num}] required; confirmed_movie_json optional |
All resource selection steps require user confirmation before proceeding. Follow these rules at every resource step:
Before creating any task, gather these resources first.
⚠️ Agent behavior: Use `material list --json --page 1 --size 100` to fetch pre-built materials. Check the `total` field in the response — if `total > 100`, fetch additional pages until all items are retrieved. Search programmatically using `grep` or `python3 -c` piped from the JSON output — do NOT rely on the terminal display, which may be truncated and can miss items. Present all matching results (usually ≤ 3) to the user — show title, year, genre, and summary. Wait for the user to pick one before proceeding. If the user wants to upload their own files, guide them through the `file upload` flow for both video and SRT. Do NOT proceed to any writing step until `video_file_id` and `srt_file_id` are confirmed by the user.
# Option A: Pre-built materials (90+ movies, recommended)
narrator-ai-cli material list --json --page 1 --size 100
# If total > 100, fetch more pages: --page 2 --size 100, etc., until all items are retrieved
Response structure:
{
"total": 101,
"page": 1,
"size": 100,
"items": [
{
"id": "<material_id>",
"name": "极限职业",
"title": "Extreme Job",
"year": "2019",
"type": "喜剧片",
"story_info": "...",
"character_name": "[柳承龙 (Ryu Seung-ryong), 李荷妮 (Lee Ha-nee), ...]",
"cover": "https://...",
"video_file_id": "<video_file_id>",
"srt_file_id": "<srt_file_id>"
}
]
}
# Search programmatically (case-insensitive) — do NOT rely on truncated terminal output:
narrator-ai-cli material list --json --page 1 --size 100 | grep -i "飞驰人生"
narrator-ai-cli material list --json --page 1 --size 100 \
| python3 -c "import json, sys; items = json.load(sys.stdin).get('items', []); \
[print(json.dumps(i, ensure_ascii=False)) for i in items if '飞驰' in i.get('name','') or '飞驰' in i.get('title','')]"
Material → confirmed_movie_json field mapping (construct locally, no search-movie needed):
| Material field | confirmed_movie_json field | Notes |
|---|---|---|
name | local_title | Chinese title |
title | title | English title |
year | year | |
type | genre | e.g. 喜剧片 |
story_info | summary | |
character_name | stars | Parse JSON array string |
| (not in material) | director | Omit if unavailable |
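The mapping table above can be applied locally with a short helper. A sketch under these assumptions: field names are taken from the example response, and `character_name` may or may not be strict JSON (the example shows an unquoted bracketed list), so a fallback split is included.

```python
import json


def material_to_confirmed_movie_json(item: dict) -> dict:
    """Build confirmed_movie_json from a material item (mapping table above)."""
    raw_stars = item.get("character_name", "")
    try:
        stars = json.loads(raw_stars)  # ideal case: a real JSON array string
    except (json.JSONDecodeError, TypeError):
        # Observed format like "[A (a), B (b)]" — strip brackets, split on commas
        stars = [s.strip() for s in raw_stars.strip("[]").split(",") if s.strip()]
    return {
        "local_title": item.get("name"),     # Chinese title
        "title": item.get("title"),          # English title
        "year": item.get("year"),
        "genre": item.get("type"),           # e.g. 喜剧片
        "summary": item.get("story_info"),
        "stars": stars,
        # "director" is not in material data — omit rather than fabricate
    }
```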
# Option B: Upload your own
narrator-ai-cli file upload ./movie.mp4 --json # Returns file_id
narrator-ai-cli file upload ./subtitles.srt --json
narrator-ai-cli file list --json
narrator-ai-cli file transfer --link "<url>" --json # transfer by HTTP/Baidu/PikPak link
narrator-ai-cli file info <file_id> --json
narrator-ai-cli file download <file_id> --json
narrator-ai-cli file storage --json
narrator-ai-cli file delete <file_id> --json
Supported formats: .mp4, .mkv, .mov, .mp3, .m4a, .wav, .srt, .jpg, .jpeg, .png
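A client-side pre-check before calling `file upload` can be sketched as follows; the extension whitelist is copied from the list above, and the server remains authoritative.

```python
from pathlib import Path

# Extension whitelist copied from the supported-formats list above
SUPPORTED_EXTENSIONS = {".mp4", ".mkv", ".mov", ".mp3", ".m4a", ".wav",
                        ".srt", ".jpg", ".jpeg", ".png"}


def is_supported(path: str) -> bool:
    """True if the file extension is in the documented whitelist (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```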
⚠️ Agent behavior: Infer the mood/genre from context, then use `bgm list --search "<keyword>"` to pre-filter. Present 5–8 tracks (Agent decides which fields best represent each track — e.g., name, style description). If the user has no preference, recommend 3 tracks with a brief reason for each (e.g., "matches the film's fast-paced action tone") and wait for confirmation. Do NOT use a `bgm` ID in any task until the user confirms.
narrator-ai-cli bgm list --json # 146 tracks
narrator-ai-cli bgm list --search "单车" --json
# Returns: id (= bgm parameter in task creation)
⚠️ Agent behavior: Infer the target language from context; if ambiguous, ask the user before listing. Run `dubbing list --lang <language>` to filter, then present all matching voices (typically < 15 per language) — include name and tags. If the user has no preference, recommend 3 voices with reasoning (e.g., "neutral tone fits documentary narration style") and wait for confirmation. Do NOT use a dubbing `id` or `dubbing_type` in any task until the user confirms both.

⚠️ Language linkage: Once the dubbing voice is confirmed, the narration script language must match. If the selected voice is not Chinese (普通话), the agent MUST set the `language` parameter in the writing task (fast-writing or generate-writing) to the corresponding language — do NOT leave it at the default `"Chinese (中文)"`. Carry this language value forward from the dubbing selection step to the writing task creation step. If the user has already specified a `language` value, verify it matches the dubbing language; if they conflict, surface the mismatch and ask the user to resolve it before proceeding.
narrator-ai-cli dubbing list --json # 63 voices, 11 languages
narrator-ai-cli dubbing list --lang 普通话 --json
narrator-ai-cli dubbing list --tag 喜剧 --json
narrator-ai-cli dubbing languages --json
narrator-ai-cli dubbing tags --json
# Returns: id (= dubbing), type (= dubbing_type)
Languages: 普通话(39), English(4), 日语(3), 韩语(2), Spanish(3), Portuguese(2), German(2), French(2), Arabic(2), Thai(2), Indonesian(2).
⚠️ Agent behavior: Infer the content genre from context and run `task narration-styles --genre <genre>` to pre-filter. Present 3–5 templates (Agent decides which fields best represent each). Also share the preview link https://ceex7z9m67.feishu.cn/wiki/WLPnwBysairenFkZDbicZOfKnbc to help the user browse visually. If the user has no preference, recommend 3 templates with a brief style description and reasoning, and wait for confirmation. Do NOT use a `learning_model_id` in any task until the user confirms.
narrator-ai-cli task narration-styles --json
narrator-ai-cli task narration-styles --genre 爆笑喜剧 --json
Genres: 热血动作, 烧脑悬疑, 励志成长, 爆笑喜剧, 灾难求生, 悬疑惊悚, 惊悚恐怖, 东方奇谈, 家庭伦理, 情感人生, 奇幻科幻, 传奇人物
Use learning_model_id from template directly — no need for popular-learning step.
⚠️ Agent behavior: Confirm the movie or drama name with the user before proceeding (ask if not yet specified). Then follow this decision flow to determine source material and `target_mode`.
Decision flow:
1. Check pre-built materials first: `material list --json --page 1 --size 100`. Check `total` in the response — if `total > 100`, fetch subsequent pages until all items are retrieved. Search programmatically using `grep -i` or `python3 -c` piped from the JSON output — do NOT rely on the terminal display, which may be truncated. Repeat for each page until a match is found or all pages are exhausted.
2. Found in materials: construct `confirmed_movie_json` from material fields (see mapping in Prerequisites § Source Files). Present the match to the user and ask which mode:
   - Pure narration (`target_mode=1`): no `episodes_data` needed.
   - Original mix (`target_mode=2`): uses the original dialogue and scenes (via `episodes_data` / `srt_file_id`) to align narration with the source. More authentic, closer to the source. Requires `episodes_data` with `srt_oss_key = material.srt_file_id`.
3. Not in materials, known movie: run `task search-movie` (see command below) → `target_mode=1`. Use returned `confirmed_movie_json`. No `episodes_data`.
4. Not in materials, user's own SRT, known movie: run `task search-movie` for `confirmed_movie_json` → `target_mode=2`. Use uploaded SRT as `srt_oss_key` in `episodes_data`.
5. Obscure/new drama with user's own SRT: `target_mode=3`. `confirmed_movie_json` is optional. Use uploaded SRT in `episodes_data`.

The `search-movie` command (run only for flows 3 and 4 above; never fabricate its output):
narrator-ai-cli task search-movie "飞驰人生" --json
Returns up to 3 results. Each result contains:
{
"title": "string",
"local_title": "string",
"year": "string",
"director": "string",
"stars": ["string"],
"genre": "string",
"summary": "string"
}
⚠️ May take 60+ seconds (Gradio backend). Results cached 24h.
Using the target_mode, confirmed_movie_json, and episodes_data determined in Step 0, create the fast-writing task:
# Case A1: Pre-built material found, user chose pure narration (target_mode=1)
# No episodes_data. confirmed_movie_json mapped from material fields — see Prerequisites § Source Files.
narrator-ai-cli task create fast-writing --json -d @request.json
# request.json:
# {
# "learning_model_id": "...",
# "target_mode": "1",
# "playlet_name": "飞驰人生",
# "confirmed_movie_json": {<mapped from material — see field mapping table in Prerequisites>},
# "model": "flash"
# }
# Case A2: Pre-built material found, user chose original mix (target_mode=2)
# episodes_data uses material.srt_file_id. confirmed_movie_json from material fields.
narrator-ai-cli task create fast-writing --json -d @request.json
# request.json:
# {
# "learning_model_id": "...",
# "target_mode": "2",
# "playlet_name": "飞驰人生",
# "confirmed_movie_json": {<mapped from material — see field mapping table in Prerequisites>},
# "episodes_data": [{"srt_oss_key": "<material.srt_file_id>", "num": 1}],
# "model": "flash"
# }
# Case B: Not in pre-built materials, known movie (target_mode=1) — run search-movie in Step 0
narrator-ai-cli task create fast-writing --json -d @request.json
# request.json: {"learning_model_id": "...", "target_mode": "1", "playlet_name": "...",
# "confirmed_movie_json": {<from search-movie>}, "model": "flash"}
# Case C: User's own SRT, known movie (target_mode=2) — run search-movie in Step 0 for confirmed_movie_json
narrator-ai-cli task create fast-writing --json -d @request.json
# request.json: {"learning_model_id": "...", "target_mode": "2", "playlet_name": "<drama name>",
# "confirmed_movie_json": {<from search-movie>}, "episodes_data": [{"srt_oss_key": "<uploaded srt file_id>", "num": 1}], "model": "flash"}
# Case D: Obscure/new drama, user's own SRT (target_mode=3) — confirmed_movie_json optional
narrator-ai-cli task create fast-writing --json -d '{
"learning_model_id": "<from narration-styles>",
"target_mode": "3",
"playlet_name": "<drama name>",
"episodes_data": [{"srt_oss_key": "<uploaded srt file_id>", "num": 1}],
"model": "flash"
}'
Full parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
learning_model_id | str | Exactly one (mutually exclusive with learning_srt) | - | Style model ID from a pre-built template or popular-learning result. Do not provide both. |
learning_srt | str | Exactly one (mutually exclusive with learning_model_id) | - | Reference SRT file_id. Only use when no template or popular-learning model is available. Do not provide both. |
target_mode | str | Yes | - | "1"=Hot Drama, "2"=Original Mix, "3"=New Drama |
playlet_name | str | Yes | - | Movie/drama name |
playlet_num | str | No | "1" | Episode/part number. Use "1" for single-episode content; increment for multi-part series. |
confirmed_movie_json | obj | mode=1,2; optional mode=3 | - | From material data (mode=2 pre-built) or search-movie result (mode=1, mode=2 user SRT). Never fabricate. |
episodes_data | list | mode=2,3 | - | For fast-writing: [{srt_oss_key, num}]. For fast-clip-data: [{video_oss_key, srt_oss_key, negative_oss_key, num}] — the video fields are added at the clip-data step. |
model | str | No | "pro" | "pro" (higher quality, 15pts/char) or "flash" (faster, 5pts/char) |
language | str | No | "Chinese (中文)" | Output language for the narration script. Must match the selected dubbing voice language. If the dubbing voice is non-Chinese, this param must be set explicitly — never leave it at the default when a non-Chinese voice is selected. |
perspective | str | No | "third_person" | "first_person" or "third_person" |
target_character_name | str | 1st person | - | Required when perspective=first_person |
custom_script_result_path | str | No | - | Custom script result path |
webhook_url | str | No | - | Async callback URL |
webhook_token | str | No | - | Callback authentication token |
webhook_data | str | No | - | Passthrough data for callback |
Output: Creation response contains only data.task_id. Poll task query <task_id> --json every 5 seconds until status=2. The completed task response contains file_ids:
{
"tasks": [{
"task_id": "<task_id>",
"order_num": "<order_num>"
}],
"file_ids": ["<file_id>"]
}
Save: task_id from the creation response (for fast-clip-data task_id input). Save file_ids[0] from the completed task poll response (for fast-clip-data file_id input).
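The two saves above can be expressed as a small helper. A sketch only — the response shapes are taken from the examples above (creation response carries `data.task_id`; the completed poll response carries `file_ids`).

```python
def fast_writing_outputs(creation_resp: dict, poll_resp: dict) -> tuple[str, str]:
    """Return (task_id, file_id) — the two inputs fast-clip-data needs.

    creation_resp: fast-writing creation response, {"data": {"task_id": ...}}
    poll_resp: completed task query response containing "file_ids"
    """
    task_id = creation_resp["data"]["task_id"]
    file_id = poll_resp["file_ids"][0]
    return task_id, file_id
```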
Input: task_id and file_id from Fast Writing (step 1), plus bgm, dubbing, episodes_data.
narrator-ai-cli task create fast-clip-data --json -d '{
"task_id": "<task_id from step 1>",
"file_id": "<file_id from step 1>",
"bgm": "<bgm_id>",
"dubbing": "<voice_id>",
"dubbing_type": "<dubbing_type from selected voice>",
"episodes_data": [{"video_oss_key": "<video_file_id>", "srt_oss_key": "<srt_file_id>", "negative_oss_key": "<video_file_id>", "num": 1}]
}'
Output: Creation response:
{"code": 10000, "message": "", "data": {"task_id": ""}}
Save data.task_id. Poll task query <task_id> --json every 5 seconds until status=2. On success, read task_order_num from the task record — this is the order_num required for video-composing (step 3).
IMPORTANT: order_num comes from fast-clip-data (step 2) task record's task_order_num. This is the only required parameter.
narrator-ai-cli task create video-composing --json -d '{
"order_num": "<task_order_num>"
}'
Output: On creation returns data.task_id. Poll task query <task_id> --json every 5 seconds until status=2. Extract video_url from results:
{
"tasks": [{
"video_url": "https://oss.example.com/.../output.mp4"
}]
}
Note: type_name is video_composing (no BGM) or video_composing_2 (with BGM); both return video_url in the same structure.
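Pulling the final URL out of the poll response can be sketched as below; the shape is assumed from the example above and is the same for `video_composing` and `video_composing_2`.

```python
def extract_video_url(poll_resp: dict):
    """Return the first non-empty video_url in the tasks array, else None."""
    for task in poll_resp.get("tasks", []):
        url = task.get("video_url")
        if url:
            return url
    return None
```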
⚠️ Agent restriction: Do NOT auto-create magic-video tasks. Only create when the user explicitly requests a visual template. Present the template catalog, explain options, let the user choose. Multiple templates can be selected — each produces a separate output video.
Visual Templates is a value-added service applied after video composing:
Fetch real-time template details (params, descriptions, pricing):
curl -X GET "https://openapi.jieshuo.cn/v2/task/commentary/get_magic_template_info" \
-H "app-key: $NARRATOR_APP_KEY"
Templates are organized by distribution platform and aspect ratio:
油管 (YouTube)
| Aspect Ratio | Template Name | Configurable Params |
|---|---|---|
| 9:16 垂直 | 竖屏·合规剧集 | 主标题, 底部免责文案, 侧边警示语, 分集设置 |
| 9:16 垂直 | 竖屏·柔光剧集 | 分集设置 |
| 9:16 垂直 | 竖屏·模糊剧集 | 主标题, 分集设置 |
| 9:16 垂直 | 竖屏·简约剧集 | 分集设置 |
| 9:16 垂直 | 竖屏·黑金剧集 | 主标题, 副标题, 分集设置 |
| 16:9 水平 | 横屏·沉浸剧集 | 分集设置 |
| 16:9 水平 | 横屏·电影剧集 | 主标题, 副标题, 分集设置 |
| 16:9 水平 | 横屏·简约剧集 | 分集设置 |
抖音 (TikTok / Douyin)
| Aspect Ratio | Template Name | Configurable Params |
|---|---|---|
| 1:1 矩形 | 方屏·简约剧集 | 主标题, 水印文案, 分集设置 |
| 1:1 矩形 | 方屏·雅致剧集 | 主标题, 分集设置 |
| 9:16 垂直 | 竖屏·流光剧集 | 顶部标语, 侧边文案, 分集设置 |
油管短视频 (YouTube Shorts)
| Aspect Ratio | Template Name | Configurable Params |
|---|---|---|
| 9:16 垂直 | 竖屏·精准剧集 | 分集设置 |
| 9:16 垂直 | 竖屏·重磅剧集 | 副标题 ⚠️, 分集设置 |
⚠️ Agent behavior: When the user selects a template, proactively walk through each of its configurable params, explain what it controls, and ask the user for a value. Only proceed to task creation once every param is confirmed or explicitly left at default.
⚠️ Language awareness: All text params (`main_title`, `sub_title`, `bottom_disclaimer_text`, `vertical_text_content`, `watermark_text`, `slogan`) have Chinese default values hardcoded in the template and do NOT auto-adapt to the target language. When the narration target language is not Chinese, the agent MUST:

1. Never submit Chinese default values. Submitting Chinese defaults will result in Chinese text appearing in a non-Chinese video — this is always wrong.
2. Proactively provide localized values for every text param in the template. Do not ask the user whether they want localization — assume yes and act on it.
3. Translate the standard defaults to the target language and confirm with the user before submitting. Do not skip this — even if the user hasn't mentioned it. Required translations by language:
   - `bottom_disclaimer_text` default 本故事纯属虚构 请勿模仿 → e.g. English: `This story is purely fictional. Do not imitate.`
   - `vertical_text_content` default 影视效果 请勿模仿 合理安排生活 → e.g. English: `Cinematic effects only. Do not imitate. Manage your life wisely.`
   - `main_title`, `sub_title`, `watermark_text`, `slogan` — if left empty, AI may still generate Chinese; proactively ask for user input or suggest a translated value.
4. This rule applies even when the user does not explicitly mention language. The target language flows through the entire pipeline as a single chain: dubbing voice language → narration script `language` param → magic-video template text params. If the dubbing voice is non-Chinese, all three must be set to the matching language. Never treat these as independent decisions.
5. All user-facing questions in this section (the "Ask the user" prompts below) must be asked in the same language as the ongoing conversation. Do not default to Chinese if the conversation is in another language.

Scope note: This rule governs magic-video template text params only. The `language` param in fast-writing / generate-writing controls the narration script language and is handled at the writing step. Both are downstream consequences of the dubbing language selection and must be consistent.
All params are optional — omitting them lets AI auto-generate where supported. The table below explains what each param does and how to fill it appropriately.
segment_count — 分集设置 (int, present in all templates)
Controls how the video is split into episodes:
| Value | Behavior | When to use |
|---|---|---|
0 (default) | AI auto-determines episode count based on content length | Recommended for most cases; let AI decide |
-1 | No splitting — output as a single video | When the source is short or the user wants one file |
1, 2, 3… | Force exactly N episodes | When the user has a specific series structure in mind |
Ask the user: "要分集吗?留 0 让 AI 自动判断,还是指定集数,或者 -1 不分集?" (Ask in the conversation language — see Language Awareness rule 5.)
main_title — 主标题 (string, templates: 竖屏·合规剧集, 竖屏·模糊剧集, 竖屏·黑金剧集, 横屏·电影剧集, 方屏·简约剧集, 方屏·雅致剧集)
The primary title displayed prominently on screen.
Ask the user whether they want a custom title, or prefer AI to generate one. (Ask in the conversation language — see Language Awareness rule 5.)
sub_title — 副标题 (string, templates: 竖屏·黑金剧集, 横屏·电影剧集, 竖屏·重磅剧集)
Secondary text displayed near the main title.
⚠️ For 竖屏·重磅剧集, `sub_title` will completely override the main title display — the value you enter replaces whatever would appear as the main title. Only fill this if the user specifically wants to override the title.

Ask the user whether they want a custom subtitle. For 竖屏·重磅剧集, warn that filling this field will override the main title. (Ask in the conversation language — see Language Awareness rule 5.)
bottom_disclaimer_text — 底部免责文案 (string, template: 竖屏·合规剧集 only)
Disclaimer text pinned to the bottom of the screen — required for compliance on many platforms.
- Default: 本故事纯属虚构 请勿模仿 — covers standard platform compliance requirements.
- Non-Chinese narration: translate the default to the target language (e.g. English: `This story is purely fictional. Do not imitate.`) and confirm with the user before submitting. Do not submit the Chinese default for non-Chinese narration.
- Chinese narration: "底部免责文案保留默认「本故事纯属虚构 请勿模仿」就好,有特殊合规需求才需要改。" — Non-Chinese narration: Translate the default to the target language, show the translated value to the user, and ask for confirmation or edits before submitting.
vertical_text_content — 侧边警示语 / 侧边文案 (string, templates: 竖屏·合规剧集, 竖屏·流光剧集)
Vertical text displayed along the side edge of the screen.
- Default: 影视效果 请勿模仿 合理安排生活 — standard compliance phrasing.
- Non-Chinese narration: translate the default to the target language (e.g. English: `Cinematic effects only. Do not imitate. Manage your life wisely.`) and confirm with the user before submitting. Do not submit the Chinese default for non-Chinese narration.
- Chinese narration: "侧边文案保留默认合规文案即可,如需换成频道专属文案可以自定义。" — Non-Chinese narration: Translate the default to the target language, show the translated value to the user, and ask for confirmation or edits before submitting.
watermark_text — 水印文案 (string, template: 方屏·简约剧集 only)
Copyright/brand text that roams randomly across the frame as a floating watermark.
- Typical values: channel or brand handles (e.g. `@ChannelName`, `© Studio Name`).

Ask the user if they want a watermark. If yes, ask for the text. (Ask in the conversation language — see Language Awareness rule 5.)
slogan — 顶部标语 (string, template: 竖屏·流光剧集 only)
Custom text that fills the entire top title bar, overriding whatever the AI would generate.
Ask the user if they want a fixed top slogan. (Ask in the conversation language — see Language Awareness rule 5.)
Input is the task_id returned from video-composing (step 3).
⚠️ Agent behavior — mandatory pre-submission confirmation: Before running any `magic-video` create command, the agent MUST display the full request parameters to the user in a readable format (templates selected, all `template_params` values for each template), then explicitly ask for confirmation. Do NOT submit until the user confirms. This applies every time a `magic-video` task is created — including multiple calls within the same session. Ask in the conversation language (not necessarily Chinese).
# Without custom params (AI handles all defaults)
narrator-ai-cli task create magic-video --json -d '{
"task_id": "<task_id from step 3>",
"template_name": ["竖屏·黑金剧集", "横屏·电影剧集"]
}'
# With custom params — key is template name, value is a params dict
narrator-ai-cli task create magic-video --json -d '{
"task_id": "<task_id from step 3>",
"template_name": ["竖屏·合规剧集"],
"template_params": {
"竖屏·合规剧集": {
"segment_count": 0,
"bottom_disclaimer_text": "本故事纯属虚构 请勿模仿",
"vertical_text_content": "影视效果 请勿模仿 合理安排生活"
}
}
}'
Output: sub_tasks array — one entry per template, each with a rendered video URL
⚠️ Agent behavior: Confirm the movie or drama name with the user before proceeding. For material list usage, pagination, and programmatic search, see Prerequisites § Source Files.
Decision flow:
1. Check pre-built materials first: `material list --json` (all pages). Search programmatically — do NOT rely on terminal display.
2. Found in materials: use `video_file_id` as `video_oss_key`/`negative_oss_key` and `srt_file_id` as `srt_oss_key` in `episodes_data` for Step 2 (generate-writing). No need to upload files.
3. Not in materials: upload the user's own video and SRT via `file upload` (see Prerequisites § Source Files). Use the returned `file_id` values as `video_oss_key`/`negative_oss_key` and `srt_oss_key` in `episodes_data`.

narrator-ai-cli task create popular-learning --json -d '{
"video_srt_path": "<srt_file_id from Step 0>",
"narrator_type": "movie",
"model_version": "advanced"
}'
narrator_type options: movie (电影), short_drama (短剧), first_person_movie (第一人称电影), multilingual (多语种电影), first_person_multilingual (第一人称多语种)
model_version: advanced (高级版) or standard (标准版)
Output: On creation returns data.task_id. Poll task query <task_id> --json every 5 seconds until status=2. Parse task_result JSON string → agent_unique_code is the learning_model_id:
{
"tasks": [{
"task_result": "{\"agent_unique_code\": \"narrator-20251121160424-wjtOXO\"}"
}]
}
→ learning_model_id = "narrator-20251121160424-wjtOXO"
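Since `task_result` is itself a JSON string, decoding takes two steps. A sketch based on the example response above:

```python
import json


def extract_learning_model_id(poll_resp: dict) -> str:
    """Parse tasks[0].task_result (a JSON string) and return agent_unique_code."""
    inner = json.loads(poll_resp["tasks"][0]["task_result"])
    return inner["agent_unique_code"]
```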
Alternatively, use a pre-built template id from task narration-styles --json as learning_model_id directly — no popular-learning step needed.
Input: Use video_file_id and srt_file_id determined in Step 0 to construct episodes_data:
episodes_data field | Source |
|---|---|
video_oss_key | video_file_id from material (Step 0) or uploaded video file_id |
negative_oss_key | same as video_oss_key |
srt_oss_key | srt_file_id from material (Step 0) or uploaded SRT file_id |
num | episode number, starting from 1 |
narrator-ai-cli task create generate-writing --json -d '{
"learning_model_id": "<from step 1 or pre-built template>",
"playlet_name": "Movie Name",
"playlet_num": "1",
"episodes_data": [{"video_oss_key": "<video_file_id>", "srt_oss_key": "<srt_file_id>", "negative_oss_key": "<video_file_id>", "num": 1}],
"refine_srt_gaps": false
}'
Optional: refine_srt_gaps (bool) — enables AI scene analysis. Only set to true when user explicitly requests it.
⚠️ Language linkage: If the selected dubbing voice is non-Chinese, add `"language": "<target language>"` to this request to match. Do not omit this param for non-Chinese dubbing — the default is Chinese.
Output: On creation returns data.task_id. Poll task query <task_id> --json every 5 seconds until status=2. Extract task_result (narration script file path) and order_info from results:
{
"tasks": [{
"task_result": "video-clips-data/20251126/narrator/t_66449_47KIRY/narration.txt"
}],
"order_info": {
"order_num": "script_69269bfc_GfVEgA"
}
}
Save: task_id from the initial creation response — required as input for clip-data step.
Input: task_id from generate-writing (step 2), plus bgm and dubbing.
narrator-ai-cli task create clip-data --json -d '{
"task_id": "<task_id from step 2 (generate-writing) creation response>",
"bgm": "<bgm_id>",
"dubbing": "<voice_id>",
"dubbing_type": "<dubbing_type from selected voice>"
}'
Output: Creation response:
{"code": 10000, "message": "", "data": {"task_id": ""}}
Save data.task_id. Poll task query <task_id> --json every 5 seconds until status=2. On success, read task_order_num from the task record — this is the order_num required for video-composing (step 4).
Same commands as Fast Path Steps 3–4. The only difference: order_num for video-composing comes from the clip-data task's task_order_num (Step 3 of this path), not from fast-clip-data. In both paths, video-composing always uses the task_order_num from the immediately preceding clip step.
narrator-ai-cli task create voice-clone --json -d '{"audio_file_id": "<file_id>"}'
Optional: clone_model (default: pro). Output: task_id, voice_id.
narrator-ai-cli task create tts --json -d '{"voice_id": "<voice_id>", "audio_text": "Text to speak"}'
Optional: clone_model (default: pro). Output: task_id with audio result.
⚠️ Agent behavior — standard polling pattern: Always use the `while` loop below when monitoring a task. Never use a `for` loop with a fixed iteration count (it may exhaust before the task finishes). The loop below runs until status `2` (success) or `3` (failed) and cannot be silently interrupted mid-run.
# Standard polling loop — use this every time a task needs to be monitored
TASK_ID="<task_id>"
while true; do
result=$(narrator-ai-cli task query "$TASK_ID" --json 2>&1)
status=$(echo "$result" | python3 -c "
import json, sys