Getting Started

Share your video files and I'll get started on speech-to-text transcription. Or just tell me what you're thinking.

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

1. Generate a UUID to use as the client identifier.
2. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the UUID in the `X-Client-Id` header.
3. The response includes a token with 100 free credits, valid for 7 days. Use it as NEMO_TOKEN.

Then create a session by POSTing to `/api/tasks/me/with-session/nemo_agent` with Bearer authorization and body `{"task_name":"project","language":"<lang>"}`. The `session_id` in the response is needed for all following requests.
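The connection flow above can be sketched in Python with only the standard library. This is an illustration under stated assumptions, not the service's reference client: the session path and body are taken from the endpoint table in this document, while the top-level response field names `token` and `session_id`, the default language `"en"`, and every helper name are hypothetical.

```python
import json
import os
import uuid
import urllib.request

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id=None):
    """Build (but do not send) the anonymous-token request."""
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        BASE_URL + "/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token, language="en"):
    """Build (but do not send) the session-creation request."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        BASE_URL + "/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send_json(request, timeout=30):
    """Send a request and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def connect(send=send_json, env=os.environ):
    """Return (token, session_id), reusing NEMO_TOKEN when it is set."""
    token = env.get("NEMO_TOKEN")
    if not token:
        # Assumption: the token is a top-level "token" field in the reply.
        token = send(build_token_request())["token"]
    # Assumption: "session_id" is a top-level field in the reply.
    session = send(build_session_request(token))
    return token, session["session_id"]
```

Passing `send` and `env` as parameters keeps the flow testable without network access; a real caller would simply run `token, session_id = connect()`.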

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
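One way to read the routing table is as a keyword lookup that falls through to SSE. The sketch below assumes case-insensitive substring matching, checked in table order; the table itself does not specify how matching works, and the function and action names are hypothetical.

```python
# Keyword groups in the same order as the routing table.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_attachment=False):
    """Return (action, skip_sse) for a user message."""
    if has_attachment:
        # "user sends file" row: go straight to upload.
        return ("upload", True)
    text = message.casefold()
    for keywords, action in ROUTES:
        # Assumption: a keyword anywhere in the message triggers the route.
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

Substring matching is deliberately loose, so a request such as "balance the audio levels" would be misrouted to credits; a stricter matcher may be preferable in practice.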

| Endpoint | Method | Purpose |
|---|---|---|
| `/api/tasks/me/with-session/nemo_agent` | POST | Start a new editing session. Body: `{"task_name":"project","language":"<lang>"}`. Returns `session_id`. |
| `/run_sse` | POST | Send a user message. Body includes `app_name`, `session_id`, `new_message`. Stream response with `Accept: text/event-stream`. Timeout: 15 min. |
| `/api/upload-video/nemo_agent/me/<sid>` | POST | Upload a file (multipart) or URL. |
| `/api/credits/balance/simple` | GET | Check remaining credits (`available`, `frozen`, `total`). |
| `/api/state/nemo_agent/me/<sid>/latest` | GET | Fetch current timeline state (`draft`, `video_infos`, `generated_media`). |
| `/api/render/proxy/lambda` | POST | Start export. Body: `{"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}`. Poll status every 30s. |
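As an illustration of the export row, the sketch below builds the request body and polls at the 30-second interval the table gives. Several details are assumptions: `<ts>` is taken to be a millisecond Unix timestamp, the status check is an injected callable because the table does not name a status endpoint, and the `max_wait` default and all function names are hypothetical.

```python
import time


def build_export_body(session_id, draft, now=None):
    """Build the JSON body for POST /api/render/proxy/lambda."""
    seconds = now if now is not None else time.time()
    ts = int(seconds * 1000)  # Assumption: <ts> is in milliseconds.
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_until_done(check, interval=30.0, max_wait=1800.0, sleep=time.sleep):
    """Call check() every `interval` seconds until it returns non-None.

    `check` is a caller-supplied function that returns the finished
    result, or None while the render is still in progress.
    """
    waited = 0.0
    while True:
        result = check()
        if result is not None:
            return result
        if waited + interval > max_wait:
            raise TimeoutError("export did not finish within max_wait")
        sleep(interval)
        waited += interval
```

The `draft` argument would typically be the `draft` value fetched from the state endpoint listed in the same table.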

| Header | Value |
|---|---|
| `X-Skill-Source` | `free-video-to-text-converter` |
| `X-Skill-Version` | frontmatter `version` |
| `X-Skill-Platform` | auto-detect: `clawhub` / `cursor` / `unknown` from install path |
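The attribution headers could be assembled as in the sketch below. How the platform is detected from the install path is not specified here, so the case-insensitive substring check, the function names, and the example paths are assumptions for illustration only.

```python
def detect_platform(install_path):
    """Guess the platform from the skill's install path."""
    normalized = install_path.replace("\\", "/").lower()
    for name in ("clawhub", "cursor"):
        # Assumption: the platform name appears somewhere in the path.
        if name in normalized:
            return name
    return "unknown"


def attribution_headers(version, install_path):
    """Return the three X-Skill-* headers sent with each request."""
    return {
        "X-Skill-Source": "free-video-to-text-converter",
        "X-Skill-Version": version,  # the frontmatter `version` value
        "X-Skill-Platform": detect_platform(install_path),
    }
```

For example, `attribution_headers("1.0.0", "/home/user/.cursor/skills/demo")` would report `cursor` as the platform under this matching rule.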

Free Video To Text Converter