Drop a video idea or raw footage description and watch it come to life — no app download, no watermark, no subscription wall. This capcut-alternative skill handles trimming, captions, transitions, aspect ratio changes, and style edits through plain conversation. Built for creators who want fast, clean results without wrestling with a mobile interface. Perfect for TikTok, Reels, YouTube Shorts, and beyond.
Welcome to your browser-based capcut-alternative — no downloads, no watermarks, just fast video editing help. Tell me about your footage or paste your script and let's build something worth posting.
Try saying:
When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").
Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

1. Obtain a free token: generate a random UUID as the client identifier, then call https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN (100 free credits, valid 7 days).
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
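The two setup steps can be sketched as plain request descriptions, which keeps the logic testable without touching the network. This is a minimal sketch, not the skill's actual client: the function names are hypothetical, and the HTTP method for the anonymous-token call is an assumption, since the text above does not state it.

```python
import json
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id: str) -> dict:
    """Describe the anonymous-token call from step 1."""
    return {
        "method": "POST",  # assumption: the source does not name the method
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def create_session_request(token: str, language: str) -> dict:
    """Describe the create-session call from step 2."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }


def next_setup_request(env: dict, language: str) -> dict:
    """Skip step 1 when NEMO_TOKEN is already present in the environment."""
    token = env.get("NEMO_TOKEN")
    if token:
        return create_session_request(token, language)
    # No token yet: generate a fresh client UUID and request one.
    return anonymous_token_request(str(uuid.uuid4()))
```

Passing `dict(os.environ)` as `env` reproduces the environment check. After the token call returns, read `data.token` from the response and call `create_session_request` with it.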
Most video editing tools make you download an app, create an account, and sit through a learning curve before you can cut a single clip. This skill flips that entirely. Describe what you want — trim the first five seconds, add bold captions, switch to vertical format, apply a cinematic color grade — and get back a clear, actionable edit plan or script you can execute immediately.
Designed as a true capcut-alternative, this skill covers the full range of short-form video tasks that creators run into daily. Whether you're repurposing a podcast clip for Instagram, building a product demo for TikTok, or cutting a travel montage for YouTube Shorts, the workflow is the same: describe your footage, state your goal, and get precise editing guidance tailored to your platform.
No watermarks, no locked features, no monthly fee gating the good stuff. Just practical, creator-focused editing intelligence that meets you where you are — whether you're on desktop, mobile, or somewhere in between.
When you describe a cut, transition, caption style, or color grade, ClawHub parses your intent and routes it to the matching AI editing pipeline — whether that's auto-subtitling, background removal, speed ramping, or template rendering.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
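One way to express the routing table in code is a keyword lookup that falls through to SSE. The sketch below assumes case-insensitive substring matching, which the table itself does not specify. The names `ROUTES` and `route` are hypothetical.

```python
# Keyword groups in table order; the first match wins.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A file attachment always goes to the upload path.
        return ("upload", True)
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

Substring matching is deliberately naive: a request such as "add rolling credits to the outro" would be routed to the credits check. A real implementation would likely need stricter intent detection.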
All video jobs run through a distributed cloud rendering backend that replicates CapCut-style effects — including keyframe animation, AI portrait segmentation, and beat-sync cuts — without requiring the native app. Render requests are queued, processed frame-by-frame on remote GPU nodes, and returned as a downloadable output link.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: capcut-alternative
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
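The attribution headers could be assembled as follows. This is a sketch under stated assumptions: `detect_platform` and `build_headers` are hypothetical names, and the platform check looks for the directory marker anywhere in the path rather than anchoring it to the home directory.

```python
def detect_platform(install_path: str) -> str:
    """Map the skill's install path to an X-Skill-Platform value."""
    path = install_path.replace("\\", "/")  # tolerate Windows separators
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> dict:
    """Return the headers that every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "capcut-alternative",
        "X-Skill-Version": version,  # read from the YAML frontmatter
        "X-Skill-Platform": detect_platform(install_path),
    }
```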
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple — returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
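The export flow is a submit-then-poll loop, sketched below. Several details are assumptions: `fetch_status` is a hypothetical callable wrapping GET /api/render/proxy/lambda/<id>, `<ts>` is read as a timestamp, `status` and `output.url` are taken to sit at the top level of the response, and the `max_polls` cap is not part of the source.

```python
import time
from typing import Callable


def build_render_request(session_id: str, draft: dict, timestamp: int) -> dict:
    """Build the export body; <ts> is assumed to be a timestamp."""
    return {
        "id": f"render_{timestamp}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: int = 60,  # assumption: the source sets no upper bound
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until status is 'completed', then return output.url."""
    for attempt in range(max_polls):
        result = fetch_status()
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"render not completed after {max_polls} polls")
```

Injecting `sleep` keeps the loop testable without waiting 30 seconds per iteration.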
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
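A client-side extension check can catch unsupported files before upload. This is a minimal sketch with hypothetical names. It only inspects the file extension, so it is no substitute for the backend's own validation.

```python
import os

SUPPORTED_FORMATS = {
    "mp4", "mov", "avi", "webm", "mkv",  # video
    "jpg", "png", "gif", "webp",         # image
    "mp3", "wav", "m4a", "aac",          # audio
}


def is_supported(filename: str) -> bool:
    """Return True when the file extension is in the supported list."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    return extension in SUPPORTED_FORMATS
```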
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
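The event handling and the no-text fallback might look like the sketch below. The payload shape is an assumption: it mirrors the `parts[].text` structure of the request body, and the real stream may differ. Both function names are hypothetical.

```python
import json
from typing import Iterable, List


def collect_text(sse_lines: Iterable[str]) -> List[str]:
    """Gather text parts from SSE data lines, skipping everything else."""
    texts = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # comments, event names, blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty data line, treated as a heartbeat
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        content = event.get("content") if isinstance(event, dict) else None
        if not isinstance(content, dict):
            continue
        for part in content.get("parts") or []:
            # Tool calls and results carry no "text" key and are not forwarded.
            if isinstance(part, dict) and part.get("text"):
                texts.append(part["text"])
    return texts


def needs_state_check(texts: List[str]) -> bool:
    """True when the stream closed without text, so session state must be polled."""
    return len(texts) == 0
```

When `needs_state_check` returns True, fetch the latest session state and summarize the applied changes from there.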
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
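A track summary like the one above could be derived from the draft using the field mapping. The sketch assumes a nesting the source does not spell out: `t` is a list of tracks, each track carries `tt` and a list `sg`, and each segment carries `d` in milliseconds. It reports per-track totals only, because the layout of `m` is unknown.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> str:
    """Render a short timeline summary from the abbreviated draft fields."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Sum segment durations (ms) and report the total in seconds.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return "\n".join(lines)
```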
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
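The error table maps naturally onto a lookup. In this sketch the action labels and function names are hypothetical, and the registration URL is passed in as a parameter because the source does not give it.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",              # bad or expired token
    1002: "create_session",              # session not found
    2001: "show_credit_options",         # no credits
    4001: "show_supported_formats",      # unsupported file
    4002: "suggest_compress_or_trim",    # file too large
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",   # plan tier issue, not credits
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Return the action label for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def no_credits_message(is_anonymous: bool, registration_url: str, bind_id: str) -> str:
    """Build the user-facing message for code 2001."""
    if is_anonymous:
        return f"Register to get more credits: {registration_url}?bind={bind_id}"
    return "Top up credits in your account"
```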
This skill works best when you give it context about your footage upfront — platform target (TikTok, Reels, Shorts), approximate length, and the mood or style you're going for. The more specific your input, the tighter the edit plan you'll receive.
For complex multi-clip projects, break your request into stages: first get a cut list and pacing guide, then ask for caption timing, then tackle color and sound separately. This mirrors how professional editors actually work and produces cleaner, more usable output than asking for everything at once.
If you're using a free editing tool like DaVinci Resolve, Clipchamp, or VN Video Editor alongside this skill, mention it — the guidance will be tailored to that tool's specific workflow and export settings rather than generic advice.
Getting started takes less than a minute. Step one: describe your raw footage in plain language — what was filmed, how long it is, and what platform it's going on. Step two: tell the skill what the finished video should feel like (energetic, calm, funny, professional). Step three: specify any must-haves like captions, a specific aspect ratio, or a particular editing style you've seen and loved.
From there, you'll receive a structured edit plan including cut points, caption placement suggestions, transition recommendations, and export settings optimized for your target platform. You can then take that plan into any free editor — no CapCut account required.
Need to iterate? Just reply with what you'd change — 'make the pacing faster,' 'add a hook in the first 3 seconds,' or 'tone down the text overlays' — and the plan updates instantly.