Just drag your footage and let the jewelry-video-maker handle the rest — trimming clips, syncing transitions, and framing each piece for maximum visual impact. Built specifically for rings, necklaces, bracelets, and earrings, it understands how light and motion work together to showcase fine details. Upload raw phone or camera footage and get polished, scroll-stopping videos ready for product listings, social reels, or lookbooks. No editing experience needed — just your jewelry and a few seconds of footage.
Welcome! Let's turn your jewelry footage into a polished video that makes every detail shine — upload your clips or describe your piece and I'll get started right away.
Try saying:
When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").
Authentication: Check whether NEMO_TOKEN is set in the environment. If it is, skip to step 2.
1. Obtain a free token: generate a random UUID as the client identifier, then call https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN (100 free credits, valid 7 days).
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
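The setup flow above can be sketched as plain request descriptions that any HTTP client could send. This is a minimal sketch, not the skill's actual implementation: the helper names (token_request, session_request, setup_plan) are hypothetical, and the POST method for the anonymous-token call is an assumption, since the text names only the URL and header. Attribution headers are left out to keep the sketch short.

```python
import json
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def token_request(client_id: str) -> dict:
    """Describe the anonymous-token call for a freshly generated client id."""
    return {
        "method": "POST",  # assumption: the text gives only the URL and header
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def session_request(token: str, language: str) -> dict:
    """Describe the create-session call; session_id comes back in the response."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }


def setup_plan(env: dict, language: str = "en") -> list:
    """Return the requests still needed, skipping step 1 when NEMO_TOKEN is set."""
    token = env.get("NEMO_TOKEN")
    if token:
        return [session_request(token, language)]
    client_id = str(uuid.uuid4())
    # The real token is data.token from the first response; the placeholder
    # marks where it would be substituted before sending the second request.
    return [token_request(client_id), session_request("<data.token>", language)]
```

Calling setup_plan(dict(os.environ)) would yield one request when a token already exists and two otherwise.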
Most jewelry footage sits unused because editing it properly takes time, skill, and the right eye for detail. The jewelry-video-maker solves that by doing the heavy lifting — analyzing your clips, identifying the best angles, and assembling a final video that actually shows off what makes each piece special. Whether you're shooting a diamond solitaire or a layered gold chain, the output is clean, professional, and built for wherever you're posting.
This skill is designed around the specific challenges of jewelry content: catching sparkle without overexposing, holding focus on small intricate details, and pacing cuts so viewers stay engaged long enough to fall in love with the piece. You don't need a studio setup or a video editor on your team — just the footage you already have.
From a 10-second Instagram reel to a longer product showcase video, you can define the format, mood, and length. Add text overlays for material callouts, control the pacing for a luxury feel, or keep things punchy for social. The jewelry-video-maker adapts to what you're making and who you're making it for.
Each request — whether you're animating a solitaire ring, looping a gemstone close-up, or generating a full product reel — is parsed by intent and routed to the matching video generation pipeline.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
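The routing table can be read as a first-match keyword lookup. The sketch below assumes case-insensitive substring matching, which the table does not specify, and the names ROUTES and route are hypothetical.

```python
# Keyword groups in table order; the first group that matches wins.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A sent file goes to upload regardless of the accompanying text.
        return "upload", True
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM, ...) goes through SSE.
    return "sse", False
```

Substring matching is deliberately loose; a stricter matcher may be needed if ordinary editing requests happen to contain words like "status".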
Jewelry Video Maker processes every render job through a cloud-based GPU pipeline optimized for macro detail, reflective surfaces, and metallic sheen — so your diamond facets and gold finishes render true-to-life. All API calls are stateless and authenticated per session, with asset storage handled temporarily in a secure render bucket.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: jewelry-video-maker
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
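One way to assemble the required headers is sketched below. The function names are hypothetical, and the platform check is a simplification: it looks for the .clawhub or .cursor/skills components anywhere in the path rather than strictly under the home directory.

```python
from pathlib import PurePosixPath


def detect_platform(install_path: str) -> str:
    """Map the skill's install path to a platform label."""
    parts = PurePosixPath(install_path).parts
    if ".clawhub" in parts:
        return "clawhub"
    # ".cursor" must be immediately followed by "skills".
    for current, following in zip(parts, parts[1:]):
        if current == ".cursor" and following == "skills":
            return "cursor"
    return "unknown"


def attribution_headers(token: str, version: str, install_path: str) -> dict:
    """Build the headers that every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "jewelry-video-maker",
        "X-Skill-Version": version,  # read from the YAML frontmatter
        "X-Skill-Platform": detect_platform(install_path),
    }
```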
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
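As an illustration, the message request could be assembled like this. The helper name sse_request is hypothetical, authorization and attribution headers are omitted for brevity, and the timeout is expressed in seconds.

```python
import json

API_BASE = "https://mega-api-prod.nemovideo.ai"
SSE_TIMEOUT_SECONDS = 15 * 60  # the documented 15-minute maximum


def sse_request(session_id: str, text: str) -> dict:
    """Describe a POST /run_sse call carrying one text message."""
    body = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }
    return {
        "method": "POST",
        "url": f"{API_BASE}/run_sse",
        "headers": {"Accept": "text/event-stream"},
        "timeout": SSE_TIMEOUT_SECONDS,
        "body": json.dumps(body),
    }
```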
Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple — returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
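The export-and-poll cycle might look like the sketch below. Several details are assumptions: <ts> is taken to be a Unix timestamp in seconds, status and output.url are read from the top level of the poll response, and max_polls is an invented safety cap. The HTTP call is injected as fetch_status so the loop stays independent of any particular client.

```python
import time


def export_body(session_id: str, draft: dict, now=None) -> dict:
    """Build the body for POST /api/render/proxy/lambda."""
    ts = int(now if now is not None else time.time())  # assumed: Unix seconds
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def wait_for_export(render_id, fetch_status, sleep=time.sleep,
                    interval=30, max_polls=40):
    """Poll every `interval` seconds until the render reports completed.

    fetch_status(render_id) must return the parsed JSON of
    GET /api/render/proxy/lambda/<id>.
    """
    for _ in range(max_polls):
        payload = fetch_status(render_id)
        if payload.get("status") == "completed":
            return payload["output"]["url"]
        sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete")
```

Passing sleep as a parameter keeps the loop testable without real delays.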
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
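A client-side check against this list can catch unsupported files before upload. This is a small sketch with a hypothetical helper name; it only inspects the file extension, not the file contents.

```python
from pathlib import PurePath

SUPPORTED_FORMATS = {
    "mp4", "mov", "avi", "webm", "mkv",   # video
    "jpg", "png", "gif", "webp",          # image
    "mp3", "wav", "m4a", "aac",           # audio
}


def is_supported(filename: str) -> bool:
    """Return True when the file extension is in the supported list."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_FORMATS
```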
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
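The event handling above, including the silent-edit fallback, could be sketched as follows. The event shape is an assumption: each data: line is taken to hold JSON with a content.parts list mirroring the request's parts format, which the text does not confirm. The function names are hypothetical.

```python
import json


def collect_text(sse_lines) -> list:
    """Extract user-facing text from raw SSE lines, skipping everything else."""
    texts = []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # comments and heartbeat lines
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty data: keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        content = event.get("content") or {}
        for part in content.get("parts", []):
            # Tool call/result parts carry no "text" key and are not forwarded.
            if isinstance(part, dict) and "text" in part:
                texts.append(part["text"])
    return texts


def needs_state_poll(texts) -> bool:
    """True when the stream closed without text, so session state must be checked."""
    return not any(text.strip() for text in texts)
```

When needs_state_poll returns True, the next step is the session state request, followed by a summary of what changed.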
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
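A track summary like the one above can be derived from the draft using the field mapping. The nesting here is an assumption: draft["t"] is taken to be a list of tracks, each holding tt and a list sg of segments with d in milliseconds. Segment names and start offsets are not covered by the mapping, so this sketch reports only counts and total durations.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft: dict) -> str:
    """Render a short track summary from the abbreviated draft fields."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        label = TRACK_TYPES.get(track.get("tt"), "Other")
        lines.append(
            f"{index}. {label}: {len(segments)} segment(s), {total_ms / 1000:.1f}s"
        )
    return "\n".join(lines)
```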
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | Create a new session (§3.0) |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Wait 30s, then retry once |
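The error table maps naturally onto a lookup. The action labels below are hypothetical names for the behaviors in the table, not identifiers from the API, and unknown codes fall back to a generic label.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauth_anonymous_token",
    1002: "create_new_session",
    2001: "no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",  # subscription tier issue, not credits
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Return the handling label for a response code."""
    return ERROR_ACTIONS.get(code, "unknown_error")


def is_retryable(code: int) -> bool:
    """Codes whose table action ends in an automatic retry."""
    return code in (400, 429)
```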
Footage shot in natural light near a window often outperforms studio lighting for gold and rose gold pieces. The jewelry-video-maker can handle both, but warm diffused light leaves more room to enhance warmth and glow in post.
For gemstones and diamonds, slow rotation clips (even just 5–8 seconds each) give the skill enough frames to identify the best sparkle moments and cut around them precisely. If you only have still photos, describe the piece in detail — material, stone type, finish — and the output will be tailored accordingly.
When targeting social platforms, specify your destination upfront: a TikTok reel, an Etsy listing video, and a Pinterest pin all have different ideal lengths and aspect ratios. Mentioning your target platform at the start saves a revision round and gets you a properly cropped, correctly paced video on the first output.
The jewelry-video-maker performs best with source clips that are at least 1080p — lower resolution footage can still be processed, but fine detail like engraving texture or stone facets may soften in the final output. If you're shooting on a phone, lock your exposure manually before recording to avoid the auto-brightness flickering that's common when a shiny surface catches light.
Processing time scales with clip count and total duration. A 15-second reel from 3–5 clips is typically fast; a full collection showcase pulling from 20+ clips with custom text overlays and multiple format exports will take noticeably longer. If you're working against a deadline, mention it upfront and the skill will prioritize the most critical output first.
For batch work — say, building videos for an entire new collection at once — grouping clips by piece before uploading keeps the output organized and prevents shots of different items from bleeding into each other's edits.