Use when the user provides a topic and wants an automated video podcast created, OR when the user wants to learn or analyze video design patterns from reference videos. Handles research, script writing, TTS audio synthesis, Remotion video creation, and final MP4 output with background music. Also supports design learning from reference videos (the learn command), style profile management, and a design reference library. Supports the Bilibili, YouTube, Xiaohongshu, Douyin, and WeChat Channels platforms with independent language configuration (zh-CN, en-US).
REQUIRED: Load Remotion Best Practices First
This skill depends on remotion-best-practices. You MUST invoke it before proceeding: invoke the skill/tool named remotion-best-practices.
Open your coding agent and say: "Make a video podcast about $ARGUMENTS"
Or invoke directly: /video-podcast-maker AI Agent tutorial
Extract visual design patterns from reference videos or images, store them in a searchable library, and apply them to new video compositions.
# Learn from images (use your agent's image analysis capability to analyze design patterns)
python3 learn_design.py ./screenshot1.png ./screenshot2.png
# Learn from a local video (ffmpeg extracts frames automatically)
python3 learn_design.py ./reference.mp4
# Learn from a URL (Playwright captures screenshots — experimental)
python3 learn_design.py https://www.bilibili.com/video/BV1xx411c7mD
# Save with a named profile and tags
python3 learn_design.py ./reference.mp4 --profile "tech-minimal" --tags "tech,minimal,dark"
references list # List all stored references (auto-cleans orphaned entries)
references show <id> # Show full design report for a reference
references delete <id> # Delete a reference and its files
profiles list # List all saved style profiles
profiles show <name> # Show profile props_override
profiles delete <name> # Delete a style profile
profiles create <name> # Create a new style profile interactively
When the user provides a reference video or image alongside a video creation request, extract design patterns before Step 1 and apply them as session overrides. See references/workflow-steps.md → Pre-workflow section for the full extraction flow.
Before choosing visual design in Step 9, check for matching style profiles or reference library entries. Apply the best match as a starting point for Remotion composition props. See references/workflow-steps.md → Step 9 Style Profile Integration for the priority chain.
Agent behavior: Check for updates at most once per day (throttled by timestamp file).
Before any shell command that reads files from this skill, resolve SKILL_DIR to the directory containing SKILL.md.
If your agent exposes a built-in skill directory variable such as ${CLAUDE_SKILL_DIR}, you may map it to SKILL_DIR.
SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}"
STAMP="${SKILL_DIR}/.last_update_check"
NOW=$(date +%s)
LAST=$(cat "$STAMP" 2>/dev/null || echo 0)
if [ $((NOW - LAST)) -gt 86400 ]; then
timeout 5 git -C "${SKILL_DIR}" fetch --quiet 2>/dev/null || true
LOCAL=$(git -C "${SKILL_DIR}" rev-parse HEAD 2>/dev/null)
REMOTE=$(git -C "${SKILL_DIR}" rev-parse origin/main 2>/dev/null)
echo "$NOW" > "$STAMP"
if [ -n "$LOCAL" ] && [ -n "$REMOTE" ] && [ "$LOCAL" != "$REMOTE" ]; then
echo "UPDATE_AVAILABLE"
else
echo "UP_TO_DATE"
fi
else
echo "SKIPPED_RECENT_CHECK"
fi
If UPDATE_AVAILABLE, ask the user whether to update. Yes → git -C "${SKILL_DIR}" pull. No → continue.
Then verify dependencies:
( missing=""
  node -v >/dev/null 2>&1 || missing="$missing node"
  python3 --version >/dev/null 2>&1 || missing="$missing python3"
  ffmpeg -version >/dev/null 2>&1 || missing="$missing ffmpeg"
  [ -n "$AZURE_SPEECH_KEY" ] || missing="$missing AZURE_SPEECH_KEY"
  if [ -n "$missing" ]; then echo "MISSING:$missing"; else echo "ALL_OK"; fi )
If MISSING is reported above, see README.md for full setup instructions (install commands, API key setup, Remotion project init).
Automated pipeline for professional Bilibili horizontal knowledge videos from a topic.
Target: Bilibili horizontal video (16:9)
- Resolution: 3840×2160 (4K) or 1920×1080 (1080p)
- Style: Clean white (default)
Tech stack: Coding agent + TTS backend + Remotion + FFmpeg
| Parameter | Horizontal (16:9) | Vertical (9:16) |
|---|---|---|
| Resolution | 3840×2160 (4K) | 2160×3840 (4K) |
| Frame rate | 30 fps | 30 fps |
| Encoding | H.264, 16Mbps | H.264, 16Mbps |
| Audio | AAC, 192kbps | AAC, 192kbps |
| Duration | 1-15 min | 60-90s (highlight) |
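For reference, the targets in this table map onto ffmpeg flags as follows. This is an illustrative sketch only, not the pipeline's actual command (Remotion applies the video bitrate via --video-bitrate during render):

```shell
# Illustrative ffmpeg flags matching the spec table above (not part of the pipeline).
ENCODE_FLAGS="-c:v libx264 -b:v 16M -r 30 -c:a aac -b:a 192k"
echo "$ENCODE_FLAGS"
```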
Agent behavior: Detect user intent at workflow start:
Full pipeline with sensible defaults. Mandatory stop at Step 9:
| Step | Decision | Auto Default |
|---|---|---|
| 3 | Title position | top-center |
| 5 | Media assets | Skip (text-only animations) |
| 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) |
| 9 | Outro animation | Pre-made MP4 (white/black by theme) |
| 9 | Preview method | Remotion Studio (mandatory) |
| 12 | Subtitles | Skip |
| 14 | Cleanup | Auto-clean temp files |
Users can override any default in their initial request:
Prompts at each decision point. Activated by:
Planned feature (not yet implemented). Currently, workflow progress is tracked via the agent's conversation context. If a session is interrupted, re-invoke the skill and inspect the existing files in videos/{name}/ to determine where to resume.
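That inspection can be sketched in shell. The artifact list mirrors the videos/{name}/ layout in this document; the throwaway directory here stands in for a real project:

```shell
# Sketch: infer the resume point from which step outputs already exist.
# In practice, point DIR at videos/{name}/ instead of a temp directory.
DIR=$(mktemp -d)
touch "$DIR/topic_definition.md" "$DIR/topic_research.md" "$DIR/podcast.txt"
LAST=""
for f in topic_definition.md topic_research.md podcast.txt podcast_audio.wav output.mp4; do
  [ -f "$DIR/$f" ] && LAST="$f"
done
echo "resume after: ${LAST:-start}"   # → resume after: podcast.txt
rm -rf "$DIR"
```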
Hard constraints for video production. Visual design remains the agent's creative freedom within these rules:
| Rule | Requirement |
|---|---|
| Single Project | All videos under videos/{name}/ in user's Remotion project. NEVER create a new project per video. |
| 4K Output | 3840×2160, use scale(2) wrapper over 1920×1080 design space |
| Content Width | ≥85% of screen width |
| Bottom Safe Zone | Bottom 100px reserved for subtitles |
| Audio Sync | All animations driven by timing.json timestamps |
| Thumbnail | MUST generate 16:9 (1920×1080) AND 4:3 (1200×900). Centered layout, title ≥120px, icons ≥120px, fill most of canvas. See design-guide.md. |
| Font | PingFang SC / Noto Sans SC for Chinese text |
| Studio Before Render | MUST launch remotion studio for user review. NEVER render 4K until user explicitly confirms ("render 4K", "render final"). |
Load these files on demand — do NOT load all at once:
timing.json format.

project-root/ # Remotion project root
├── src/remotion/ # Remotion source
│ ├── compositions/ # Video composition definitions
│ ├── Root.tsx # Remotion entry
│ └── index.ts # Exports
│
├── public/ # Remotion default (unused — use --public-dir videos/{name}/)
│
├── videos/{video-name}/ # Video project assets
│ ├── workflow_state.json # Workflow progress
│ ├── topic_definition.md # Step 1
│ ├── topic_research.md # Step 2
│ ├── podcast.txt # Step 4: narration script
│ ├── podcast_audio.wav # Step 8: TTS audio
│ ├── podcast_audio.srt # Step 8: subtitles
│ ├── timing.json # Step 8: timeline
│ ├── thumbnail_*.png # Step 7
│ ├── output.mp4 # Step 10
│ ├── video_with_bgm.mp4 # Step 11
│ ├── final_video.mp4 # Step 12: final output
│ └── bgm.mp3 # Background music
│
└── remotion.config.ts
Important: Always use --public-dir and the full output path for Remotion renders:
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/
Video name {video-name}: lowercase English, hyphen-separated (e.g., reference-manager-comparison)
Section name {section}: lowercase English, underscore-separated, matches [SECTION:xxx]
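A minimal sketch of the {video-name} convention (the slugify helper is hypothetical, using only POSIX tr and sed):

```shell
# Hypothetical helper: turn a topic string into a {video-name} slug per the convention above.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-*//; s/-*$//'
}
slugify "Reference Manager Comparison"   # → reference-manager-comparison
```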
Thumbnail naming (16:9 AND 4:3 both required):
| Type | 16:9 | 4:3 |
|---|---|---|
| Remotion | thumbnail_remotion_16x9.png | thumbnail_remotion_4x3.png |
| AI | thumbnail_ai_16x9.png | thumbnail_ai_4x3.png |
Use --public-dir videos/{name}/ for all Remotion commands. Each video's assets (timing.json, podcast_audio.wav, bgm.mp3) stay in its own directory — no copying to public/ needed. This enables parallel renders of different videos.
# All render/studio/still commands use --public-dir
npx remotion studio src/remotion/index.ts --public-dir videos/{name}/
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ --video-bitrate 16M
npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail.png --public-dir videos/{name}/
At Step 1 start:
create videos/{name}/workflow_state.json and use TaskCreate to create tasks per step. Mark each task in_progress on start and completed on finish, updating workflow_state.json AND TaskUpdate together.

1. Define topic direction → topic_definition.md
2. Research topic → topic_research.md
3. Design video sections (5-7 chapters)
4. Write narration script → podcast.txt
5. Collect media assets → media_manifest.json
6. Generate publish info (Part 1) → publish_info.md
7. Generate thumbnails (16:9 + 4:3) → thumbnail_*.png
8. Generate TTS audio → podcast_audio.wav, timing.json
9. Create Remotion composition + Studio preview (mandatory stop)
10. Render 4K video (only on user request) → output.mp4
11. Mix background music → video_with_bgm.mp4
12. Add subtitles (optional) → final_video.mp4
13. Complete publish info (Part 2) → chapter timestamps
14. Verify output & cleanup
15. Generate vertical shorts (optional) → shorts/
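Step 11's BGM mix can be sketched with ffmpeg. This is a hedged sketch: the volume level and filter graph are assumptions, not the skill's canonical command (see CLAUDE.md for that), and it only builds and prints the command rather than running it:

```shell
# Sketch of a Step 11 BGM mix command (volume and filter values are assumptions).
NAME="demo-video"   # placeholder video name
CMD="ffmpeg -i videos/${NAME}/output.mp4 -stream_loop -1 -i videos/${NAME}/bgm.mp3 \
 -filter_complex '[1:a]volume=0.15[bgm];[0:a][bgm]amix=inputs=2:duration=first[a]' \
 -map 0:v -map '[a]' -c:v copy -c:a aac -b:a 192k videos/${NAME}/video_with_bgm.mp4"
echo "$CMD"
```

Here -stream_loop -1 loops the music for the full narration, volume=0.15 keeps it well under the voice, and duration=first ends the mix with the narration track.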
After Step 8 (TTS):
- podcast_audio.wav exists and plays correctly
- timing.json has all sections with correct timestamps
- podcast_audio.srt encoding is UTF-8

After Step 10 (Render):
- output.mp4 resolution is 3840x2160

See CLAUDE.md for the full command reference (TTS, Remotion, FFmpeg, shorts generation).
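The Step 10 resolution checkpoint can be scripted. A sketch assuming ffprobe is available on PATH (the helper name is ours, not part of the skill):

```shell
# Hypothetical checker for the Step 10 checkpoint: prints WIDTHxHEIGHT of a video file.
video_resolution() {
  ffprobe -v error -select_streams v:0 \
    -show_entries stream=width,height -of csv=s=x:p=0 "$1"
}
# Usage: [ "$(video_resolution videos/{name}/output.mp4)" = "3840x2160" ] || echo "not 4K"
```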
Skill learns and applies preferences automatically. See references/troubleshooting.md for commands and learning details.
| File | Purpose |
|---|---|
| user_prefs.json | Learned preferences (auto-created from template) |
| user_prefs.template.json | Default values |
| prefs_schema.json | JSON schema definition |
Final = merge(Root.tsx defaults < global < topic_patterns[type] < current instructions)
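Read the chain as "later sources override earlier ones". In shell parameter-default form, with made-up values for illustration:

```shell
# Illustration of the merge order: current instructions > topic_patterns > global > Root.tsx.
ROOT_DEFAULT="white"    # Root.tsx default
GLOBAL=""               # user_prefs.json global (unset here)
TOPIC="dark"            # topic_patterns[type]
SESSION=""              # current instructions (unset here)
THEME="${SESSION:-${TOPIC:-${GLOBAL:-$ROOT_DEFAULT}}}"
echo "$THEME"   # → dark
```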
| Command | Effect |
|---|---|
| "show preferences" | Show current preferences |
| "reset preferences" | Reset to defaults |
| "save as X default" | Save to topic_patterns |
Full reference: Read references/troubleshooting.md on errors, preference questions, or BGM options.