Name: Video Generation
Author: michaelboeding


Video Generation Skill

Generate videos using AI (Google Veo 3.1, OpenAI Sora).

Capabilities:

🎬 Text-to-Video: Create videos from text descriptions
🖼️ Image-to-Video: Animate a still image, using it as the first frame
🔊 Audio Generation: Dialogue, sound effects, ambient sounds (Veo 3+)
🎭 Reference Images: Guide video content with up to 3 reference images (Veo 3.1)

Prerequisites

Default: Vertex AI (10 requests/minute) ⭐

Vertex AI is the default backend. At 10 requests/minute, its rate limit is roughly 1,400x higher than the AI Studio fallback's 10 requests/day:

# 1. Set your project
export GOOGLE_CLOUD_PROJECT=your-project-id

# 2. Authenticate (opens browser)
gcloud auth application-default login

# 3. Enable the API (one-time)
gcloud services enable aiplatform.googleapis.com

GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
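The rate-limit gap can be checked with quick arithmetic. This is only a sketch: it takes the two limits stated in this skill (10 requests/minute on Vertex AI, 10 requests/day on AI Studio) and assumes the per-minute limit could be sustained for a full day, which real quotas may not allow.

```python
# Rough comparison of the two stated rate limits, normalised to requests per day.
MINUTES_PER_DAY = 24 * 60

vertex_per_day = 10 * MINUTES_PER_DAY  # 10 requests/minute, sustained all day (assumption)
ai_studio_per_day = 10                 # 10 requests/day

ratio = vertex_per_day / ai_studio_per_day
print(f"Vertex AI allows about {ratio:.0f}x more requests per day")
```

The exact ratio under these assumptions is 1440, which is where the rounded "1400x" figure comes from.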

Use Case	Recommended Model	Reason
Best quality	veo-3.1 (default)	Highest quality, audio
Quick iteration	veo-3.1-fast	Faster processing
Batch generation	veo-3.1-fast	Speed matters for multiple clips
Longer videos (>8s)	sora	Supports up to 20s
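The recommendation table above can be read as a small decision rule. The sketch below is one possible encoding of it; `pick_model` and its parameters are hypothetical names for illustration and are not part of the skill's scripts.

```python
def pick_model(duration_seconds: int, prioritize_speed: bool = False) -> str:
    """Hypothetical helper mirroring the recommendation table."""
    if duration_seconds > 8:
        return "sora"          # Veo clips top out at 8 seconds; Sora supports up to 20
    if prioritize_speed:
        return "veo-3.1-fast"  # quick iteration or batch generation
    return "veo-3.1"           # default: highest quality, with audio


print(pick_model(8))                         # veo-3.1
print(pick_model(6, prioritize_speed=True))  # veo-3.1-fast
print(pick_model(15))                        # sora
```

Duration is checked first because it is a hard limit, while speed versus quality is a preference.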

# Text-to-video with Veo 3.1; put spoken dialogue in quotes inside the prompt
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --prompt "your enhanced prompt with 'dialogue in quotes'" \
  --model "veo-3.1" \
  --duration 8 \
  --aspect-ratio "16:9" \
  --resolution "720p"

# Image-to-video: the image is used as the first frame
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --prompt "The cat slowly opens its eyes and yawns" \
  --image "/path/to/cat.jpg" \
  --model "veo-3.1" \
  --duration 8

# Faster generation with veo-3.1-fast; remaining options are left at their defaults
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --prompt "your prompt" \
  --model "veo-3.1-fast"

# Longer clip with Sora (up to 20 seconds)
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/sora.py \
  --prompt "your enhanced prompt" \
  --duration 20 \
  --resolution "1080p"

# List the available Veo models
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py --list-models
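When the commands above need to be issued from Python rather than typed by hand, the argument list can be assembled programmatically. The following is a minimal sketch: `build_veo_command` is a hypothetical helper, it uses only flags that appear in the commands above, and it assumes those flags can be combined freely, which the source does not confirm.

```python
import os
import shlex
from typing import List, Optional


def build_veo_command(prompt: str, model: str = "veo-3.1", duration: int = 8,
                      aspect_ratio: str = "16:9", resolution: str = "720p",
                      image: Optional[str] = None) -> List[str]:
    """Hypothetical helper: assemble argv for veo.py from the flags shown above."""
    root = os.environ.get("CLAUDE_PLUGIN_ROOT", ".")
    script = os.path.join(root, "skills", "video-generation", "scripts", "veo.py")
    cmd = ["python3", script, "--prompt", prompt, "--model", model,
           "--duration", str(duration), "--aspect-ratio", aspect_ratio,
           "--resolution", resolution]
    if image is not None:
        cmd += ["--image", image]  # image-to-video: image becomes the first frame
    return cmd


cmd = build_veo_command("The cat slowly opens its eyes and yawns",
                        image="/path/to/cat.jpg")
print(shlex.join(cmd))  # shell-quoted form, for inspection only
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it. Building a list instead of a shell string avoids quoting problems when the prompt itself contains quoted dialogue.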

# First, generate initial clip
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --prompt "A person walks through a forest at sunrise" \
  --duration 8

# Extend it with new content (adds ~7 seconds)
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --extend veo_veo-3.1_20260104_120000.mp4 \
  --prompt "Continue walking, discover a hidden stream"

# Extend 5 times (adds ~35 seconds of continuation)
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --extend initial_clip.mp4 \
  --prompt "Keep exploring the forest, encounter wildlife" \
  --extend-times 5
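The extension comments above imply simple arithmetic for the final clip length. This sketch assumes each extension adds about 7 seconds, as stated in those comments; actual added length may vary. `extended_length` is a hypothetical name.

```python
def extended_length(initial_seconds: int, extend_times: int,
                    seconds_per_extension: int = 7) -> int:
    """Approximate total length after extending a clip a number of times."""
    return initial_seconds + extend_times * seconds_per_extension


print(extended_length(8, 1))  # 15: an 8-second clip extended once
print(extended_length(8, 5))  # 43: an 8-second clip with --extend-times 5
```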

[
  {"prompt": "Scene 1: Cinematic hero shot of wireless earbuds on dark surface", "duration": 6, "output": "scene1_hero.mp4"},
  {"prompt": "Scene 2: Sound waves visualization, person enjoying music", "duration": 8, "output": "scene2_sound.mp4"},
  {"prompt": "Scene 3: Close-up of earbud in ear, person exercising", "duration": 8, "output": "scene3_comfort.mp4"},
  {"prompt": "Scene 4: Lifestyle montage, various settings", "duration": 8, "output": "scene4_lifestyle.mp4"},
  {"prompt": "Scene 5: Product with logo on clean background", "duration": 4, "output": "scene5_cta.mp4"}
]
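A scene list like the one above can be sanity-checked before it is submitted as a batch. The sketch below is an assumption about what is worth checking (non-empty prompts, durations of 4, 6, or 8 seconds as listed in the options table, and unique output names); `check_scenes` is a hypothetical helper, not something the skill provides.

```python
import json

ALLOWED_DURATIONS = {4, 6, 8}  # per the options table in this skill

scenes = [
    {"prompt": "Scene 1: Cinematic hero shot of wireless earbuds on dark surface",
     "duration": 6, "output": "scene1_hero.mp4"},
    {"prompt": "Scene 5: Product with logo on clean background",
     "duration": 4, "output": "scene5_cta.mp4"},
]


def check_scenes(scenes: list) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    seen_outputs = set()
    for i, scene in enumerate(scenes, start=1):
        if not scene.get("prompt"):
            problems.append(f"scene {i}: missing prompt")
        if scene.get("duration", 8) not in ALLOWED_DURATIONS:
            problems.append(f"scene {i}: duration must be 4, 6, or 8")
        output = scene.get("output")
        if output in seen_outputs:
            problems.append(f"scene {i}: duplicate output {output}")
        if output:
            seen_outputs.add(output)
    return problems


problems = check_scenes(scenes)
scenes_json = json.dumps(scenes, indent=2)  # write this text to scenes.json
print(problems or "scenes look valid")
```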

# Generate all scenes in scenes.json in parallel
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --batch scenes.json

# Same batch, with a custom worker count
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
  --batch scenes.json \
  --max-workers 3
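To illustrate what a worker count means in a parallel batch, here is a minimal sketch using Python's standard `concurrent.futures`. It is not how `veo.py` is known to be implemented; `fake_generate` is a stand-in for a real generation call and `run_batch` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_generate(scene: dict) -> str:
    """Stand-in for a real generation call; returns the output filename."""
    return scene["output"]


def run_batch(scenes: list, max_workers: int = 3) -> list:
    """Run up to max_workers generations at once; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_generate, scenes))


scenes = [{"prompt": "Scene 1", "output": "scene1.mp4"},
          {"prompt": "Scene 2", "output": "scene2.mp4"},
          {"prompt": "Scene 3", "output": "scene3.mp4"},
          {"prompt": "Scene 4", "output": "scene4.mp4"}]
print(run_batch(scenes, max_workers=3))
```

Threads suit this kind of workload because each job spends most of its time waiting on a remote API rather than using the local CPU.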

Option	Description	Default
`prompt`	Video description (required)	-
`model`	veo-3.1, veo-3.1-fast, etc.	veo-3.1
`duration`	4, 6, or 8 seconds	8
`aspect_ratio`	"16:9" or "9:16"	"16:9"
`resolution`	"720p" or "1080p"	"720p"
`image`	Path to image for image-to-video	-
`negative_prompt`	What to avoid	-
`output`	Custom output filename	auto-generated
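The defaults in the table above can be pictured as a dictionary merged under each scene's own settings. This sketch assumes omitted keys fall back to the listed defaults; `with_defaults` is a hypothetical helper, and the script's actual merging logic is not shown in the source.

```python
DEFAULTS = {
    "model": "veo-3.1",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "720p",
}


def with_defaults(scene: dict) -> dict:
    """Fill omitted options with the table's defaults; prompt is required."""
    if not scene.get("prompt"):
        raise ValueError("prompt is required")
    return {**DEFAULTS, **scene}  # scene values override defaults


full = with_defaults({"prompt": "Product with logo on clean background", "duration": 4})
print(full)
```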

Feature	Veo 3.1 (Default)	Veo 3.1 Fast	Sora
Provider	Google	Google	OpenAI
API Key	`GOOGLE_API_KEY`	`GOOGLE_API_KEY`	`OPENAI_API_KEY`
Max duration	8 seconds	8 seconds	20 seconds
Resolution	720p, 1080p	720p, 1080p	Up to 1080p
Aspect ratios	16:9, 9:16	16:9, 9:16	16:9, 9:16, 1:1
Audio (dialogue, SFX)	✅ Yes	✅ Yes	❌ No
Image-to-video	✅ Yes	✅ Yes	✅ Yes
Reference images	✅ Up to 3	✅ Up to 3	❌ No
Video extension	✅ Yes	✅ Yes	❌ No
Batch generation	✅ Yes	✅ Yes	❌ No
Speed	Baseline (quality-first)	~2x faster	Slower
Best for	Professional	Batch workflows	Longer videos
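The comparison table can also be used as a lookup when a request combines several requirements. The sketch below transcribes a few rows into plain data; the feature keys and the `candidates` function are hypothetical names chosen for illustration.

```python
# Transcribed from the comparison table above.
CAPABILITIES = {
    "veo-3.1":      {"audio", "image_to_video", "reference_images", "extension", "batch"},
    "veo-3.1-fast": {"audio", "image_to_video", "reference_images", "extension", "batch"},
    "sora":         {"image_to_video"},
}
MAX_DURATION = {"veo-3.1": 8, "veo-3.1-fast": 8, "sora": 20}


def candidates(required: set, duration: int) -> list:
    """Models that offer every required feature at the requested clip length."""
    return [model for model, features in CAPABILITIES.items()
            if required <= features and duration <= MAX_DURATION[model]]


print(candidates({"audio"}, 8))            # ['veo-3.1', 'veo-3.1-fast']
print(candidates({"image_to_video"}, 15))  # ['sora']
print(candidates({"audio"}, 15))           # []
```

The empty result in the last call reflects the table: no single clip is both longer than 8 seconds and generated with audio, so such a request would need the extension workflow instead.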


Model	Description	Best For
`veo-3.1`	Highest quality (default)	Professional videos, dialogue, reference images
`veo-3.1-fast`	Faster processing	Quick iterations, batch generation

Question	Determines
Image	Image-to-video vs text-to-video
Audio	With/without audio generation
Model	Quality and speed tradeoff
Duration	Clip length
Format	Aspect ratio and resolution

Approach	Result	Use Case
Extension	True continuity, same characters/scene	Long continuous shots
Stitching	Separate clips with transitions	Scene changes, montages
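For the stitching approach, one common option outside this skill is ffmpeg's concat demuxer, which reads a text file listing the clips in order. The sketch below only builds that list; using ffmpeg here is an assumption, since the source does not say which tool performs the stitching, and `concat_list` is a hypothetical helper.

```python
def concat_list(clips: list) -> str:
    """Build the text for an ffmpeg concat-demuxer list file, one clip per line."""
    return "".join(f"file '{name}'\n" for name in clips)


listing = concat_list(["scene1_hero.mp4", "scene2_sound.mp4", "scene5_cta.mp4"])
print(listing)
# Save the text as clips.txt, then run:
#   ffmpeg -f concat -safe 0 -i clips.txt -c copy stitched.mp4
```

Stream copy (`-c copy`) joins clips with hard cuts and expects them to share codec and resolution. Crossfades or other transitions would need re-encoding with ffmpeg filters.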

Scenes	Sequential	Parallel (5 workers)	Speedup
3	~9 min	~3 min	3x
5	~15 min	~3 min	5x
10	~30 min	~6 min	5x
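The figures in the speed table are consistent with a simple model: about 3 minutes per clip, with clips processed in waves of up to 5 at a time. The sketch below reproduces the table under that assumption; real timings depend on the API and will vary. `estimate` is a hypothetical name.

```python
import math

MINUTES_PER_CLIP = 3  # implied by the table: 3 scenes take ~9 min sequentially


def estimate(scenes: int, workers: int = 5) -> tuple:
    """Return (sequential minutes, parallel minutes, speedup)."""
    sequential = scenes * MINUTES_PER_CLIP
    parallel = math.ceil(scenes / workers) * MINUTES_PER_CLIP
    return sequential, parallel, sequential / parallel


for n in (3, 5, 10):
    print(n, estimate(n))
# 3 (9, 3, 3.0)
# 5 (15, 3, 5.0)
# 10 (30, 6, 5.0)
```

The speedup is capped by the smaller of the scene count and the worker count, which is why 3 scenes gain only 3x with 5 workers.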

Video Generation Skill

Prerequisites

Default: Vertex AI (10 requests/minute) ⭐

Fallback: AI Studio (10 requests/day)

For Sora (OpenAI)

Available Models

Google Veo Models

OpenAI Sora

Workflow

Step 1: Gather Requirements (REQUIRED)

Question Flow

Quick Reference

Step 2: Craft the Prompt

Step 3: Select the Model

Step 4: Generate the Video

Video Extension (For Long-Form Continuity)

Batch Generation (Parallel)

Create a scenes.json file:

Generate all scenes in parallel:

With custom worker count:

Batch config options per video:

Speed comparison:

Step 5: Deliver the Result

Error Handling

Prompt Engineering Tips

For Audio (Veo 3.1)

For Cinematic Quality

For Realistic Motion

For Image-to-Video

Negative Prompts

API Comparison
