Generate images, videos, audio, and music using 50+ fal.ai AI models including Flux, Veo, Kling, Minimax, ElevenLabs, and more. Supports text-to-image, image-to-image, text-to-video, image-to-video, text-to-speech, text-to-music, audio transcription, voice changing, and video-to-audio generation.
A Claude Code skill for generating images, videos, audio, music, and more using the fal.ai API. The skill wraps a Go CLI binary that communicates with fal.ai's async queue API. Every action is a JSON object piped into the CLI binary, which submits the job, polls for completion, and returns the result as JSON.
Once the API key is configured, this skill lets you:
All generation is async: the CLI submits a job to fal.ai's queue, polls until completion, and returns the result with media URLs. Files can optionally be downloaded to a local directory.
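The submit/poll/fetch cycle can be sketched as plain shell, assuming the check-job and get-job-result actions. `INVOKE` is a stand-in (not part of the skill) so the sketch runs anywhere; point it at the real invoke.sh to run against fal.ai, and note the `response_url` is a placeholder, not a real queue URL.

```shell
# Sketch of the async cycle the CLI performs internally. INVOKE is a
# stand-in (`cat` echoes the payload back) so the snippet runs without
# the skill installed; point it at the real invoke.sh in practice.
INVOKE="${INVOKE:-cat}"

# 1. Submit a generation job; the real CLI blocks until completion.
submit=$(echo '{"action":"generate-image","model":"flux/schnell","prompt":"a cat"}' | $INVOKE)

# 2. For long jobs, poll the queue (response_url here is a placeholder).
status=$(echo '{"action":"check-job","response_url":"https://queue.example/job/123"}' | $INVOKE)

# 3. Fetch the final result once the job reports COMPLETED.
result=$(echo '{"action":"get-job-result","response_url":"https://queue.example/job/123"}' | $INVOKE)
echo "$result"
```

In normal use the generate-* actions handle this whole cycle in one call; the explicit check-job / get-job-result pair only matters for jobs that outlive the timeout.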
The skill provides three helper scripts at ~/.claude/skills/falai/scripts/:
- detect-config.sh — checks if the API key is configured
- invoke.sh — runs a JSON action against the CLI binary
- init-config.sh — interactive API key setup (for non-Claude users)

Before running any action, check if the API key is configured:
~/.claude/skills/falai/scripts/detect-config.sh
On success, this prints the config file path. On failure, it exits non-zero and the first-run flow should be used.
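A minimal gate around that check might look like the following sketch. `DETECT` is a stand-in variable (an assumption, not part of the skill) so the snippet runs without the skill installed; substitute the real detect-config.sh path.

```shell
# DETECT stands in for ~/.claude/skills/falai/scripts/detect-config.sh
# so the sketch runs without the skill installed (`true` exits 0).
DETECT="${DETECT:-true}"

if cfg_path=$($DETECT); then
  # Success: detect-config.sh prints the config file path on stdout.
  gate="ready"
  echo "config found: ${cfg_path:-<path>}"
else
  # Non-zero exit: no API key yet; start the first-run flow.
  gate="first-run"
fi
```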
If detect-config.sh exits non-zero, walk the user through setup:
Announce the flow: explain that the fal.ai skill needs an API key to generate images, videos, and audio.
Collect the API key via AskUserQuestion: ask for the fal.ai API key. Direct the user to https://fal.ai/dashboard/keys if they don't have one. NEVER echo or display the API key in output. The API key MUST be treated as a secret.
Write the config file to ~/.config/falai/config.yaml:
mkdir -p ~/.config/falai && chmod 700 ~/.config/falai
cat > ~/.config/falai/config.yaml << EOF
api_key: "THE_KEY_HERE"
timeout: 300
EOF
chmod 600 ~/.config/falai/config.yaml
Verify with a ping:
echo '{"action":"ping"}' | ~/.claude/skills/falai/scripts/invoke.sh
Expect {"ok":true,...} with "status":"ok". If the response shows AUTH_ERROR, the key is invalid.
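One way to check the reply without ever echoing the key is to grep the captured response. `RESPONSE` below is a hypothetical example reply used for illustration, not real CLI output:

```shell
# RESPONSE is a hypothetical captured reply shown for illustration; in
# practice capture the real one, e.g.:
#   RESPONSE=$(echo '{"action":"ping"}' | ~/.claude/skills/falai/scripts/invoke.sh)
RESPONSE='{"ok":true,"status":"ok"}'

if printf '%s' "$RESPONSE" | grep -q '"status":"ok"'; then
  verdict="verified"
elif printf '%s' "$RESPONSE" | grep -q 'AUTH_ERROR'; then
  verdict="invalid-key"   # collect a new key and re-run the first-run flow
else
  verdict="unexpected"    # consult reference/errors.md
fi
echo "$verdict"
```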
Report ready and continue with the original request.
Once configured, every action is a single invoke.sh call. Pass the JSON payload on stdin or as the first argument.
# JSON on stdin
echo '{"action":"ping"}' | ~/.claude/skills/falai/scripts/invoke.sh
# JSON as argument
~/.claude/skills/falai/scripts/invoke.sh '{"action":"generate-image","model":"flux/schnell","prompt":"a cat in space"}'
# With download
~/.claude/skills/falai/scripts/invoke.sh '{"action":"generate-image","model":"flux/schnell","prompt":"sunset over mountains","download":true,"output_dir":"./output"}'
Full action reference with required fields, optional fields, and response shapes: reference/actions.md. Copy-paste examples: reference/examples.md.
Rule: before constructing a payload for an action you have not used in the current session, consult reference/actions.md. Do not guess field names.
Supported actions:
- ping, list-models
- generate-image (text-to-image and image-to-image)
- generate-video (text-to-video and image-to-video)
- generate-speech (text-to-speech and voice changing)
- generate-music (text-to-music)
- transcribe (audio-to-text)
- check-job, get-job-result (async job management)

When the user asks to generate content without specifying a model, use AskUserQuestion to present appropriate choices. Here are the recommended defaults by use case:
Quick image generation (fastest):
- flux/schnell — fast, good quality, best default for quick iteration
- fast-sdxl — very fast, lower quality

High-quality images:

- flux-pro/v1.1-ultra — highest quality, supports aspect ratios
- flux-2-pro — professional next-gen quality
- hidream-i1-full — HiDream 17B, detailed images, supports negative prompts

Image editing:

- flux-2-pro/edit — edit images with text instructions (requires image_urls)
- flux-2/edit — affordable image editing (requires image_urls)

Style transfer:

- ghiblify — Studio Ghibli anime style (requires image_url)
- cartoonify — Pixar-style cartoon (requires image_url)
- star-vector — SVG vectorization (requires image_url)

Video from image (image-to-video):

- veo2 — Google Veo2, high quality
- veo3 — Google Veo3, latest generation
- kling-video-v3-image — Kling V3, very high quality
- minimax/video-01-subject-reference — preserves subject identity
- luma-dream-machine — Luma Labs
- seedance-2.0-image — ByteDance Seedance

Video from text (text-to-video):

- minimax/video-01-director — with camera movement control
- minimax/video-01 — high resolution, high frame rate
- minimax/hailuo-02 — latest Minimax model
- kling-video-v3-text — Kling V3 from text
- hunyuan-video — Tencent Hunyuan
- seedance-2.0-text — ByteDance Seedance from text

Text-to-speech:

- elevenlabs/tts/turbo-v2.5 — ElevenLabs, highest quality
- minimax-tts/text-to-speech — Minimax TTS
- chatterbox-tts — Chatterbox, open model
- elevenlabs-dialog — ElevenLabs multi-speaker dialog

Music:

- minimax-music-v2 — Minimax music generation
- stable-audio-25 — Stability AI music

Transcription:

- elevenlabs/speech-to-text/scribe-v2 — ElevenLabs with speaker diarization

Voice changing:

- elevenlabs-voice-changer — change voice characteristics (requires audio_url)

Video to audio:

- mmaudio-v2 — generate audio from video content

When multiple models fit the user's request, present 2-3 options via AskUserQuestion with brief descriptions of the tradeoffs (speed vs quality, cost, style).
Exit codes from invoke.sh:
- 0 — success. stdout contains the JSON result.
- 1 — bad input, missing action, or binary not found.
- 2 — config error (missing API key).
- 3 — API error (fal.ai returned an error, job failed, timeout).

Error responses have the shape:
{"ok": false, "error": "description", "code": "ERROR_CODE"}
Error codes: INVALID_INPUT, INVALID_MODEL, UNKNOWN_ACTION, CONFIG_ERROR, AUTH_ERROR, API_ERROR, HANDLER_ERROR, MISSING_BINARY.
When you see an error, consult reference/errors.md for cause and fix. Do not retry blindly.
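These exit codes can be branched on directly, as in the sketch below. `INVOKE` is a stand-in so the snippet runs anywhere, and the messages are illustrative; only the case labels come from the documented codes.

```shell
# INVOKE stands in for ~/.claude/skills/falai/scripts/invoke.sh so the
# sketch runs anywhere; `cat` echoes the payload back with exit 0.
INVOKE="${INVOKE:-cat}"

run_action() {
  printf '%s' "$1" | $INVOKE
  rc=$?
  case $rc in
    0) ;;                                                        # JSON on stdout
    1) echo "bad input or missing binary" >&2 ;;
    2) echo "config error: run the first-run setup" >&2 ;;
    3) echo "API error: see reference/errors.md, do not retry blindly" >&2 ;;
  esac
  return $rc
}

run_action '{"action":"ping"}'
```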
Generation requests cost real money against the user's fal.ai balance, so confirm before running expensive operations. A user asking to "generate an image" is implicit consent for a single generation; batch generation or expensive model choices should be confirmed explicitly.
When the fal package in braibot is updated with new models, rebuild the skill:
cd /path/to/claude-skill-falai
go get -u github.com/karamble/braibot@latest
go build -o skill/bin/falai-cli ./cmd/falai-cli/
./install.sh --force
After updating, run list-models to see newly available models. New models registered in the fal package are automatically available without any CLI code changes.
Generate a quick image:
1. list-models (optional — to show the user available image models)
2. generate-image with model: "flux/schnell", prompt, optional download: true

Generate a high-quality image:

1. AskUserQuestion (flux-pro/v1.1-ultra, flux-2-pro, or hidream-i1-full)
2. generate-image with the chosen model and prompt

Create a video from an image:

1. generate-video with model, image_url, prompt (for motion guidance)

Text-to-speech:

1. generate-speech with model: "elevenlabs-tts-turbo-v2.5", text: "...", optional voice settings in options

Transcribe audio:

1. transcribe with model: "elevenlabs-scribe-v2", audio_url: "..."

Check a long-running job:

1. response_url from the initial submission (in stderr progress output)
2. check-job with response_url to see current status
3. get-job-result once status is COMPLETED
4. download: true to save files locally for permanent access.

Actions and models:
- reference/actions.md — full action reference
- reference/models.md — model catalog by category
- reference/examples.md — copy-paste JSON payloads
- reference/errors.md — error codes and fixes
- reference/troubleshooting.md — common issues

Configuration:

- templates/config.yaml.example — config file schema

Helper scripts:

- scripts/detect-config.sh — checks config exists
- scripts/invoke.sh — runs JSON actions
- scripts/init-config.sh — interactive API key setup