Image Generation Skill

This skill generates images using the best AI model for each use case. Model selection is the most important decision — read the dispatch logic carefully before generating.

🧠 Intelligent Dispatch Logic

Always select the model based on what the user actually needs, not on the surface wording of the request.

Decision Tree

Does the request involve MULTIPLE images that share characters, scenes, or story continuity?
  ├─ YES → Use NANO BANANA (Gemini)
  │         Reason: Gemini understands context holistically; supports reference_images
  │         for character/scene consistency across a series (storyboard, comic, sequence)
  │
  └─ NO → Is it a SINGLE standalone image?
            ├─ Artistic / cinematic / painterly / highly detailed?
            │   → Use MIDJOURNEY
            │
            ├─ Photorealistic / portrait / product photo?
            │   → Use FLUX PRO
            │
            ├─ Contains TEXT (logo, poster, sign, infographic)?
            │   → Use IDEOGRAM
            │
            ├─ Vector / icon / flat design / brand asset?
            │   → Use RECRAFT
            │
            ├─ Quick draft / fast iteration (speed priority)?
            │   → Use FLUX SCHNELL (<2s)
            │
            └─ General purpose / balanced?
                → Use FLUX DEV
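
The decision tree above can be sketched as a small lookup function. This is a minimal illustration, not the skill's actual dispatch code: the `Request` class, its fields, the style labels, and `select_model` are hypothetical names introduced here. The returned strings are the model IDs listed in the model table.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical summary of what the user asked for."""
    shares_continuity: bool = False  # several images sharing characters, scenes, or story
    style: str = "general"           # a key of SINGLE_IMAGE_MODELS, or anything else for general use


# Single-image branches of the decision tree, in the order they appear.
SINGLE_IMAGE_MODELS = {
    "artistic": "midjourney",        # artistic / cinematic / painterly / highly detailed
    "photorealistic": "flux-pro",    # photorealistic / portrait / product photo
    "text": "ideogram",              # logo, poster, sign, infographic
    "vector": "recraft",             # vector / icon / flat design / brand asset
    "draft": "flux-schnell",         # quick draft, speed priority
}


def select_model(req: Request) -> str:
    """Return the model ID the decision tree would choose for this request."""
    # The continuity question is asked first and overrides any style preference.
    if req.shares_continuity:
        return "nano-banana"
    # Anything not matched by a specific branch falls through to the general-purpose model.
    return SINGLE_IMAGE_MODELS.get(req.style, "flux-dev")
```

For example, `select_model(Request(shares_continuity=True, style="artistic"))` picks `nano-banana` even though the style is artistic, because the continuity branch sits at the top of the tree.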

Model Capability Matrix

Model	ID	Artistic	Photorealism	Text	Context Continuity	Speed	Cost
Midjourney	`midjourney`	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐	❌ (no context)	~30s	~$0.05
Nano Banana Pro	`nano-banana`	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐⭐	~20s	$0.15
Flux Pro	`flux-pro`	⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐	❌	~5s	~$0.05
Flux Dev	`flux-dev`	⭐⭐⭐	⭐⭐⭐⭐	⭐⭐	❌	~8s	~$0.03
Flux Schnell	`flux-schnell`	⭐⭐	⭐⭐⭐	⭐⭐	❌	<2s	~$0.003
Ideogram v3	`ideogram`	⭐⭐⭐	⭐⭐⭐	⭐⭐⭐⭐⭐	❌	~10s	~$0.08
Recraft v3	`recraft`	⭐⭐⭐⭐	⭐⭐	⭐⭐⭐⭐	❌	~8s	~$0.04
SDXL Lightning	`sdxl`	⭐⭐⭐	⭐⭐⭐	⭐⭐	❌	~3s	~$0.01
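
As a rough illustration of how the ratings in this table could be queried, the sketch below transcribes the star counts into a dictionary and returns the top-rated model for one capability. The `RATINGS` structure and the `strongest` helper are hypothetical, and recording ❌ as 0 is an assumption made here for comparison purposes.

```python
# Star ratings transcribed from the table above; ❌ is recorded as 0 (an assumption).
RATINGS = {
    "midjourney":   {"artistic": 5, "photorealism": 4, "text": 2, "continuity": 0},
    "nano-banana":  {"artistic": 4, "photorealism": 4, "text": 4, "continuity": 5},
    "flux-pro":     {"artistic": 3, "photorealism": 5, "text": 3, "continuity": 0},
    "flux-dev":     {"artistic": 3, "photorealism": 4, "text": 2, "continuity": 0},
    "flux-schnell": {"artistic": 2, "photorealism": 3, "text": 2, "continuity": 0},
    "ideogram":     {"artistic": 3, "photorealism": 3, "text": 5, "continuity": 0},
    "recraft":      {"artistic": 4, "photorealism": 2, "text": 4, "continuity": 0},
    "sdxl":         {"artistic": 3, "photorealism": 3, "text": 2, "continuity": 0},
}


def strongest(capability: str) -> str:
    """Return the model ID with the highest rating for the given capability column."""
    return max(RATINGS, key=lambda model_id: RATINGS[model_id][capability])
```

Each column has a single five-star entry, so the result lines up with the dispatch rules: `strongest("text")` gives `ideogram` and `strongest("continuity")` gives `nano-banana`.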

Understanding Midjourney Output

Field	What it is	When to use
`imageUrl`	A 2×2 grid composite of all 4 images	Send as preview so user can see all options
`imageUrls[0]`	Image 1 (top-left)	Send when user wants image 1
`imageUrls[1]`	Image 2 (top-right)	Send when user wants image 2
`imageUrls[2]`	Image 3 (bottom-left)	Send when user wants image 3
`imageUrls[3]`	Image 4 (bottom-right)	Send when user wants image 4
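
The mapping in this table can be sketched as a helper that returns the grid preview by default and a single image once the user has chosen one. It assumes the generation result is available as a dict carrying the `imageUrl` and `imageUrls` fields; the `pick_image` name and the 1-based `choice` argument are hypothetical.

```python
from typing import Optional


def pick_image(result: dict, choice: Optional[int] = None) -> str:
    """Return the URL to send: the 2x2 grid preview, or one of the four images.

    `choice` is the user's 1-based pick (1 = top-left ... 4 = bottom-right).
    """
    if choice is None:
        # No selection yet: show the composite so the user can see all options.
        return result["imageUrl"]
    if not 1 <= choice <= 4:
        raise ValueError(f"choice must be between 1 and 4, got {choice}")
    # imageUrls is 0-indexed, so image N lives at index N - 1.
    return result["imageUrls"][choice - 1]
```

Keeping the user-facing numbering 1-based and converting to the 0-based index in one place avoids off-by-one mistakes when the user says "I want image 3".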

Environment Variables

Variable	Description
`FAL_KEY`	fal.ai API key (for Flux, Nano Banana, Ideogram, Recraft)
`LEGNEXT_KEY`	Legnext.ai API key (for Midjourney)
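
A minimal sketch of checking for the right key before generating, assuming the keys are read from the process environment. The `KEY_VARIABLE` mapping and `api_key_for` function are hypothetical; the mapping covers only the models this table names, so `sdxl` is deliberately left out.

```python
import os
from typing import Mapping, Optional

# Which variable each model ID needs, per the table above.
KEY_VARIABLE = {
    "midjourney": "LEGNEXT_KEY",
    "nano-banana": "FAL_KEY",
    "flux-pro": "FAL_KEY",
    "flux-dev": "FAL_KEY",
    "flux-schnell": "FAL_KEY",
    "ideogram": "FAL_KEY",
    "recraft": "FAL_KEY",
}


def api_key_for(model_id: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key for a model, failing early with a clear message if it is unset."""
    env = os.environ if env is None else env
    variable = KEY_VARIABLE[model_id]  # raises KeyError for models the table does not cover
    value = env.get(variable)
    if not value:
        raise RuntimeError(f"{variable} is not set; it is required for {model_id}")
    return value
```

Passing `env` explicitly makes the check easy to exercise without touching real credentials, for example `api_key_for("midjourney", {"LEGNEXT_KEY": "test"})`.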
