Name: Gemini
Author: JerichoCruz



Auto-detect dimensions of the appropriate source image:

file "/path/to/body_image.png"
# Returns: ... 1080x799 ...

Calculate ratio (width ÷ height) and map to closest supported:

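| Calculated | Closest | Flag |
|------------|---------|------|
| ~1.0 | Square | `--aspect 1:1` |
| ~1.33 | Landscape | `--aspect 4:3` |
| ~0.75 | Portrait | `--aspect 3:4` |
| ~1.78 | Widescreen | `--aspect 16:9` |
| ~0.56 | Vertical | `--aspect 9:16` |
| ~2.33 | Ultrawide | `--aspect 21:9` |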

Include this in the structured breakdown: show the detected ratio and confirm it with the user.

Body image: astronaut.jpg (1920x1080) → 16:9 widescreen
→ Using --aspect 16:9 to match body image dimensions
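
The ratio-to-flag mapping described above can be sketched in Python. This is a minimal illustration, not the skill's actual code: the function name `closest_aspect` and the choice to compare ratios on a log scale are assumptions, while the six supported ratios come from the mapping table.

```python
import math

# Supported --aspect values, taken from the mapping table.
SUPPORTED_ASPECTS = {
    "1:1": 1 / 1,
    "4:3": 4 / 3,
    "3:4": 3 / 4,
    "16:9": 16 / 9,
    "9:16": 9 / 16,
    "21:9": 21 / 9,
}


def closest_aspect(width: int, height: int) -> str:
    """Return the supported aspect value closest to width / height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    # Compare on a log scale so wide and tall images are treated symmetrically
    # (an assumption of this sketch; plain absolute difference also works).
    return min(
        SUPPORTED_ASPECTS,
        key=lambda name: abs(math.log(ratio / SUPPORTED_ASPECTS[name])),
    )


print(closest_aspect(1920, 1080))  # 16:9
print(closest_aspect(1080, 799))   # 4:3
```

The returned string can be passed directly as the value of `--aspect`.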

## Request Analysis

**Capability**: [detected capability]

### Images Identified
| Role | File | Purpose |
|------|------|---------|
| [role] | `[filename]` | [what it provides] |

### Generated Prompt
[The full prompt that will be sent to the API]

### Parameters
- Model: `[model name]`
- Aspect: [ratio]
- [other relevant params]

### Command Preview
python3 .claude/skills/gemini/scripts/generate_image.py "[prompt]" [flags]

---
**Ready to generate?** Reply "yes" to proceed, or tell me what to change.

## Request Analysis

**Capability**: Face Swap

### Images Identified
| Role | File | Purpose |
|------|------|---------|
| Face source | `photo1.jpg` | Face to transfer |
| Body target | `astronaut.jpg` | Body/pose to use |

### Generated Prompt
{
  "task": "Transfer face from Image 1 onto astronaut body in Image 2",
  "from_image_1": {
    "extract": ["face", "facial features", "expression", "skin tone"]
  },
  "from_image_2": {
    "keep": ["astronaut suit", "pose", "space background", "helmet framing"]
  },
  "integration": {
    "neck_blend": "seamless",
    "lighting": "match harsh space lighting from Image 2"
  }
}

### Parameters
- Model: `gemini-3-pro-image-preview` (recommended for face swaps)
- Aspect: 16:9 (matched to the body image, `astronaut.jpg`)

### Command Preview
python3 .claude/skills/gemini/scripts/generate_image.py '{...}' --image photo1.jpg --image astronaut.jpg --model gemini-3-pro-image-preview --aspect 16:9

---
**Ready to generate?** Reply "yes" to proceed, or tell me what to change.

# Generate new image
python3 .claude/skills/gemini/scripts/generate_image.py "a cute robot"

# Edit existing image (conversational)
python3 .claude/skills/gemini/scripts/generate_image.py "add a red hat" --image ./generated_images/robot.png

# High quality with Pro model
python3 .claude/skills/gemini/scripts/generate_image.py "detailed portrait" --model gemini-3-pro-image-preview --thinking 16384

# User: "Make it brighter"
python3 .claude/skills/gemini/scripts/generate_image.py "make the image brighter with more sunlight" \
  --image ./generated_images/previous_image.png

# User: "Add a coffee cup"
python3 .claude/skills/gemini/scripts/generate_image.py "add a steaming coffee cup to the scene" \
  --image ./generated_images/previous_image.png

User: Generate a robot
Claude: [generates robot, saves to generated_images/gemini_image_20251231_xxx.png]
        Here's your robot! Would you like any changes?

User: Make it hold a coffee cup
Claude: [uses --image flag with previous image]
        python3 ... "add a coffee cup to the robot's hands" --image ./generated_images/gemini_image_20251231_xxx.png
        Done! The robot now has a coffee cup. Anything else?

User: Change background to sunrise
Claude: [uses --image with the updated image]
        Updated with a sunrise background!
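
The conversational flow above boils down to one rule: a fresh request has no `--image`, and a follow-up passes the previous output back with `--image`. A minimal Python sketch of that rule follows. The helper names `detect_mode` and `build_command` are hypothetical, and inferring the mode purely from the number of images is an assumption of this sketch, not a documented behavior of the script.

```python
from typing import List, Sequence

SCRIPT = ".claude/skills/gemini/scripts/generate_image.py"


def detect_mode(images: Sequence[str]) -> str:
    """Assumed rule: 0 images -> generate, 1 -> edit, 2 or more -> reference."""
    if not images:
        return "generate"
    return "edit" if len(images) == 1 else "reference"


def build_command(prompt: str, images: Sequence[str] = ()) -> List[str]:
    """Build the argv list; each image gets its own --image flag."""
    cmd = ["python3", SCRIPT, prompt]
    for path in images:
        cmd += ["--image", path]
    return cmd


# New generation, then an edit that reuses the previous output.
print(build_command("a cute robot"))
print(build_command("add a red hat", ["./generated_images/robot.png"]))
```

Building the command as a list (rather than one shell string) avoids quoting problems when the prompt contains quotes or braces.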

| User Says | Mode | Command |
|-----------|------|---------|
| "Generate a cat" | New | `python3 ... "a cat"` |
| "Create an image of..." | New | `python3 ... "..."` |
| "Make it brighter" | Edit | `python3 ... "brighter" --image <previous>` |
| "Add a hat" | Edit | `python3 ... "add hat" --image <previous>` |
| "Change the background" | Edit | `python3 ... "change bg" --image <previous>` |
| "Try again" / "New version" | New | `python3 ... "<same prompt>"` |
| "Make A look like B" | Reference | `python3 ... "..." --image A.png --image B.png` |

# Pose/expression transfer
python3 .claude/skills/gemini/scripts/generate_image.py \
  "Make the person in the first image adopt the pose and expression of the person in the second image" \
  --image subject.png --image reference.png

# Style transfer
python3 .claude/skills/gemini/scripts/generate_image.py \
  "Apply the artistic style of the second image to the first image" \
  --image photo.png --image artwork.png

# Identity swap
python3 .claude/skills/gemini/scripts/generate_image.py \
  "Put person A's face on person B's body" \
  --image personA.png --image personB.png

| Category | Examples | Best Model |
|----------|----------|------------|
| Photorealism | Headshots, film aesthetics, era-specific portraits | Pro |
| E-commerce | Product shots, virtual try-on, lifestyle photography | Pro |
| Social Media | Thumbnails, viral covers, meme generation | Flash → Pro |
| Interior Design | Floor plan → renders, room visualization | Pro |
| Education | Infographics, memory palace, concept visualization | Flash |
| Photo Editing | Outpainting, crowd removal, background replacement | Pro |
| Creative | Recursive images, aging effects, Droste effect | Pro |
| Workplace | Whiteboard → flowchart, UI sketch → prototype | Flash → Pro |
| Face Swaps | Face-on-body, identity transfer, head swap | Pro |
| Translation | Sign translation, comic localization | Pro |
| Avatars | 3D blind box, pet memes, stylized portraits | Flash |

| Use Case | Model | Why |
|----------|-------|-----|
| Rapid iterations, drafts | `gemini-2.5-flash-image` | Fast (2-5s), lower cost |
| Final output, quality | `gemini-3-pro-image-preview` | Superior quality, 2K |
| Text-heavy images | `gemini-3-pro-image-preview` | Best typography |
| Complex compositions | `gemini-3-pro-image-preview` | Better reasoning with `--thinking` |
| High volume | `gemini-2.5-flash-image` | Lower cost, faster |

| Ratio | Resolution | Best For | Token Cost |
|-------|------------|----------|------------|
| `1:1` | 1024x1024 | Icons, Instagram, squares | Lowest |
| `16:9` | 1344x768 | YouTube thumbnails, widescreen | Medium |
| `9:16` | 768x1344 | TikTok, Reels, Stories | Medium |
| `4:3` | 1184x864 | Presentations | Medium |
| `3:4` | 864x1184 | Portraits | Medium |
| `21:9` | 1536x672 | Cinematic, ultra-wide | Higher |

| Budget | Use Case | When to Use |
|--------|----------|-------------|
| `4096` | Quick | Simple prompts, fast iterations |
| `8192` | Balanced | Default, most use cases |
| `16384` | Complex | Detailed compositions, multiple subjects |
| `32768` | Maximum | Challenging requests, precise text rendering |

# Complex scene with a larger thinking budget (the "Complex" tier, 16384)
python3 .claude/skills/gemini/scripts/generate_image.py \
  "A detailed medieval marketplace with merchants, customers, and goods" \
  --model gemini-3-pro-image-preview \
  --thinking 16384
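
Since the thinking budget applies only to the Pro model, a wrapper can decide whether to emit `--thinking` at all. The sketch below is illustrative: the helper `thinking_flags` and the tier keys (`quick`, `balanced`, `complex`, `maximum`) are hypothetical names, while the budget values and the Pro model name come from the tables in this document.

```python
from typing import List

PRO_MODEL = "gemini-3-pro-image-preview"

# Budget tiers from the thinking budget table; the keys are made up for this sketch.
THINKING_BUDGETS = {
    "quick": 4096,
    "balanced": 8192,
    "complex": 16384,
    "maximum": 32768,
}


def thinking_flags(model: str, tier: str = "balanced") -> List[str]:
    """Return ["--thinking", N] for the Pro model, or [] for any other model."""
    if model != PRO_MODEL:
        return []
    return ["--thinking", str(THINKING_BUDGETS[tier])]


print(thinking_flags(PRO_MODEL, "complex"))        # ['--thinking', '16384']
print(thinking_flags("gemini-2.5-flash-image"))    # []
```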

| Parameter | Values | Default | Description |
|-----------|--------|---------|-------------|
| `--model` | gemini-2.5-flash-image, gemini-3-pro-image-preview | gemini-2.5-flash-image | Model |
| `--image` | file path or URL | none | Image(s) for editing/reference. Supports local paths and URLs. Use multiple times. |
| `--aspect` | 1:1, 16:9, 9:16, 4:3, 3:4, 21:9 | 1:1 | Aspect ratio |
| `--count` | 1-4 | 1 | Number of images |
| `--output` | directory | ./generated_images | Output base directory |
| `--project` | string | none | Project subfolder. Creates: `output/YYYY-MM-DD/project/` |
| `--name` | string | gemini_image | Filename prefix |
| `--retries` | number | 3 | Rate limit retry attempts |
| `--thinking` | 4096-32768 | 8192 | Thinking budget (Pro only) |
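
To make the defaults and allowed values concrete, here is a hypothetical `argparse` mirror of the parameter table. It is a sketch for illustration only and is not taken from `generate_image.py`; the real script may parse its arguments differently. The `output_dir` helper is also an assumption: it follows the `output/YYYY-MM-DD/project/` pattern from the table and assumes the base directory is used as-is when `--project` is omitted.

```python
import argparse
from datetime import date
from pathlib import Path

MODELS = ["gemini-2.5-flash-image", "gemini-3-pro-image-preview"]
ASPECTS = ["1:1", "16:9", "9:16", "4:3", "3:4", "21:9"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser whose flags and defaults follow the parameter table."""
    parser = argparse.ArgumentParser(description="Sketch of the documented flags")
    parser.add_argument("prompt")
    parser.add_argument("--model", choices=MODELS, default=MODELS[0])
    # action="append" lets --image be given multiple times.
    parser.add_argument("--image", action="append", default=[])
    parser.add_argument("--aspect", choices=ASPECTS, default="1:1")
    parser.add_argument("--count", type=int, choices=range(1, 5), default=1)
    parser.add_argument("--output", default="./generated_images")
    parser.add_argument("--project", default=None)
    parser.add_argument("--name", default="gemini_image")
    parser.add_argument("--retries", type=int, default=3)
    parser.add_argument("--thinking", type=int, default=8192)
    return parser


def output_dir(args: argparse.Namespace, today: date) -> Path:
    """Assumed layout: <output>/<YYYY-MM-DD>/<project> when --project is set."""
    base = Path(args.output)
    if args.project:
        return base / today.isoformat() / args.project
    return base


args = build_parser().parse_args(["a cute robot", "--project", "robots"])
print(output_dir(args, date(2025, 12, 31)))
```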

{
  "success": true,
  "prompt": "the prompt used",
  "model": "gemini-2.5-flash-image",
  "mode": "generate | edit | reference",
  "source_images": [],
  "aspect_ratio": "1:1",
  "thinking_budget": null | 8192,
  "images": ["path/to/output.png"],
  "text_response": "optional model commentary",
  "count": 1
}
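
A caller that wants to chain edits needs the path of the image just produced. The following sketch parses a result in the shape shown above and returns the last image path. It assumes the JSON is already available as a string (how the script emits it is not stated here), and the helper name `latest_image` is hypothetical. The sample uses concrete values because placeholders such as `null | 8192` are not valid JSON.

```python
import json

# Concrete sample in the documented shape.
sample = """
{
  "success": true,
  "prompt": "a cute robot",
  "model": "gemini-2.5-flash-image",
  "mode": "generate",
  "source_images": [],
  "aspect_ratio": "1:1",
  "thinking_budget": null,
  "images": ["path/to/output.png"],
  "text_response": "optional model commentary",
  "count": 1
}
"""


def latest_image(raw: str) -> str:
    """Return the last image path from a successful result."""
    result = json.loads(raw)
    if not result.get("success"):
        raise RuntimeError("generation failed")
    images = result.get("images") or []
    if not images:
        raise RuntimeError("result contains no images")
    return images[-1]


print(latest_image(sample))  # path/to/output.png
```

The returned path is what a follow-up edit would pass to `--image`.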

| Original | Enhanced |
|----------|----------|
| "a cat" | "A fluffy orange tabby cat lounging on a velvet cushion, soft window light, shallow depth of field, professional pet photography, warm tones" |
| "robot" | "A friendly humanoid robot with glowing blue eyes, sleek white and silver design, soft studio lighting, futuristic but approachable, 8K detail" |

{
  "subject": {
    "description": "A young professional woman",
    "age": "early 30s",
    "expression": "confident, approachable",
    "clothing": "navy business suit, white shirt"
  },
  "photography": {
    "style": "professional headshot",
    "camera": "Sony A7III, 85mm f/1.4",
    "lighting": "three-point studio lighting with soft key light"
  },
  "background": {
    "setting": "solid dark gray studio backdrop",
    "effects": "subtle vignette, slightly lighter behind subject"
  },
  "quality": "8K, natural skin texture with visible pores, catchlights in eyes"
}
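
When a structured prompt like the one above is assembled programmatically, serializing it with `json.dumps` and passing it as a single argument avoids hand-quoting braces and double quotes in a shell. This is a minimal sketch under that assumption; the helper name `structured_command` is hypothetical, and the prompt is a shortened version of the example above.

```python
import json
from typing import List

SCRIPT = ".claude/skills/gemini/scripts/generate_image.py"

prompt = {
    "subject": {
        "description": "A young professional woman",
        "expression": "confident, approachable",
    },
    "photography": {"style": "professional headshot"},
}


def structured_command(prompt: dict, model: str) -> List[str]:
    """Serialize the prompt to one JSON string and place it in the argv list."""
    return ["python3", SCRIPT, json.dumps(prompt), "--model", model]


cmd = structured_command(prompt, "gemini-3-pro-image-preview")
print(cmd[2])
# To run it: subprocess.run(cmd, check=True)  (requires `import subprocess`)
```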

{
  "face": {
    "preserve_original": true,
    "instruction": "Keep facial features 100% accurate from reference image"
  },
  "changes": {
    "clothing": "professional navy suit with white shirt",
    "background": "clean studio backdrop",
    "lighting": "soft professional lighting"
  },
  "do_not_modify": ["facial features", "expression", "face shape"]
}

| Era | Key Elements |
|-----|--------------|
| 1990s flash | Harsh direct flash, slight overexposure, party vibes, disposable camera look |
| 2000s digital | Early digital artifacts, slight noise, MySpace aesthetic, harsh flash |
| Kodak Portra 400 | Warm skin tones, soft grain, nostalgic, natural colors |
| Fuji Superia | Cooler tones, punchy colors, subtle green cast |

python3 .claude/skills/gemini/scripts/generate_image.py \
  "Expand to 16:9 aspect ratio. Seamlessly extend the scenery on both left and right sides. Match the original lighting, weather, and texture perfectly. Complete any cut-off objects naturally." \
  --image ./cropped_photo.png \
  --aspect 16:9

python3 .claude/skills/gemini/scripts/generate_image.py \
  "Remove all tourists/people in the background behind the main subject. Replace them with realistic background elements that fit the scene. Ensure no blur artifacts remain." \
  --image ./crowded_photo.png

python3 .claude/skills/gemini/scripts/generate_image.py \
  "Recursive image of an orange cat sitting in an office chair holding an iPad. On the iPad is the same cat in the same scene holding the same iPad. Repeated on each iPad." \
  --model gemini-3-pro-image-preview --thinking 16384

python3 .claude/skills/gemini/scripts/generate_image.py \
  '{
    "task": "Using the garment from Image 1 and the model from Image 2, create a realistic full-body fashion photo",
    "fit_details": "Garment must drape naturally, creating realistic folds and wrinkles",
    "preservation": "Preserve original fabric texture, color, and logos with extreme accuracy",
    "integration": "Match ambient lighting, color temperature, and shadow direction",
    "style": "Clean e-commerce lookbook, Canon EOS R5, 50mm f/1.8"
  }' \
  --image garment.png --image model.png \
  --model gemini-3-pro-image-preview

python3 .claude/skills/gemini/scripts/generate_image.py \
  "Convert this hand-drawn whiteboard sketch into a professional corporate flowchart. Use minimalist McKinsey-style aesthetic: clean lines, ample whitespace, blue-and-gray palette. Align all boxes to a strict grid. Connect with straight orthogonal arrows. Transcribe labels into bold sans-serif font." \
  --image whiteboard_sketch.png

python3 .claude/skills/gemini/scripts/generate_image.py \
  '{
    "layout": "promotional poster",
    "text_elements": {
      "title": {"content": "Autumn Special", "style": "elegant gold serif, top center"},
      "offer": {"content": "Buy One Get One Free", "style": "modern badge/sticker"},
      "footer": {"content": "Limited Time Only", "style": "small clean text"}
    },
    "background": "cinematic close-up of steaming cappuccino, rustic wooden table, autumn leaves",
    "quality": "ensure all text is perfectly spelled, centered, and integrated"
  }' \
  --model gemini-3-pro-image-preview --thinking 16384

python3 .claude/skills/gemini/scripts/generate_image.py \
  '{
    "task": "Place face from Image 2 onto body in Image 1",
    "from_image_1": {"keep": ["body", "pose", "clothing", "scene"]},
    "from_image_2": {"extract": ["face", "expression", "skin tone"]},
    "integration": {"neck_blend": "seamless", "lighting": "match Image 1"}
  }' \
  --image body_template.png --image face_source.png \
  --model gemini-3-pro-image-preview

| Error | Solution |
|-------|----------|
| "No .env file found" | Create `.env` with `gemini_api = YOUR_KEY` |
| "Rate limited" | The script retries automatically; wait if it persists. Check the quotas in the Rate Limits & Quotas table. |
| "Bad request" | Revise the prompt (content policy) |
| "Image not found" | Check that the `--image` path exists |

python3 .claude/skills/gemini/scripts/generate_image.py "test image" --model [new-model-name]

| File | Purpose | When to Read |
|------|---------|--------------|
| `SKILL.md` | Workflow, operations | Always (skill trigger) |
| `config.md` | Models, limits, params | When checking current values |
| `capabilities.md` | Capability detection, failure fixes | Complex operations (face swap, style transfer) |
| `examples/prompts.md` | JSON prompt templates | When user needs specific prompt structure |
| `examples/advanced-techniques.md` | Deep-dive guides | Human learning, complex troubleshooting |

| Images | Likely Capabilities |
|--------|---------------------|
| 0 | Text-to-Image, Text Rendering |
| 1 | Image Editing, Object Removal/Addition, Background Replacement, Outpainting |
| 2+ | Face Swap, Pose Transfer, Style Transfer |
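
The image-count heuristic above can be expressed as a small lookup. This is only a sketch of the first narrowing step; the function name `likely_capabilities` is hypothetical, and the final capability still depends on the wording of the request.

```python
from typing import List


def likely_capabilities(image_count: int) -> List[str]:
    """Map the number of attached images to the candidate capabilities."""
    if image_count < 0:
        raise ValueError("image_count cannot be negative")
    if image_count == 0:
        return ["Text-to-Image", "Text Rendering"]
    if image_count == 1:
        return [
            "Image Editing",
            "Object Removal/Addition",
            "Background Replacement",
            "Outpainting",
        ]
    return ["Face Swap", "Pose Transfer", "Style Transfer"]


print(likely_capabilities(2))  # ['Face Swap', 'Pose Transfer', 'Style Transfer']
```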

| Capability | Default Aspect Source |
|------------|-----------------------|
| Face Swap | Body/target image |
| Pose Transfer | Pose reference image |
| Style Transfer | Content image |
| Background Replacement | Subject image |

| Model | Requests/Day (RPD) | Requests/Min (RPM) |
|-------|--------------------|--------------------|
| `gemini-2.5-flash-image` | ~100 | ~15 |
| `gemini-3-pro-image-preview` | ~10 | ~5-10 |
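
With limits this low, callers that batch requests will hit rate limiting quickly. The sketch below shows one common way to handle it, retrying with exponential backoff. It is not the script's own retry logic, which is not described here: the `RateLimited` exception, the helper `call_with_retries`, and the 2-second base delay are all assumptions; only the default of 3 retries mirrors the `--retries` parameter.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical exception standing in for a rate-limit response."""


def call_with_retries(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying up to `retries` times with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == retries:
                raise  # out of retries, surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    # Only reached if retries is negative, so fn was never called.
    raise ValueError("retries must be non-negative")
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.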

| Type | Use Case |
|------|----------|
| Face-on-Body | Put your face on a template/model |
| Body-on-Face | Keep identity, change outfit/pose |
| Full Identity Transfer | Person B in Person A's scenario |
| Head Swap | Different hairstyle (includes hair) |

| What Changed | Update |
|--------------|--------|
| Model names, limits, params | `config.md` (primary), then `generate_image.py` if endpoints changed |
| New capabilities | `capabilities.md` |
| Workflow changes | `SKILL.md` |

Gemini

Gemini Image Generation

Prerequisites

Interactive Workflow (Default)

Step 1: Detect Capability

Step 2: Clarify (Always Ask)

Aspect Ratio Handling

Step 3: Confirm (Structured Breakdown)

Step 4: Refine or Execute

Bypass Mode

Bypass Triggers

Bypass Behavior

Examples

Example Interaction

Quick Start

Conversational Workflow

New Generation

Editing (The Key Feature)

When to Use Each Mode

Reference Images (Style/Pose Transfer)

Use Cases

Model Selection

Aspect Ratio Guide

Thinking Budget (Pro Model Only)

Rate Limits & Quotas

Parameters

Output Format

Prompt Enhancement

Prompt Engineering Best Practices

JSON-Structured Prompts (Recommended)

Face Preservation Pattern

Era-Specific Aesthetics

Advanced Techniques

Composition Rescue (Outpainting)

Smart Inpainting (Object/Crowd Removal)

Recursive/Droste Effect

Virtual Model Try-On

Whiteboard to Professional Diagram

Text-Heavy Image Generation

Face & Identity Swaps

Anti-Patterns to Avoid

Error Handling

Examples

Updating This Skill

1. Fetch Current Documentation

2. Compare and Update

3. Update Files

4. Verify

File Reference
