Generate and edit images using the Gemini Image Generation API with provider fallback. Supports text-to-image, image editing, multi-image composition (up to 14 input images), aspect ratio control, and 1K/2K/4K resolution. Automatically falls back to alternate providers on failure. Use when the user asks to create, generate, draw, or edit images.
Generate and edit images via the bundled Python script with automatic provider fallback.
uv run <skill_dir>/scripts/generate_image.py -p "description" -f "name.png"
With aspect ratio and resolution:
uv run <skill_dir>/scripts/generate_image.py \
-p "cherry blossoms in spring breeze" -f "blossoms.png" -a 9:16 -r 2K
Edit an existing image:
uv run <skill_dir>/scripts/generate_image.py \
-p "make the sky purple" -f "purple-sky.png" -i /path/to/input.png
Compose multiple images (up to 14):
uv run <skill_dir>/scripts/generate_image.py \
-p "combine into a unified scene" -f "combined.png" -i img1.png -i img2.png
| Param | Required | Description |
|---|---|---|
| -p, --prompt | Yes | Image description or editing instruction |
| -f, --filename | Yes | Output filename. Timestamp prefix auto-added |
| -r, --resolution | No | 1K (default), 2K, or 4K. Auto-detected from input images |
| -a, --aspect-ratio | No | 1:1 2:3 3:2 3:4 4:3 4:5 5:4 9:16 16:9 21:9 |
| -i, --input-image | No | Input image path. Repeatable, up to 14 |
| -m, --model | No | Override model ID (bypasses provider config) |
| --base-url | No | Override API endpoint (single-provider mode) |
| -k, --api-key | No | Override API key (single-provider mode) |
| --config | No | Path to provider chain config file |
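The flag table above maps naturally onto an argparse definition. The following is a minimal sketch of such a parser, not the bundled script's actual code: the helper names `build_parser` and `parse_args` are hypothetical, and auto-detecting the resolution from input images is left out.

```python
import argparse

# Values copied from the parameter table.
ASPECT_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"]
MAX_INPUT_IMAGES = 14


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(description="Sketch of the documented flags")
    parser.add_argument("-p", "--prompt", required=True)
    parser.add_argument("-f", "--filename", required=True)
    # Auto-detection from input images is omitted here; 1K is the documented default.
    parser.add_argument("-r", "--resolution", choices=["1K", "2K", "4K"], default="1K")
    parser.add_argument("-a", "--aspect-ratio", choices=ASPECT_RATIOS)
    parser.add_argument("-i", "--input-image", action="append")  # repeatable
    parser.add_argument("-m", "--model")
    parser.add_argument("--base-url")
    parser.add_argument("-k", "--api-key")
    parser.add_argument("--config")
    return parser


def parse_args(argv):
    """Parse argv and enforce the documented limit of 14 input images."""
    parser = build_parser()
    args = parser.parse_args(argv)
    args.input_image = args.input_image or []
    if len(args.input_image) > MAX_INPUT_IMAGES:
        parser.error(f"at most {MAX_INPUT_IMAGES} input images are supported")
    return args
```

Calling `parse_args(["-p", "make the sky purple", "-f", "purple-sky.png", "-i", "in.png"])` returns a namespace with `input_image` as a one-element list; passing more than 14 `-i` flags exits with a usage error.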
By default the script runs in single-provider mode using GEMINI_API_KEY (+ optional GEMINI_BASE_URL). To enable automatic fallback across multiple providers, point the script at a config file via --config or the GEMINI_IMAGE_CONFIG env var.
{
  "providers": [
    {
      "name": "provider-label",
      "base_url": "https://proxy.example.com/gemini",
      "api_key_env": "ENV_VAR_NAME_FOR_API_KEY",
      "model": "gemini-3.1-flash-image"
    },
    {
      "name": "google-direct",
      "base_url": null,
      "api_key_env": "GOOGLE_GEMINI_API_KEY",
      "model": "gemini-3.1-flash-image"
    }
  ]
}
- api_key_env: environment variable name (not the key itself), so the file is safe to commit
- base_url: null: use Google's official endpoint directly
- CLI overrides (-k, --base-url) bypass the chain entirely

When using this skill and no GEMINI_IMAGE_CONFIG is configured yet:
- Add GEMINI_IMAGE_CONFIG to env.vars in ~/.openclaw/openclaw.json; all agents pick it up via exec injection
- Or add export GEMINI_IMAGE_CONFIG=/path/to/config.json to your shell profile

Without a config file, the script falls back to single-provider mode using GEMINI_API_KEY (+ optional GEMINI_BASE_URL), which requires no setup beyond setting those env vars.
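The choice between a provider chain and single-provider mode can be sketched as below. This is an illustration under stated assumptions rather than the script's real logic: `load_provider_chain` is a hypothetical name, skipping providers whose key variable is unset is assumed, and a `model` of `None` stands in for whatever default the script uses.

```python
import json
import os
from typing import Optional


def load_provider_chain(config_path: Optional[str] = None, env=None) -> list:
    """Return usable providers in fallback order (hypothetical helper)."""
    env = os.environ if env is None else env
    # An explicit --config path wins over the GEMINI_IMAGE_CONFIG variable.
    path = config_path or env.get("GEMINI_IMAGE_CONFIG")
    if path:
        with open(path, encoding="utf-8") as fh:
            entries = json.load(fh)["providers"]
        chain = []
        for entry in entries:
            # api_key_env holds the NAME of the variable, not the key itself.
            api_key = env.get(entry["api_key_env"])
            if not api_key:
                continue  # assumption: providers without a key are skipped
            chain.append({
                "name": entry["name"],
                "base_url": entry.get("base_url"),  # None = official endpoint
                "api_key": api_key,
                "model": entry["model"],
            })
        return chain
    # No config file: single-provider mode from GEMINI_API_KEY.
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        return []
    return [{
        "name": "default",  # hypothetical label
        "base_url": env.get("GEMINI_BASE_URL"),
        "api_key": api_key,
        "model": None,  # assumption: the script supplies its own default
    }]
```

An empty list corresponds to the "No providers available" case in the troubleshooting table.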
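Provider errors are grouped by type:
- Transient errors: 429, 500, 502, 503, timeout, connection errors
- Request errors: 400 bad request, safety filters, invalid params
- Auth errors (401/403): skip provider, try next

After generation:
- A MEDIA: line is printed for auto-attachment on supported chat platforms
- The chevereto-upload skill can be used after generation

Plain filename → $GEMINI_IMAGE_OUTPUT_DIR/YYYY-MM/timestamp-name.png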
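The error groups above (429/500/502/503 and network failures, 400-class request problems, and 401/403 auth failures) could drive fallback roughly as follows. This is a sketch, not the script's implementation: `ProviderError`, `classify`, and `generate_with_fallback` are hypothetical names. Only the auth rule (skip provider, try next) is stated in the text; advancing on transient failures and aborting on request errors are assumptions.

```python
from typing import Callable, Optional

TRANSIENT_STATUSES = {429, 500, 502, 503}
AUTH_STATUSES = {401, 403}


class ProviderError(Exception):
    """Hypothetical error type; status is None for timeouts and connection errors."""

    def __init__(self, message: str, status: Optional[int] = None):
        super().__init__(message)
        self.status = status


def classify(error: ProviderError) -> str:
    """Map an error onto one of the three documented groups."""
    if error.status in AUTH_STATUSES:
        return "auth"
    if error.status is None or error.status in TRANSIENT_STATUSES:
        return "transient"
    return "request"


def generate_with_fallback(providers: list, call: Callable[[dict], bytes]) -> bytes:
    """Try each provider in order until one returns image bytes."""
    if not providers:
        raise RuntimeError("No providers available")
    for provider in providers:
        try:
            return call(provider)
        except ProviderError as error:
            if classify(error) == "request":
                # Assumption: another provider would reject the same request.
                raise
            # "auth" and "transient": skip this provider and try the next one.
            continue
    raise RuntimeError("All providers failed")
```

Here `call` stands in for whatever function performs the actual API request for a single provider.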
| Variable | Required | Description |
|---|---|---|
| GEMINI_API_KEY | Yes* | Primary provider API key |
| GOOGLE_GEMINI_API_KEY | No | Fallback provider API key |
| GEMINI_BASE_URL | No | Primary endpoint (single-provider mode only) |
| GEMINI_IMAGE_OUTPUT_DIR | No | Output directory (default: ~/.openclaw/workspace/images) |
| GEMINI_IMAGE_CONFIG | No | Path to providers.json override |
*Required unless --api-key is provided or providers.json has a valid provider.
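The output location for a plain filename combines GEMINI_IMAGE_OUTPUT_DIR (or its default), a YYYY-MM subdirectory, and a timestamp prefix. Below is a minimal sketch of that rule; `resolve_output_path` is a hypothetical helper, and the timestamp format `%Y%m%d-%H%M%S` is an assumption, since the source only says "timestamp".

```python
import os
from datetime import datetime
from pathlib import Path
from typing import Optional

# Default taken from the environment variable table.
DEFAULT_OUTPUT_DIR = "~/.openclaw/workspace/images"


def resolve_output_path(filename: str, now: Optional[datetime] = None, env=None) -> Path:
    """Build <output dir>/YYYY-MM/<timestamp>-<filename> for a plain filename."""
    env = os.environ if env is None else env
    now = now or datetime.now()
    base = Path(env.get("GEMINI_IMAGE_OUTPUT_DIR") or DEFAULT_OUTPUT_DIR).expanduser()
    stamp = now.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format
    return base / now.strftime("%Y-%m") / f"{stamp}-{filename}"
```

Passing `now` and `env` explicitly keeps the function deterministic, which makes it easy to check the resulting path without touching the real environment or clock.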
See references/prompting.md for templates and best practices.
Key principle: describe the scene as a narrative paragraph, not a keyword list.
| Error | Cause | Fix |
|---|---|---|
| No providers available | No API keys set | Set env vars or use -k |
| All providers failed | Every provider returned errors | Check API keys and model names |
| No image was generated | API returned text-only | Rephrase prompt; may trigger safety filters |
| First run slow (~10s) | uv downloading dependencies | Subsequent runs use cache |