Guide to image generation and editing in MassGen. Use when creating images, editing existing images, iterating on image designs, or choosing between image backends (OpenAI, Google Gemini/Imagen, Grok, OpenRouter).
Generate images by calling generate_media with mode="image". When backend_type is not set, the system auto-selects a backend based on which API keys are available, following the priority order in the backend table.
# Simple text-to-image (auto-selects backend)
generate_media(prompt="A cat in space", mode="image")
# Specify backend and quality
generate_media(prompt="A logo for a coffee shop", mode="image",
               backend_type="openai", quality="high")
# Batch generation (parallel)
generate_media(prompts=["sunset over ocean", "mountain landscape", "city at night"],
               mode="image", max_concurrent=3)
| Backend | Default Model | Strengths | API Key |
|---|---|---|---|
| Google (priority 1) | gemini-3.1-flash-image-preview (Nano Banana 2) | Fast, flexible sizes, image editing, multi-turn | GOOGLE_API_KEY or GEMINI_API_KEY |
| OpenAI (priority 2) | gpt-5.4 | High quality, transparent backgrounds, continuation via response ID | OPENAI_API_KEY |
| Grok (priority 3) | grok-imagine-image | 1k resolution, continuation via stored data URI | XAI_API_KEY |
| OpenRouter (priority 4) | google/gemini-3.1-flash-image-preview | Access to multiple models via single API | OPENROUTER_API_KEY |
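The priority column above can be read as "use the first backend whose API key is set". The sketch below illustrates that reading only; `pick_image_backend` and `BACKEND_KEYS` are hypothetical names, not part of MassGen, and the real selection logic may take other factors into account.

```python
import os

# Priority order and environment variable names are taken from the table above.
BACKEND_KEYS = [
    ("google", ("GOOGLE_API_KEY", "GEMINI_API_KEY")),
    ("openai", ("OPENAI_API_KEY",)),
    ("grok", ("XAI_API_KEY",)),
    ("openrouter", ("OPENROUTER_API_KEY",)),
]


def pick_image_backend(env=None):
    """Return the highest-priority backend with a non-empty API key, else None.

    Hypothetical helper: an illustration of priority-based selection,
    not MassGen's actual implementation.
    """
    env = os.environ if env is None else env
    for backend, key_names in BACKEND_KEYS:
        if any(env.get(name) for name in key_names):
            return backend
    return None
```

Passing a plain dict instead of relying on `os.environ` makes it easy to check which backend a given set of keys would select.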
| Parameter | Description | Example |
|---|---|---|
| prompt | Text description of the image | "A watercolor painting of mountains" |
| backend_type | Force a specific backend | "google", "openai", "grok", "openrouter" |
| model | Override default model | "gemini-3-pro-image-preview" for studio quality |
| quality | Image quality (OpenAI) | "low", "medium", "high", "auto" |
| size | Image dimensions | See backends reference |
| aspect_ratio | Aspect ratio | "16:9", "1:1", "4:5" |
| input_images | Source images for image-to-image editing | ["photo.jpg"] |
| continue_from | Continuation ID for multi-turn editing | result["continuation_id"] |
Transform existing images by providing input_images:
generate_media(
    prompt="Make it look like a watercolor painting",
    mode="image",
    input_images=["photo.jpg"]
)
Supported backends for image-to-image: Google (Gemini), OpenAI, Grok. If your current backend doesn't support image-to-image, the system auto-selects one that does.
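The fallback described above could be sketched as follows. This is an illustration under assumptions: `backend_for_edit` is a hypothetical helper, and the fallback order (Google, then OpenAI, then Grok) is assumed to follow the priority table rather than confirmed by the source.

```python
# Backends listed above as supporting image-to-image (OpenRouter is not listed).
IMAGE_TO_IMAGE_BACKENDS = ("google", "openai", "grok")


def backend_for_edit(current_backend, available_backends):
    """Keep the current backend if it can edit images, else fall back.

    Hypothetical helper; the fallback order is an assumption.
    """
    if current_backend in IMAGE_TO_IMAGE_BACKENDS:
        return current_backend
    for backend in IMAGE_TO_IMAGE_BACKENDS:
        if backend in available_backends:
            return backend
    raise ValueError("no available backend supports image-to-image editing")
```

For example, with `current_backend="openrouter"` and only Grok otherwise available, the sketch falls back to `"grok"`.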
Iteratively refine images using continue_from:
# First generation
result = generate_media(prompt="A logo for a coffee shop", mode="image")
# Refine using the continuation ID
result2 = generate_media(
    prompt="Make the text larger and add a cup icon",
    mode="image",
    continue_from=result["continuation_id"]
)
Each backend uses a different continuation mechanism:
previous_response_id (stateless)

Continuation only works for single-image generation (not batch).
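A chain of refinements can be expressed as a small loop that feeds each result's continuation ID into the next call. The sketch below is illustrative: `refine_image` is a hypothetical wrapper, the generator is passed in as a callable so the example stays self-contained, and it assumes every result (not just the first) exposes a "continuation_id" key.

```python
def refine_image(generate, first_prompt, follow_up_prompts):
    """Generate one image, then apply each follow-up prompt via continue_from.

    `generate` is any callable with generate_media-style keyword arguments.
    Assumption: each returned result is a mapping with a "continuation_id" key.
    """
    result = generate(prompt=first_prompt, mode="image")
    results = [result]
    for prompt in follow_up_prompts:
        # Single-image calls only: continuation is not available for batches.
        result = generate(
            prompt=prompt,
            mode="image",
            continue_from=result["continuation_id"],
        )
        results.append(result)
    return results
```

In practice this might be called as `refine_image(generate_media, "A logo for a coffee shop", ["Make the text larger", "Add a cup icon"])`.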
Google supports two API paths. Gemini (Nano Banana 2) is the default and recommended for most use cases. Imagen is only needed for advanced reference-image editing features.
- Gemini (gemini-*): generate_content(), for text-to-image, image editing via input_images, and multi-turn continuation
- Imagen (imagen-*): generate_images() / edit_image(), for text-to-image with negative_prompt/seed/guidance_scale, plus style transfer, control editing, and subject consistency via reference images

For studio-quality precision and text rendering, use model="gemini-3-pro-image-preview" (Pro-tier).
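The split between the two Google paths follows the model-name prefix. A minimal sketch of that routing, assuming the prefix alone decides the path and that reference-image editing on Imagen goes through edit_image(); `google_image_api_path` is a hypothetical helper, not a MassGen or Google SDK function:

```python
def google_image_api_path(model, reference_editing=False):
    """Return the name of the API call a Google image model would be routed to.

    Hypothetical helper: routing by prefix is an illustration of the two
    paths described above, not the actual dispatch code.
    """
    if model.startswith("gemini-"):
        return "generate_content"
    if model.startswith("imagen-"):
        # Assumption: reference-image features use edit_image(),
        # plain text-to-image uses generate_images().
        return "edit_image" if reference_editing else "generate_images"
    raise ValueError(f"not a recognized Google image model: {model!r}")
```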
extra_params reference: See references/extra_params.md