Generate and review scientific figures using AI. Creates publication-quality figures aligned with manuscript content using Gemini for generation and Claude vision for review. Works well with submission-prep and manuscript-writing skills.
Generate, review, and align scientific figures with manuscript content. Uses Gemini for image generation and Claude's vision for quality review following a plan-generate-report pipeline.
Automated revision loops (generate → critique → re-generate) do not improve results. Gemini generates a new image each time — it cannot surgically fix specific elements. Iteration typically degrades quality or introduces new errors. Testing with both direct Gemini calls and multi-agent frameworks (PaperBanana) confirmed this: iteration 1 consistently outperforms iteration 2+.
Instead, invest effort in building a detailed, spatially-structured prompt. The first generation with a good prompt is almost always the best result.
Combine with: submission-prep for final figure specifications, manuscript-writing for figure references in text.
Gemini excels at conceptual and architectural diagrams — workflow figures, system architecture, taxonomy charts, process flows. These consistently come out publication-quality on the first try with detailed prompts.
Do NOT use Gemini for:
For multi-panel figures with mixed content (e.g., real tool output alongside diagrams), generate or capture each panel separately and compose them with Python Pillow. See "Composing Multi-Panel Figures" below.
Create a new figure from a text description.
python .claude/skills/scientific-figures/scripts/generate_figure.py \
"workflow diagram showing sample collection through bioinformatic analysis pipeline" \
--style scientific \
--output figures/fig1_workflow.png
Steps:
Run generate_figure.py with --style scientific for publication-quality defaults
If the result has issues:
Avoid --input-image editing, which is unreliable
Evaluate existing figures against publication standards.
Steps:
Locate figures matching figures/*.png, figures/*.jpg, fig*.png
Check that figures match manuscript content.
Steps:
The generate_figure.py script in scripts/ handles Gemini API calls:
# New figure with scientific style
python .claude/skills/scientific-figures/scripts/generate_figure.py \
"bar chart comparing gene expression across three conditions" \
--style scientific \
--output figures/fig2.png
# High-resolution for print
python .claude/skills/scientific-figures/scripts/generate_figure.py \
"phylogenetic tree of opsin gene family" \
--style scientific \
--size 2k \
--output figures/fig3.png
# Validate API key
python .claude/skills/scientific-figures/scripts/generate_figure.py --validate
Flags:
--style scientific: Prepends publication-quality instructions (white background, clean lines, sans-serif labels, colorblind-safe colors)
--input-image: Provide an existing image for multi-turn editing. Use sparingly, only for minor color/style tweaks. Never for text changes.
--size 1k|2k|4k: Advisory target resolution (actual output resolution is model-controlled, typically ~1408x768)
--output: Output file path (default: figure_TIMESTAMP.png)
The script outputs JSON metadata to stderr with model used, prompt, timing, and success status.
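When the script is driven from Python rather than the shell, the invocations above can be assembled programmatically. The sketch below is a minimal illustration, not part of the skill: `build_command`, `parse_metadata`, and `generate` are hypothetical helper names, and it assumes the JSON metadata is the only content written to stderr (it returns `None` for the metadata otherwise). It uses only the flags documented above.

```python
import json
import subprocess
import sys

# Path taken from the examples above; adjust if the skill lives elsewhere.
SCRIPT = ".claude/skills/scientific-figures/scripts/generate_figure.py"


def build_command(prompt, output, style="scientific", size=None):
    """Assemble the argv list for generate_figure.py (hypothetical helper)."""
    cmd = [sys.executable, SCRIPT, prompt, "--style", style]
    if size is not None:
        if size not in ("1k", "2k", "4k"):
            raise ValueError(f"unsupported size: {size}")
        cmd += ["--size", size]
    cmd += ["--output", output]
    return cmd


def parse_metadata(stderr_text):
    """Parse JSON metadata from stderr; return None if it is not pure JSON."""
    try:
        return json.loads(stderr_text)
    except json.JSONDecodeError:
        return None


def generate(prompt, output, **kwargs):
    """Run the script once and return (return code, metadata or None)."""
    result = subprocess.run(
        build_command(prompt, output, **kwargs),
        capture_output=True,
        text=True,
    )
    return result.returncode, parse_metadata(result.stderr)
```

Calling `generate` once per figure matches the single-pass approach recommended here: improve the prompt and run again from scratch rather than chaining edits.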
Important notes:
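The output file extension may change to match the image format Gemini returns (e.g., fig1.png becomes fig1.jpg if Gemini returns JPEG).
Output resolution is not guaranteed by the --size flag. The flag adds resolution hints to the prompt but Gemini may ignore them. Typical output is ~1408x768 at 300 DPI.
Automated critique-and-revise loops sound appealing but fail in practice: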
The --input-image editing flag is retained for rare cases where a minor color or style tweak is needed, but it should not be part of a standard workflow.
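The notes on output format and resolution can be checked mechanically after a run. The sketch below is an illustration, not part of the skill: `resolve_output` (a hypothetical helper) looks for the file that was actually written when its extension differs from the one requested, and `print_size_inches` converts pixel dimensions to physical size at a given DPI. The list of candidate extensions is an assumption.

```python
from pathlib import Path


def resolve_output(requested, extensions=(".png", ".jpg", ".jpeg", ".webp")):
    """Return the file actually written for a requested path, or None."""
    requested = Path(requested)
    if requested.exists():
        return requested
    # Same stem, different extension (e.g. fig1.png requested, fig1.jpg written).
    for ext in extensions:
        candidate = requested.with_suffix(ext)
        if candidate.exists():
            return candidate
    return None


def print_size_inches(width_px, height_px, dpi=300):
    """Convert pixel dimensions to physical (width, height) in inches."""
    return width_px / dpi, height_px / dpi
```

For the typical 1408x768 output, `print_size_inches(1408, 768)` gives roughly 4.7 by 2.6 inches at 300 DPI, which is worth comparing against the target journal's figure width requirements.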
When reviewing figures (either generated or existing), check:
See REVIEW_CRITERIA.md for the full scored checklist.
Invest in prompt quality, not iteration. A detailed, well-structured prompt consistently produces better results than a vague prompt followed by multiple edit passes. The first generation with a good prompt is almost always the final figure.
The most effective pattern is to describe the figure section-by-section with explicit spatial layout:
A clean scientific diagram with three sections flowing left to right:
LEFT SECTION - 'Input': [detailed description of what appears here]
CENTER SECTION - 'Processing': [detailed description with specific labels,
field names, data values to include]
RIGHT SECTION - 'Output': [detailed description of output elements]
Use a white background, clean lines, sans-serif font. Blue accent color
for arrows and highlights. Publication quality.
This spatial structuring works for:
Good: "A diagram with three connected sections flowing left to right: LEFT SECTION - 'Plot Creation': Show a terminal window with the command 'gg(data).aes({x: "gene"}).render()' and a small plot below. CENTER SECTION - 'Automatic Persistence': Show a JSON document labeled 'Plot Specification' with fields: _provenance (id, timestamp, dataFile), spec (data, aes, geoms, scales). RIGHT SECTION - 'Search and Retrieval': Show three paths: browse by date, search by type, re-render at different dimensions. White background, blue accents, sans-serif labels."
Poor: "Make a figure for my methods section."
See PROMPT_TEMPLATES.md for reusable templates by figure type.
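The section-by-section pattern can also be assembled in code, which keeps prompts consistent across figures. This is a minimal sketch: `build_spatial_prompt` is a hypothetical helper, not part of the skill, and the default style sentence is copied from the template above.

```python
DEFAULT_STYLE = (
    "Use a white background, clean lines, sans-serif font. "
    "Blue accent color for arrows and highlights. Publication quality."
)


def build_spatial_prompt(sections, flow="left to right", style_notes=None):
    """Build a prompt from (position, title, description) tuples."""
    if not sections:
        raise ValueError("at least one section is required")
    words = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five"}
    count = words.get(len(sections), str(len(sections)))
    noun = "section" if len(sections) == 1 else "sections"
    lines = [f"A clean scientific diagram with {count} {noun} flowing {flow}:"]
    for position, title, description in sections:
        # One line per section, with an explicit spatial position and a label.
        lines.append(f"{position.upper()} SECTION - '{title}': {description}")
    lines.append(style_notes or DEFAULT_STYLE)
    return "\n".join(lines)


prompt = build_spatial_prompt([
    ("left", "Input", "Show raw sequencing reads as stacked horizontal bars."),
    ("center", "Processing", "Show an alignment step labeled 'Align'."),
    ("right", "Output", "Show a small table labeled 'Variants'."),
])
print(prompt)
```

The descriptions still need to carry the specific labels, field names, and values; the helper only enforces the spatial structure.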
When you need a composite figure from separate images (e.g., multiple screenshots, or a mix of Gemini-generated and real output), use Python Pillow:
from PIL import Image

# Load panels
panels = [Image.open(f) for f in ["panel_a.png", "panel_b.png", "panel_c.png"]]

# Normalize to the same height, preserving each panel's aspect ratio
target_h = 1000
resized = []
for img in panels:
    ratio = target_h / img.height
    resized.append(img.resize((int(img.width * ratio), target_h), Image.LANCZOS))

# Compose side by side, with padding between panels
pad = 30
total_w = sum(p.width for p in resized) + pad * (len(resized) - 1)
canvas = Image.new("RGB", (total_w, target_h), (0, 0, 0))  # background color
x = 0
for p in resized:
    canvas.paste(p, (x, 0))
    x += p.width + pad

# Save with 300 DPI metadata for print
canvas.save("composite.png", dpi=(300, 300))
Background color tips:
The generation script requires a Google API key and the google-genai package:
pip install google-genai (not in the main requirements.txt since it's only needed for figure generation)
Set the GOOGLE_API_KEY environment variable, or add GOOGLE_API_KEY=your_key to a .env file in the project directory
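To confirm that a key is discoverable before running the script, the two lookup locations can be checked with a short script. How generate_figure.py itself loads the key is not shown here, so this is only a sketch under the assumption that the environment variable takes precedence over the .env file; `load_api_key` is a hypothetical helper.

```python
import os
from pathlib import Path


def load_api_key(env=None, dotenv_path=".env", name="GOOGLE_API_KEY"):
    """Return the key from the environment, else from a .env file, else None."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return value
    path = Path(dotenv_path)
    if not path.is_file():
        return None
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not KEY=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            return val.strip().strip("'\"") or None
    return None
```

The script's own --validate flag remains the authoritative check that the key actually works against the API.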