Briefs + Gemini-rendered images using workspace brand-images/ (or BRAND_IMAGES_DIR) for QuizFactor/product logo & palette, merged with post copy and marketer/Drive context; copies to post-image.png / article-hero.png for HypeEngine.
Deliver designer-ready image specs: not a single vague prompt, but placement, ratio, prompt + negative, text overlay rules, and A/B variants—equivalent detail to how qf-course-researcher specifies Notion property types (Select vs rich_text) so downstream work does not degrade.
If OpenClaw already has Gemini (API key in ~/.openclaw/.env / ~/.config/gemini/api_key and/or host image_generate / equivalent tool), treat pixel generation as the default outcome, not optional:
1. Write the brief files (concept.md, prompt-master.txt, etc.) as today.
2. Call the image tool (image_generate if exposed; else generateContent with an image-capable model per Gemini image docs) and save a real file (e.g. generated-1.png) into the same draft folder.
3. Log the run in gemini-render.md.

Only skip the render step if the human explicitly asked for briefs only, the tool/key is missing after checking, or the API refuses (document in gemini-render.md).
When social-media-manager (or a writer) asks for a slot image, use this workspace contract so bundles and HypeEngine always know where to look:
| Context | Where to save the primary raster | Notes |
|---|---|---|
| Social campaign slot | workspace/drafts/social/<campaign>/posts/<post-id>/post-image.png (or .jpg) | Still write full brief + generated-* under workspace/drafts/images/<date>-<slug>/, then copy or symlink the chosen asset to post-image.png. Add posts/<post-id>/image-alt.txt (one line, ≤125 chars) for accessibility / HypeEngine. |
| LinkedIn long-form article | workspace/drafts/linkedin/<date>-<slug>/article-hero.png (or .jpg) | Hero for the article body + intern handoff; reference it in article.md (e.g. directly after the title) and in README-handoff.md. |
| Teaser row (same article) | Reuse article-hero.png for the feed bundle when the calendar row points at the LinkedIn folder | Generate teaser-image.png in that folder instead if the teaser needs a distinct crop. |
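
The path contract in the table above can be sketched as a small helper. This is an illustrative sketch only: the context labels ("social", "linkedin-article") and the function names are hypothetical, not part of the skill; the paths and the 125-character alt limit come from the table.

```python
from pathlib import PurePosixPath

MAX_ALT_CHARS = 125  # limit stated for image-alt.txt in the table above


def primary_raster_path(context: str, folder: str, post_id: str = "", ext: str = "png") -> PurePosixPath:
    """Return the contract path for the primary raster.

    context: "social" or "linkedin-article" (hypothetical labels).
    folder:  campaign name for social slots, "<date>-<slug>" for LinkedIn.
    """
    drafts = PurePosixPath("workspace/drafts")
    if context == "social":
        if not post_id:
            raise ValueError("social slots need a post id")
        return drafts / "social" / folder / "posts" / post_id / f"post-image.{ext}"
    if context == "linkedin-article":
        return drafts / "linkedin" / folder / f"article-hero.{ext}"
    raise ValueError(f"unknown context: {context}")


def validate_alt_text(text: str) -> str:
    """Return the trimmed alt text, or raise if it breaks the one-line / length rule."""
    alt = text.strip()
    if not alt:
        raise ValueError("image-alt.txt is empty")
    if "\n" in alt:
        raise ValueError("image-alt.txt must be a single line")
    if len(alt) > MAX_ALT_CHARS:
        raise ValueError(f"alt text is {len(alt)} chars (max {MAX_ALT_CHARS})")
    return alt
```

A caller would resolve the path first, copy the chosen generated-* asset there, then write the validated alt text next to it.
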
Inputs for the image run: first line / hook from post-body.md or article thesis + title; platform from calendar.md (map to aspect ratio: X/LinkedIn feed often 1.91:1 or 1:1 per social-content-planning matrix).
Write post → generate image → attach → HypeEngine → schedule/publish.
1. Write: post-body.md / teaser.md / article.md exists first (copy is the source of truth for what to illustrate).
2. Generate: post-image.png / article-hero.png + image-alt.txt.
3. Attach: social-media-manager puts paths + alt into post-bundle.md; on publish, hype-engine uploads the file (Media API) and sets content[].media so the post is not text-only.
4. Schedule: /posts with date + time from APPROVAL.md; the product schedules and publishes at that time, with no extra “publish now” step (see hype-engine).

Brand kit (brand-images/): QuizFactor logo & product look
Generated images must match the product and use your real brand, not a model-invented mark.
Resolve the kit location in this order:
1. BRAND_IMAGES_DIR, if set (e.g. /home/admin/.openclaw/workspace/brand-images on the OpenClaw server).
2. Otherwise workspace/brand-images/ relative to this workspace root (same filenames as on server).
3. If neither exists, write gemini-render.md with BRAND_KIT_MISSING and ask the human to add assets (see workspace/brand-images/README.md).

On every run: list_dir / read palette.md, logo-usage.md, product-context.md (if present), and note available rasters (logo-primary.png, logo-mark.png, …).
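
The lookup described above (BRAND_IMAGES_DIR first, then the workspace default, then BRAND_KIT_MISSING) might look like the following minimal sketch. The function names are hypothetical. It assumes `workspace_dir` is the `workspace/` folder itself and that an unset or non-existent BRAND_IMAGES_DIR falls through to the default; the skill does not spell out that second case.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_brand_kit(workspace_dir: Path, env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the brand kit directory, or None when no kit can be found."""
    env = os.environ if env is None else env
    override = env.get("BRAND_IMAGES_DIR")
    # Assumption: an override that does not point at a directory is ignored.
    if override and Path(override).is_dir():
        return Path(override)
    default = workspace_dir / "brand-images"
    if default.is_dir():
        return default
    return None


def brand_kit_status(workspace_dir: Path, env: Optional[Mapping[str, str]] = None) -> str:
    """One line suitable for gemini-render.md: the kit contents or BRAND_KIT_MISSING."""
    kit = resolve_brand_kit(workspace_dir, env)
    if kit is None:
        return "BRAND_KIT_MISSING"
    names = sorted(p.name for p in kit.iterdir() if p.is_file())
    return f"brand kit: {kit} ({', '.join(names) or 'empty'})"
```

Passing `env` explicitly keeps the helper easy to exercise without touching the real environment.
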
Common failure: The model writes “QuizFactor” (or your product name) in a plain / display sans-serif in the top-left and treats it as the logo. That is not acceptable when logo-primary.png or logo-mark.png exists—only the official raster (or a composite of that file) counts. Download LinkedIn / profile marks once into those filenames; do not expect a URL or text prompt to substitute.
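
Since only the official raster counts, a quick preflight can confirm that a logo file at least declares an alpha channel before it is used. The sketch below reads the PNG header with the standard library only; the function name is hypothetical. It checks the declared color type (6 = RGBA in the PNG format) and does not inspect pixels, so a file with a baked-in opaque panel would still pass and still needs the visual checkerboard check.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
RGBA_COLOR_TYPE = 6  # truecolor with alpha, per the PNG format


def png_is_rgba(data: bytes) -> bool:
    """Return True when the PNG header declares an RGBA image.

    Raises ValueError for data that is not a PNG or lacks a leading IHDR chunk.
    """
    # 8 signature + 4 length + 4 type + 13 IHDR data + 4 CRC = 33 bytes minimum.
    if len(data) < 33 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("PNG is missing a valid IHDR chunk")
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return color_type == RGBA_COLOR_TYPE
```

Typical use would be `png_is_rgba(Path("logo-mark.png").read_bytes())`, logging a note in gemini-render.md when it returns False.
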
| Method | When to use |
|---|---|
| Post-process composite (default when PNG exists) | Generate scene with reserved empty top-left (no brand lettering—see step 4 under Logo placement), then overlay logo-mark.png / logo-primary.png with ImageMagick, sharp, etc. Log COMPOSITED_LOGO in gemini-render.md. |
| Reference image (single-pass) | Pass logo-*.png as multimodal inline_data. Instruction: “Only the attached image is the mark—top-left per logo-usage.md; do not redraw; do not add typographic wordmark.” |
| Text-only fallback | Only if no logo-*.png exists and logo-usage.md explicitly allows emergency typeset—otherwise stop or add PNGs first. |
| Never | No logo-from-memory; no wrong icon set; no “logo” that is only styled text of the company name when PNG assets exist; no media.licdn.com URLs in the API—save as logo-mark.png locally. |
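
For the post-process composite row, the overlay rectangle can be derived from the placement ranges this skill gives for the logo (inset ~2.5–4% of canvas, width ~8–14% for the primary logo or ~6–10% for the mark). The sketch below is an assumption-laden illustration: names are hypothetical, the inset is applied to canvas width for the left edge and canvas height for the top edge, defaults sit at the middle of each range, and the approximate ranges are enforced strictly.

```python
from dataclasses import dataclass
from typing import Optional

INSET_RANGE = (0.025, 0.04)
WIDTH_RANGES = {"primary": (0.08, 0.14), "mark": (0.06, 0.10)}


@dataclass(frozen=True)
class LogoBox:
    left: int
    top: int
    width: int
    height: int


def logo_box(canvas_w: int, canvas_h: int, logo_w: int, logo_h: int,
             kind: str = "primary", inset: float = 0.03,
             width_frac: Optional[float] = None) -> LogoBox:
    """Compute the top-left logo rectangle in pixels, preserving aspect ratio."""
    low, high = WIDTH_RANGES[kind]
    if width_frac is None:
        width_frac = (low + high) / 2  # mid-range default (assumption)
    if not low <= width_frac <= high:
        raise ValueError(f"width_frac {width_frac} outside {low}-{high} for {kind}")
    if not INSET_RANGE[0] <= inset <= INSET_RANGE[1]:
        raise ValueError(f"inset {inset} outside {INSET_RANGE[0]}-{INSET_RANGE[1]}")
    width = round(canvas_w * width_frac)
    height = round(width * logo_h / logo_w)  # scale height with the width
    return LogoBox(left=round(canvas_w * inset), top=round(canvas_h * inset),
                   width=width, height=height)
```

The resulting box is what an ImageMagick or sharp overlay step would consume, and the same numbers can be recorded in safe-zones.md.
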
When the team wants the real mark visible on every generated image (typical for LinkedIn/X):
1. Load logo-primary.png or logo-mark.png from brand-images/ (or BRAND_IMAGES_DIR). Files must be true RGBA PNGs: alpha channel outside the mark, no baked-in square panel of “lighter navy” (that is a bad export, not the hero). Do not paste remote URLs into prompts; download, re-export with transparency if you see a box in previews, then save. Preflight: open the PNG on a checkerboard background; if a solid rectangle remains, fix the asset before any composite.
2. Placement (per logo-usage.md): top-left, inset ~2.5–4% of canvas from top and left; logo width ~8–14% of full image width (primary) or ~6–10% (mark). Keep clear space; the main illustration (hand/phone/background figure) must not overlap the logo box.
3. Multimodal reference: alongside prompt-master.txt, pass the logo PNG as reference. Instruction: “Attached image is the only authorized mark; place in top-left; preserve alpha / transparency; do not redraw; do not put a square panel, matte, or second background behind it; do not add typographic wordmark.”
4. Post-process composite: in prompt-master.txt, include “Top-left: empty reserved band (same navy as background), no brand text, no fake logo lettering, no contrasting square behind where a logo will sit.” Overlay the mark with alpha-respecting compositing (e.g. ImageMagick compose over / sharp composite with input alpha, not flattening to JPEG first). Use this when multimodal still returns text-as-logo or paints a logo matte box.
5. safe-zones.md: record the logo lock rectangle (inset, max width) for composite math and LinkedIn crops.
6. Fix-ups: if the render shows the product name typeset as the logo, composite the real mark before copying to post-image.png / article-hero.png; log FIXED_TEXT_AS_LOGO. If a visible rectangle or wrong-tone slab sits behind the mark (model-added or opaque pixels in logo-*.png), fix the source PNG alpha and re-composite; log FIXED_LOGO_MATTE.

Build the final generation brief in this order:
1. Copy: post-body.md (or article title + TL;DR). This is what the illustration is about.
2. Product context: follow 00-intake.md to marketer-agent outputs (00-brief.md, 02-positioning.md) or USER.md, the same source as the content creator. Google Drive (when used) is the long-form product source for marketer-agent / writers, not a substitute for renders. prompt-master.txt always layers three things: Style benchmark (canonical block in this skill), brand-images/ (logo + palette), and copy-derived subject. Drive-backed docs only add extra product truth for concept.md / positioning lines, not “brand instead of benchmark.”
3. Brand kit from brand-images/: colors, logo rules, placement.
4. visual-dna.md (campaign) for illustration style across the batch.
5. Style benchmark: include it in prompt-master.txt on every run unless visual-dna.md explicitly replaces the illustration mode for that campaign.

First post of a campaign: identical pipeline. If marketer-agent has not run yet, use post copy + product-context.md + brand-images/ + Style benchmark; after the marketing pack exists, prefer 00-brief.md excerpts in concept.md.
If you keep image-generation.ts (or similar) under brand-images/ on the server, the skill still treats this folder as the asset source; wire execution through OpenClaw tools if you add a wrapper—the markdown workflow does not depend on TypeScript.
Default illustration system for social slots and LinkedIn article heroes: fixed composition + vector grammar below, merged with post-body.md / article.md so pixels match copy—not generic stock. This block is not optional when using the default pipeline: you use both the benchmark (hand + phone foreground; dynamic learner + device midground; floating icons + vector look) and brand-images/—Drive-only marketing docs never replace them.
Hard requirement for OpenClaw / any agent: Updating this SKILL.md does nothing to the image API by itself. Whoever calls image_generate / Gemini must read this skill and paste the STYLE LOCK + Canonical visual system into the actual prompt-master.txt (or equivalent single string). Improvising “IT certification hero art” from post copy alone will drift to sci‑fi/anime/holographic stock—that is a failure mode, not an acceptable shortcut.
The first paragraph of every prompt-master.txt (before the quoted canonical block) must be a short imperative block, verbatim in spirit—do not skip because the topic is “technical” or “IT certs”:
STYLE LOCK — NON-NEGOTIABLE: Flat 2D vector illustration only (clean corporate editorial / app-marketing vector), like Figma or Illustrator flats — NOT anime, NOT manga, NOT cinematic digital painting, NOT semi-realistic character art, NOT 3D render. NO holographic or floating sci-fi UI, NO cyberpunk, NO server room or data center, NO futuristic city skyline, NO cyan/teal neon glow as the dominant look, NO “tech command center” or tactical jumpsuit characters. NO large hero faces in painterly style. REQUIRED COMPOSITION: **hand holding smartphone** in foreground with simple quiz/lesson UI. **Mid/background:** a **second figure actively using learning tech**—**dynamic pose** (standing, seated at desk, walking with phone, subtle “yes!” / progress gesture, café perch, **or** relaxed lounge—**vary across posts; do NOT default to “always on a sofa”**); same quiz/lesson theme on **phone or tablet** in that figure’s hands. Small floating minimalist line icons (books, brain-in-lightbulb, checkmark, graduation cap) plus 1–2 topic icons from the post copy. REQUIRED PALETTE unless brand kit overrides: deep navy background, bright yellow accents, white linework. The ONLY “tech” is the phone/tablet screens — flat UI mockups, not glowing Blade Runner panels. **BRAND CORNER:** Do **not** draw the product name as typographic “logo” text in the top-left—**leave a clean navy reserve** for the **real `logo-mark.png` / `logo-primary.png`** (post-composite) or use the **attached logo image** only; **no fake wordmark lettering**; **no** lighter **square**, **plate**, or **panel** behind the mark—the logo must float on **transparent** pixels or true hero background only.
Then immediately follow with the Canonical visual system quote below (tokens permitting). If the API has a length limit, shorten the quote last—never delete the STYLE LOCK paragraph.
Use this block as the stable backbone of every prompt-master.txt (trim for token limits only; keep structure; never trim the STYLE LOCK paragraph):
Professional flat vector illustration for an educational quiz / learning product. Composition: A hand holding a smartphone in the foreground; the phone screen shows a quiz or lesson UI with multiple-choice-style buttons or clear lesson steps (not illegible micro-text). In the mid/background, show a learner in a dynamic, engaging pose (e.g. standing with tablet, seated at a simple desk with laptop/tablet, quick celebratory fist-up while glancing at phone, on-the-go with device—rotate the scenario; avoid using “person reclining on sofa” every time). Their screen echoes the same quiz/lesson theme. Floating around the scene, cute minimalist icons (same line weight): open books, lightbulbs with simple brain shapes, checkmarks, and a graduation cap—plus one or two extra icons tied to the post/article topic (see Content from copy below). Default color scheme: deep navy blue background, bright yellow accents, white linework and key highlights. Quality: clean, modern, corporate educational aesthetic, high resolution, 2D vector art (not photorealistic), generous negative space, crisp edges.
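
The ordering and trimming rule (STYLE LOCK first and never trimmed, canonical block trimmed only for length) could be sketched as below. This is one possible reading, not the skill's defined algorithm: the function name is hypothetical, a character budget stands in for the API's real token limit, and the canonical block is cut at whole-sentence boundaries from the end.

```python
import re


def assemble_prompt(style_lock: str, canonical: str, max_chars: int) -> str:
    """Join STYLE LOCK + canonical block, trimming only the canonical block."""
    sep = "\n\n"
    style_lock = style_lock.strip()
    if len(style_lock) > max_chars:
        # The STYLE LOCK paragraph must never be shortened, so fail loudly.
        raise ValueError("STYLE LOCK alone exceeds the limit")
    budget = max_chars - len(style_lock) - len(sep)
    sentences = re.split(r"(?<=[.!?])\s+", canonical.strip())
    kept = []
    for sentence in sentences:
        candidate = " ".join(kept + [sentence])
        if len(candidate) > budget:
            break  # drop this sentence and everything after it
        kept.append(sentence)
    if not kept:
        return style_lock
    return style_lock + sep + " ".join(kept)
```

Cutting from the end keeps the composition sentences, which come first in the canonical block, for as long as possible.
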
Goal: Energy and variety without breaking the flat vector + navy/yellow/white guide or drifting into anime/sci‑fi.
Vary the pose so that calendar.md rows do not reuse the same couch scene; note the chosen variant in concept.md in one line (e.g. Background: standing + tablet).

The benchmark is not a generic scene. Derive scene meaning from the source document:
| Source | Minimum extraction |
|---|---|
| Social post-body.md | First hook line + must-win message + any named topic, course, or outcome in the ## Publish-ready block (or equivalent). |
| LinkedIn article.md | Title + TL;DR or lede + first H2 theme (or strongest concrete example in the opening sections). |
On-screen text (phone / tablet): Where the model can render short UI text, use a paraphrase of the hook or thesis (≤8 words per line, ≤3 lines on the phone; no lorem ipsum). Examples: “Cut study time in half”, “CompTIA Security+ in 20 min/day”—tuned to this post. If the API cannot place text reliably, state in text-overlay.md and keep the visual metaphor (icons, scene) tightly aligned with the same nouns.
Floating icons: Keep the benchmark set (books, brain-lightbulb, checkmark, cap). Add or emphasize 1–2 motifs that mirror the copy (e.g. shield for security, cloud for cloud certs, chart for analytics).
Anti-generic check: If the image could apply to any edtech post without reading the copy, the prompt is too weak—inject specific nouns from the post into prompt-master.txt.
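
The on-screen text limits above (≤8 words per line, ≤3 lines, no lorem ipsum) are easy to check before the prompt is sent. A minimal sketch with a hypothetical function name:

```python
from typing import List

MAX_LINES = 3
MAX_WORDS_PER_LINE = 8


def screen_text_problems(lines: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means the text fits."""
    problems = []
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines (max {MAX_LINES})")
    for number, line in enumerate(lines, start=1):
        words = line.split()
        if len(words) > MAX_WORDS_PER_LINE:
            problems.append(f"line {number} has {len(words)} words (max {MAX_WORDS_PER_LINE})")
        if "lorem ipsum" in line.lower():
            problems.append(f"line {number} contains placeholder text")
    return problems
```

Any reported problem is a cue to shorten the paraphrase, or to move the wording into text-overlay.md as post-edit guidance.
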
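If brand-images/ (or BRAND_IMAGES_DIR) has palette.md / logo-usage.md, brand colors override the benchmark default palette when they conflict; keep the vector + white-linework read unless brand docs forbid it.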
Always merge these into negative-prompt.txt (or the same user prompt string for APIs without a separate negative field) for default QuizFactor social/article runs—especially when post copy mentions IT, AWS, security, coding, certifications, which otherwise pull holographic/sci‑fi defaults:
anime, manga, manhwa, cel shading, visual novel, cyberpunk, holographic interface, HUD, sci-fi control room, mission control room, server rack, data center, tactical suit, jumpsuit, futuristic city, skyline at night, neon cyan, electric blue glow, cinematic lighting, octane render, Unreal Engine, highly detailed face, semi-realistic portrait, digital painting, concept art, tech test evaluation, matrix code rain, Matrix-style
Also add when logo-*.png exists: typographic wordmark as logo, brand name spelled out as fake logo, styled company name text in corner, invented logotype lettering—the real mark must come from file + composite or multimodal attachment, not AI type. Add: solid square behind logo, logo on a panel or tile, lighter rectangle badge behind brand, opaque matte behind mark (force transparent mark on hero only).
Also as needed: photorealistic skin, wrong logo, unreadable tiny paragraphs, clutter competing with the phone hero, celebrity likeness, watermarks, generic unrelated quiz topic (if copy is about X, ban visual story about unrelated Y).
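
Merging the three negative lists above into one string can be sketched as follows. The function names are hypothetical; the logo-specific terms are added only when a logo PNG exists, as described above, and duplicates are dropped case-insensitively. The "Avoid:" prefix used when folding negatives into a single prompt is an assumption, not wording the skill prescribes.

```python
def merge_negatives(*blocks: str) -> str:
    """Merge comma-separated negative lists, keeping first-seen order."""
    seen = set()
    merged = []
    for block in blocks:
        for term in block.split(","):
            cleaned = " ".join(term.split())  # collapse stray whitespace
            key = cleaned.lower()
            if cleaned and key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return ", ".join(merged)


def build_negative_prompt(benchmark: str, logo_negatives: str,
                          extras: str = "", logo_png_exists: bool = False) -> str:
    """Benchmark negatives always; logo negatives only when logo-*.png exists."""
    blocks = [benchmark]
    if logo_png_exists:
        blocks.append(logo_negatives)
    blocks.append(extras)
    return merge_negatives(*blocks)


def fold_into_prompt(prompt: str, negatives: str) -> str:
    """For APIs without a negative field: append negatives to the prompt text."""
    if not negatives:
        return prompt
    return f"{prompt}\n\nAvoid: {negatives}"
```
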
After saving generated-1.png, glance at the output (or read a vision description). If you see any of: anime face, holographic/blue sci-fi UI as the main subject, server room, cyberpunk city, painterly detail skin, or missing the foreground hand + smartphone with quiz UI and a secondary learner + device in the scene → regenerate once with the STYLE LOCK repeated twice (top + bottom of prompt) and negative-prompt.txt doubled; note REGENERATED_STYLE_DRIFT in gemini-render.md. Do not ship cyberpunk/holographic art as QuizFactor branded flats. Sofa/lounge is optional, not required for QA pass. Logo QA: If logo-*.png exists but the corner shows only typeset product name (no true mark), do not ship—apply COMPOSITED_LOGO or regenerate with reserved blank corner; log FIXED_TEXT_AS_LOGO. If the mark has a visible box or wrong slab, re-export logo PNG with alpha and re-composite; log FIXED_LOGO_MATTE.
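
The QA rule above (regenerate once, STYLE LOCK at top and bottom, negatives repeated) might be expressed like this. It is a sketch under assumptions: the function names and the DO_NOT_SHIP label are hypothetical, "doubled" is read as repeating the negative text twice, and the "Avoid:" prefix is illustrative. PASS is likewise just a label for a clean result.

```python
from typing import List


def qa_decision(drift_findings: List[str], already_regenerated: bool) -> str:
    """Map visual QA findings to the next action."""
    if not drift_findings:
        return "PASS"
    if already_regenerated:
        # The skill allows one regenerate; what follows is left to a human.
        return "DO_NOT_SHIP"
    return "REGENERATED_STYLE_DRIFT"  # marker to note in gemini-render.md


def regeneration_prompt(style_lock: str, body: str, negatives: str) -> str:
    """STYLE LOCK at the top and bottom, negatives stated twice."""
    avoid = f"Avoid: {negatives}"
    return "\n\n".join([style_lock, body, avoid, avoid, style_lock])
```

A run that returns REGENERATED_STYLE_DRIFT would send `regeneration_prompt(...)` once and then repeat the visual check with `already_regenerated=True`.
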
Every run:
1. Put the STYLE LOCK paragraph first in prompt-master.txt, then the canonical visual system block.
2. Customize the subject, on-screen hints, and topic icons from post-body.md / article.md.
3. Merge the brand kit (palette.md, logo-usage.md) so logo and colors stay compliant; brand colors override the benchmark default palette when they conflict; keep the vector + white-linework read unless brand docs forbid it.
4. Copy the chosen raster to post-image.png / article-hero.png.

If a campaign visual-dna.md explicitly defines a different illustration mode, follow visual-dna.md for that campaign, but still tie subject matter to the same copy rules above.
Generic “nice illustration” prompts drift. For every social-slot run:
- Read the ## Publish-ready block (or article title + TL;DR + first H2 theme). The image must reflect specific nouns, metaphors, or outcomes in that copy, not an unrelated category stock scene. Use Style benchmark above for composition and vector grammar while injecting those specifics into on-device UI hints and scene metaphors (Content from copy).
- If workspace/drafts/social/<campaign>/visual-dna.md exists, append its locked lines to every prompt-master.txt. Always merge after the brand kit block from brand-images/ (palette.md / logo-usage.md) so colors and logo rules are not contradicted. If visual-dna.md is missing, derive 3–5 style rules from USER.md, SOUL.md, and marketer-agent 03-messaging-pillars.md / 00-brief.md once per campaign and reuse them for all slots in that folder.
- Keep one illustration mode per calendar.md week unless the human asks otherwise; keep aspectRatio aligned to social-content-planning / calendar row so crops match HypeEngine/LinkedIn/X expectations.
- prompt-master.txt = [style DNA] + [subject tied to post] + [composition] + [lighting] + [what to avoid]. Fold negative-prompt.txt into the same generation call as today.
- image-alt.txt must describe what’s in the image and why it fits the post in one line; if you can’t write that, the image probably doesn’t match, so regenerate with a tighter prompt.
- For later slots, read posts/<earlier-post-id>/image-alt.txt (not necessarily the binary) to keep tone and illustration mode consistent across the batch.

1. Read USER.md, SOUL.md; brand-images/ per Brand kit section (env BRAND_IMAGES_DIR or workspace/brand-images/).
2. Create workspace/drafts/images/<YYYY-MM-DD>-<slug>/
3. Render with image_generate (or other host image tool) first if available; else generateContent with x-goog-api-key (see workspace/INTEGRATIONS.md).
4. Write README-handoff.md.

- Key: GEMINI_API_KEY in ~/.openclaw/.env or ~/.config/gemini/api_key. Send x-goog-api-key (not Bearer) to generativelanguage.googleapis.com (see workspace/INTEGRATIONS.md).
- Native image output (generateContent): Map prompt-master.txt + aspect ratio from brief.json / matrix into generationConfig.imageConfig.aspectRatio ("1:1", "16:9", "9:16", etc.; use values supported by the model you choose). Parse the response for inline image bytes (base64) and write generated-1.png (or .jpg per MIME) under the same draft folder. Model IDs change, so pick the current image-capable model from Gemini image generation docs.
- Imagen (predict): Optional batch-style generation via e.g. imagen-4.0-generate-001:predict with instances[].prompt and parameters.sampleCount if your project uses Imagen instead of native image output.
- Negatives: merge negative-prompt.txt into the user text prompt (Gemini image prompts are text-first); keep text-overlay.md as post-edit guidance if the API cannot place text reliably.

Use case → aspect ratio matrix
Map explicitly:
| Placement | Ratio | Min resolution (guide) |
|---|---|---|
| IG feed | 4:5 or 1:1 | 1080 wide min |
| Stories/Reels cover | 9:16 | 1080×1920 |
| Meta feed ad | 1:1 / 4:5 | per Ads Manager |
| YT thumbnail | 16:9 | 1280×720 min |
| LinkedIn link post | 1.91:1 | 1200×627 typical |
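
The matrix above, combined with the generateContent notes in this skill, can be sketched as a request builder plus a response parser. Several things here are assumptions to verify against workspace/INTEGRATIONS.md and the current Gemini docs: the `/v1beta/models/...` URL shape, the placement keys, the function names, and the "Avoid:" wording. The model ID is left to the caller, and a ratio such as 1.91:1 may not be an accepted aspectRatio value, in which case the nearest supported ratio plus a crop is needed. No network call is made.

```python
import base64
from typing import Dict, Optional, Tuple

GEMINI_HOST = "https://generativelanguage.googleapis.com"

# Ratios from the matrix above; keys are hypothetical labels.
PLACEMENT_RATIOS: Dict[str, Tuple[str, ...]] = {
    "ig-feed": ("4:5", "1:1"),
    "stories-reels-cover": ("9:16",),
    "meta-feed-ad": ("1:1", "4:5"),
    "yt-thumbnail": ("16:9",),
    "linkedin-link-post": ("1.91:1",),
}


def build_generate_content_request(prompt: str, negatives: str, aspect_ratio: str,
                                   model: str, api_key: str) -> Tuple[str, dict, dict]:
    """Return (url, headers, json_body) for a generateContent image call."""
    text = f"{prompt}\n\nAvoid: {negatives}" if negatives else prompt
    url = f"{GEMINI_HOST}/v1beta/models/{model}:generateContent"
    headers = {"x-goog-api-key": api_key, "Content-Type": "application/json"}
    body = {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {"imageConfig": {"aspectRatio": aspect_ratio}},
    }
    return url, headers, body


def extract_first_image(response: dict) -> Optional[Tuple[str, bytes]]:
    """Return (mime_type, raw_bytes) for the first inline image, else None."""
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData") or part.get("inline_data")
            if inline and inline.get("data"):
                mime = inline.get("mimeType") or inline.get("mime_type", "")
                return mime, base64.b64decode(inline["data"])
    return None  # refusal or text-only reply: document it in gemini-render.md
```

The returned bytes would be written to generated-1.png (or .jpg, depending on the MIME type) in the draft folder.
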
Concept sheet (concept.md)
Master prompt (prompt-master.txt)
Reflects the Style benchmark unless visual-dna.md overrides.
Negative prompt (negative-prompt.txt)
watermark, lowres, blurry, extra fingers, competitor logos, fake App Store badge, gore, photorealistic named celebrity (unless rights cleared). Include the benchmark negatives unless visual-dna.md explicitly defines a different illustration mode.
Text-on-image policy (text-overlay.md)
Safe zones (safe-zones.md)
logo-primary / logo-mark per Logo placement: top-left (inset + max width); for 9:16 add top/bottom UI overlay avoidance; for YT thumb the right third is often occluded by the timestamp, so keep face/keyword left.
A/B variants (variants.md)
Columns: Variant | What changed | Hypothesis | Prompt delta summary.
Brand compliance block
README-handoff.md: checklist against USER.md (colors, banned motifs, disclosure if sponsored creative).
Optional brief.json
Fields: aspect_ratio, width, height, prompt, negative_prompt, variants[].
Optional pixel pass (gemini-render.md)
Log the render in gemini-render.md next to the images.
Scheduling
- workspace/drafts/images/<YYYY-MM-DD>-<slug>/concept.md
- prompt-master.txt, negative-prompt.txt, text-overlay.md, safe-zones.md, variants.md, README-handoff.md
- generated-* file exists or failure/refusal documented in gemini-render.md.
- If handing off to adverts-creator, list which variant maps to which ad headline in README.
- STYLE LOCK paragraph is first in prompt-master.txt; full benchmark negatives included; Style benchmark reflected unless visual-dna.md overrides; subject + on-screen hints tied to specific post/article copy; mid/background figure uses a dynamic pose (Background figure: dynamic variants), not sofa-by-default across a batch; Post-render visual QA passed (or one regenerate logged in gemini-render.md).
- visual-dna.md or equivalent style block reused across the campaign batch when present (and still copy-grounded).
- image-alt.txt reads true to the raster and to the post intent.
- brand-images/ loaded (BRAND_IMAGES_DIR or default); logo-*.png are RGBA with real transparency (checkerboard test); final pixel uses COMPOSITED_LOGO (preferred) or verified multimodal, not text-as-logo; no visible matte rectangle behind the mark (fix asset or prompt); no CDN URLs as “logo.”
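
The file-level part of this checklist can be verified mechanically. The sketch below uses a hypothetical function name and only checks that the files exist; it does not read gemini-render.md to confirm that a failure or refusal is actually documented, and the visual and brand items still need a human or vision pass.

```python
from pathlib import Path
from typing import List

REQUIRED_FILES = [
    "concept.md",
    "prompt-master.txt",
    "negative-prompt.txt",
    "text-overlay.md",
    "safe-zones.md",
    "variants.md",
    "README-handoff.md",
]


def missing_handoff_items(draft_dir: Path) -> List[str]:
    """Return checklist items that are absent from the draft folder."""
    missing = [name for name in REQUIRED_FILES if not (draft_dir / name).is_file()]
    has_render = any(draft_dir.glob("generated-*"))
    has_render_log = (draft_dir / "gemini-render.md").is_file()
    if not has_render and not has_render_log:
        missing.append("generated-* or gemini-render.md")
    return missing
```

An empty list means the folder is structurally complete and ready for the remaining manual checks.
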