Skill Profile

PostGen

Name: PostGen
Author: RajatVaghani

Generate social media posts as image carousels or video slideshows for Instagram, TikTok, Instagram Reels, and YouTube Shorts. Supports two output types — static image carousel (PNG slides) and video (MP4). Produces AI-generated background images, styled HTML slides, and final output. Use when the user asks to create a carousel, slideshow, social media post, Instagram post, TikTok, Reel, Short, content slides, or generate visual content for social media. Made by Claw HQ (https://openclawhq.app)

RajatVaghani · 0 stars · April 8, 2026

Occupation
Category: Content Creation

Skill Content

Generate polished social media posts from an idea. Supports two output types across multiple platforms.

Runtime

This skill is 100% Node.js. Every script is a .mjs file that runs with node. Do NOT use Python, curl, wget, or any other language/tool to perform PostGen operations. All image generation, rendering, video creation, file operations, and API calls are handled by the provided Node.js scripts. The only system dependency outside of Node is ffmpeg (used by the scripts internally).

Output Types and Platforms

PostGen generates two types of output. When the user asks for a post, determine which type and platform they need:

Image Posts (carousel / slideshow of static PNGs)

Swipeable carousel of individual slide images. Each slide is a separate PNG file.

Platform	Format name	Aspect Ratio	Resolution
Instagram carousel


node <skill-path>/scripts/<script>.mjs <post-dir>

node <skill-path>/scripts/setup.mjs <workspace-path> --provider <google-genai|openai>

node <skill-path>/scripts/validate-config.mjs <workspace-path>

{
  "brand": {
    "name": "Acme Co",
    "primary_color": "#1e1b4b",
    "secondary_color": "#312e81",
    "accent_color": "#7c3aed",
    "font_family": "Inter",
    "tagline": "Build better, faster",
    "website": "https://acme.co"
  },
  "cta_links": [
    { "title": "Visit Website", "url": "https://acme.co" },
    { "title": "Download App", "url": "https://acme.co/download" },
    { "title": "Contact Us", "url": "https://acme.co/contact" }
  ],
  "cta_video": "assets/cta-outro.mp4"
}
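Before saving brand details like the ones above, it may help to sanity-check them. The following is a minimal sketch, not part of PostGen: `checkBrandConfig` is a hypothetical helper, and which fields count as required is an assumption. `validate-config.mjs` remains the authoritative check.

```javascript
// Hypothetical helper (not one of PostGen's scripts): returns a list of
// human-readable problems found in a postgen.config.json-style object.
const HEX_COLOR = /^#[0-9a-fA-F]{6}$/;

function checkBrandConfig(config) {
  const problems = [];
  const brand = config.brand ?? {};

  // Assumption: a brand name is required.
  if (!brand.name) problems.push("brand.name is missing");

  // The example config uses 6-digit hex colors for all three color fields.
  for (const key of ["primary_color", "secondary_color", "accent_color"]) {
    if (!HEX_COLOR.test(brand[key] ?? "")) {
      problems.push(`brand.${key} should be a 6-digit hex color like #1e1b4b`);
    }
  }

  // Each CTA link in the example has a title and an absolute URL.
  for (const [i, link] of (config.cta_links ?? []).entries()) {
    if (!link.title) problems.push(`cta_links[${i}].title is missing`);
    if (!/^https?:\/\//.test(link.url ?? "")) {
      problems.push(`cta_links[${i}].url should start with http:// or https://`);
    }
  }
  return problems;
}
```

An empty array means no problems were found by this sketch; it says nothing about API keys or system dependencies.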

ls -la <workspace-path>/assets/

node <skill-path>/scripts/next-post-dir.mjs <workspace-path>

{
  "output_type": "image",
  "formats": ["instagram", "tiktok"],
  "slides": [ ... ]
}

{
  "topic": "The main subject of your video",
  "video_provider": "gemini",
  "aspect_ratio": "9:16",
  "mode": "std",
  "template": "bold",
  "voiceover": true,
  "visual_style": "Cinematic commercial look. Warm golden color grade with soft shadows. Smooth slow camera movements, shallow depth of field.",
  "negative_prompt": "blurry, low quality, watermark",
  "reference_images": {
    "enabled": true,
    "subject_description": "A confident South Asian man in his late 20s with short styled dark hair and a trimmed beard, wearing a fitted navy henley shirt.",
    "reference_count": 3
  },
  "scenes": [
    {
      "scene_number": 1,
      "prompt": "A detailed visual description of the scene — include lighting, camera angle, colors, mood. Make it cinematic.",
      "duration": 8,
      "voiceover_text": "What the narrator says during this scene. Natural speech, not bullet points."
    },
    {
      "scene_number": 2,
      "prompt": "Second scene description...",
      "duration": 8,
      "voiceover_text": "Continuation of the narration..."
    }
  ],
  "cta": {
    "title": "Follow @YourBrand",
    "body": "Follow for more content like this!"
  }
}

topic: Brief description of the video's subject
video_provider: "gemini" (Veo 3.1, 8s clips), "kling" (Kling, 10s clips), or "grok" (Grok Imagine Video, 8s clips). Omit to auto-detect from available credentials (tries Gemini → Kling → Grok).
aspect_ratio: "9:16" for TikTok/Reels/Shorts, "16:9" for YouTube, "1:1" for Instagram Feed
mode: "std" (standard quality) or "pro" (higher quality, longer processing — Kling only)
template: Template for the CTA end-card: "bold", "neon", "minimal", "clean", "stack", "magazine", or "caption". Uses the same branded slide renderer as carousels.
voiceover: Enable TTS narration (reads voiceover_text from each scene). Defaults to true — voiceover runs unless you explicitly set "voiceover": false. Omitting this field = voiceover ON.
tts_provider: Do NOT set this field — the pipeline reads tts_provider from postgen.config.json automatically. Only include this field if the user explicitly asks to override the config for this specific post. Valid values: "openai", "elevenlabs", "gemini".
visual_style: Strongly recommended. A description of the consistent visual look applied to ALL scene prompts — color grading, lighting, camera style, and subject appearance (e.g. "Cinematic warm golden color grade, soft shadows. A confident man in his late 20s with short dark hair wearing a grey t-shirt. Smooth slow camera movements, shallow depth of field."). This gets automatically prepended to every scene prompt to ensure visual coherence across clips. Without this, each clip will have different actors, lighting, and styles. See video-content-guide.md for detailed guidance.
negative_prompt: Text to exclude from generation (e.g. "blurry, watermark, low quality")
reference_images: Controls image-guided video generation for dramatically better visual consistency across clips. When enabled, the pipeline generates reference images BEFORE video generation and passes them to the provider as first-frame images and/or character references. Supported by Gemini Veo and Grok Imagine Video. Defaults: If you omit this field entirely with Gemini or Grok provider, reference images are ENABLED by default (the pipeline will use visual_style as the subject description fallback). To explicitly disable, set "reference_images": {"enabled": false}. Fields:
- enabled: Boolean. Defaults to true when provider is "gemini" or "grok". Set false to use text-only mode. Ignored for Kling (Kling does not support reference images). Grok note: The Grok API cannot use first-frame and reference images on the same clip — when a first-frame is available for a scene it takes priority, otherwise character references are used.
- subject_description: Detailed description (aim for 30-60 words) of the main character/subject appearing across scenes. Include: age, gender, ethnicity/skin tone, hair (color, length, style), clothing (specific items and colors), build/body type. This description is used to generate 9:16 reference photos that Veo uses to maintain character identity across clips. If omitted, falls back to visual_style — but a dedicated subject description produces much better character consistency. For product-only videos (no person), describe the product instead: "A sleek matte-black wireless earbud case with rounded edges, silver hinge, and embossed logo. Studio lighting, white background." Example for person: "A confident South Asian man in his late 20s with short styled dark hair and a trimmed beard, wearing a fitted navy henley shirt. Athletic build, warm brown skin tone."
- reference_count: Number of character reference images to generate. Integer, 1-3. Default: 3. More refs = better consistency but more image API calls. Use 1 for fast iteration, 3 for final production.
scenes: Array of scene objects (up to 5 scenes), each with:
- scene_number: Sequential number (1, 2, 3, etc.)
- prompt: Rich, cinematic description of the visual. Include lighting, camera movement, mood, colors, composition. The more detailed, the better the output. NEVER include text, brand names, logos, or readable words in prompts — AI video CANNOT render text accurately. See video-content-guide.md for good/bad prompt examples.
- duration: Clip duration in seconds. Actual duration is fixed per provider (8s for Gemini Veo and Grok, 10s for Kling), so this field is informational and only helps pace the voiceover text.
- voiceover_text: Natural narration for this scene. Match word count to clip duration (~3 words/sec): 20-24 words for Gemini (8s), 25-30 words for Kling (10s). Too few words = speech sounds draggy/slow. Too many = rushed. Brand mentions ARE allowed — voiceover is TTS audio, not AI video.
cta: Call-to-action end-card (branded 5s outro). Fields:
- title: Headline text (e.g. "Follow @YourBrand", "Save This Post")
- body: Supporting text (e.g. "Follow for more content like this!")
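The defaults and limits described in the field list can be sketched as two small functions. This is an illustration under stated assumptions, not the pipeline's actual logic: `resolveVideoSettings` and `checkScenes` are hypothetical names, the provider is assumed to be already resolved (auto-detection is not modeled), and the Grok word range is assumed to match Gemini's because both produce 8s clips.

```javascript
// Sketch only: hypothetical helpers, not part of PostGen's scripts.
const CLIP_SECONDS = { gemini: 8, kling: 10, grok: 8 };

// Word ranges from the voiceover_text guidance (~3 words/sec).
// The grok entry is an assumption; the source lists only Gemini and Kling.
const WORD_RANGE = { gemini: [20, 24], kling: [25, 30], grok: [20, 24] };

function resolveVideoSettings(video, provider) {
  const refs = video.reference_images ?? {};
  const supportsRefs = provider === "gemini" || provider === "grok";
  return {
    clipSeconds: CLIP_SECONDS[provider],
    // Voiceover is on unless explicitly set to false.
    voiceover: video.voiceover !== false,
    // References default to enabled for Gemini/Grok and are ignored for Kling.
    referencesEnabled: supportsRefs && refs.enabled !== false,
    // Falls back to visual_style when no subject description is given.
    subjectDescription: refs.subject_description ?? video.visual_style,
    referenceCount: refs.reference_count ?? 3,
  };
}

function checkScenes(video, provider) {
  const warnings = [];
  const scenes = video.scenes ?? [];
  if (scenes.length > 5) {
    warnings.push(`${scenes.length} scenes given, but the limit is 5`);
  }
  if (video.voiceover === false) return warnings; // no narration to pace
  const [min, max] = WORD_RANGE[provider];
  for (const scene of scenes) {
    const words = (scene.voiceover_text ?? "").trim().split(/\s+/).filter(Boolean).length;
    if (words < min || words > max) {
      warnings.push(`scene ${scene.scene_number}: ${words} words, aim for ${min}-${max}`);
    }
  }
  return warnings;
}
```

Running `checkScenes` on a draft `video.json` before generation is a cheap way to catch narration that would sound rushed or draggy.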

Use reference images when...	Skip reference images when...
Video features a recurring person/character	Video is abstract, landscape, or text-free motion graphics
Visual consistency across clips matters	Fast iteration / testing prompts quickly
Final production-quality output	Budget-constrained (saves 3-8 image API calls)
Subject identity must stay the same (same face, body, clothes)	Video is product-only with no specific subject to track
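To weigh the budget trade-off in the table above, a rough estimate of extra image API calls can be computed from `video.json`. This sketch assumes one API call per generated image (character references plus one first-frame per scene); that accounting is an assumption, and `estimateReferenceImageCalls` is a hypothetical helper.

```javascript
// Hypothetical estimate, assuming one image API call per reference image
// and one per scene first-frame. The real pipeline may batch or skip calls.
function estimateReferenceImageCalls(video, provider) {
  // Kling does not support reference images, so none are generated.
  if (provider === "kling") return 0;

  const refs = video.reference_images ?? {};
  if (refs.enabled === false) return 0;

  const characterRefs = refs.reference_count ?? 3; // default is 3
  const firstFrames = (video.scenes ?? []).length; // one per scene
  return characterRefs + firstFrames;
}
```

Under these assumptions, a two-scene video with the default three character references would need five image calls, and setting `reference_count` to 1 would bring that down to three.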

[Caption text here]

#hashtag1 #hashtag2 #hashtag3 ...
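The caption layout above (caption text, a blank line, then hashtags on one line) could be assembled as follows. `buildCaption` is a hypothetical helper; stripping whitespace from tags and ending the file with a newline are assumptions about reasonable behavior, not documented requirements.

```javascript
// Hypothetical helper: builds the text for caption.txt in the layout shown
// above. Tags may be passed with or without a leading "#".
function buildCaption(text, hashtags) {
  const tags = hashtags
    .map((tag) => "#" + tag.replace(/^#/, "").replace(/\s+/g, ""))
    .join(" ");
  return `${text.trim()}\n\n${tags}\n`;
}
```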

{post-dir}/
  slides.json                         ← content definition
  caption.txt                         ← social media caption + hashtags
  backgrounds/                        ← AI-generated background PNGs
  backgrounds-compressed/             ← ffmpeg-compressed JPGs
  instagram/                          ← per-format output
    html/                             ← rendered HTML slides
    final/                            ← final PNG slides + optional MP4
  tiktok/
    html/
    final/
  voiceover/                          ← TTS audio (if enabled)
    slide-1.mp3, slide-2.mp3, ...
    manifest.json
  ai-video/                           ← AI video clips (if ai_video enabled)
    clip-1.mp4, clip-2.mp4, ...
    manifest.json
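As a quick illustration of the carousel layout, the per-format directories can be derived from the `formats` array in `slides.json`. This is a sketch only: `carouselDirs` is a hypothetical name, and it lists just the directories that every carousel run has in the tree above (the optional `voiceover/` and `ai-video/` folders are left out).

```javascript
// Hypothetical helper: lists the directories a carousel post is expected to
// contain, given the post directory and the formats from slides.json.
function carouselDirs(postDir, formats) {
  const dirs = [`${postDir}/backgrounds`, `${postDir}/backgrounds-compressed`];
  for (const format of formats) {
    // Each format gets its own html/ and final/ subdirectories.
    dirs.push(`${postDir}/${format}/html`, `${postDir}/${format}/final`);
  }
  return dirs;
}
```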

{post-dir}/
  video.json                          ← content definition (scenes, cta, model)
  caption.txt                         ← social media caption + hashtags
  video-references/                   ← reference images for image-guided generation (Gemini / Grok)
    ref-1.png, ref-2.png, ref-3.png  ← character reference photos (shared across clips)
    scene-1.png, scene-2.png, ...    ← first-frame images (one per scene)
    manifest.json                     ← reference image metadata
  ai-video/                           ← AI video clips (Gemini Veo, Kling, or Grok)
    clip-1.mp4, clip-2.mp4, ...
    manifest.json                     ← clip metadata (includes image_guided flag)
  voiceover/                          ← TTS audio per scene
    scene-1.mp3, scene-2.mp3, ...    ← NOTE: "scene-N" not "slide-N"
    manifest.json                     ← segment metadata
  cta/                                ← branded CTA end-card
    frame.png                         ← rendered CTA slide
  final/                              ← composited output (NO format subdir!)
    postgen-video.mp4                 ← ONE file for all platforms
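The video layout differs from the carousel layout in two ways that are easy to get wrong: voiceover files are named `scene-N`, and `final/` has no format subdirectory. The sketch below lists the files one would expect after a complete run with voiceover enabled. `expectedVideoFiles` is a hypothetical helper, and the reference-image files are omitted because they depend on provider and settings.

```javascript
// Hypothetical helper: files expected in a finished AI-video post directory,
// assuming voiceover is enabled. Reference images are not included.
function expectedVideoFiles(postDir, sceneCount) {
  const files = [`${postDir}/video.json`, `${postDir}/caption.txt`];
  for (let n = 1; n <= sceneCount; n++) {
    files.push(`${postDir}/ai-video/clip-${n}.mp4`);
    // Note "scene-N", not "slide-N", in the video flow.
    files.push(`${postDir}/voiceover/scene-${n}.mp3`);
  }
  files.push(
    `${postDir}/ai-video/manifest.json`,
    `${postDir}/voiceover/manifest.json`,
    `${postDir}/cta/frame.png`,
    // One composited file for all platforms, directly under final/.
    `${postDir}/final/postgen-video.mp4`,
  );
  return files;
}
```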

node <skill-path>/scripts/generate-post.mjs <post-dir>

node <skill-path>/scripts/status.mjs <post-dir>

Failure	Cause	Fix
`generate-backgrounds.mjs` fails for all slides	Bad API key or rate limit	Check key with `validate-config.mjs`, wait and retry
`generate-backgrounds.mjs` fails for some slides	Transient API errors	Re-run `generate-backgrounds.mjs` (it retries 3x automatically)
`render-slides.mjs` fails	Playwright/Chromium issue	Run `npx playwright install --with-deps chromium` in workspace
`generate-video.mjs` fails	ffmpeg missing or broken	Install ffmpeg, or use `--skip-video`
`compress-backgrounds.mjs` fails	ffmpeg missing	Install ffmpeg, or use `--skip-compress`
`generate-tts.mjs` fails	Missing TTS credentials	Add openai_api_key, elevenlabs_api_key, or gemini_api_key to config or env
`generate-ai-video.mjs` fails on video.json (Kling)	Kling API error, rate limit, or invalid scene prompt	Check Kling credentials and quota, verify scene prompts are detailed enough, retry
`generate-ai-video.mjs` fails on video.json (Gemini)	Gemini API error, quota exceeded, or safety filter	Check GEMINI_API_KEY, verify prompts don't trigger safety filters, retry
`generate-video-references.mjs` fails	Image API error or rate limit	Check image provider key, wait and retry. Or delete `video-references/` folder and re-run pipeline. Video gen will still work without refs (falls back to text-only mode automatically).
`generate-ai-video.mjs` fails (carousel flow)	Video provider API error or timeout	Check provider credentials, retry, or drop `ai_video` flag
`composite-video.mjs` fails	ffmpeg issue or missing clips	Check ai-video/manifest.json and voiceover/manifest.json exist, ensure ffmpeg works
Pipeline times out	Slow API or network	Increase timeout: `--timeout 900000` (15 min)
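Part of the table above can be turned into a simple lookup, for example to print a suggestion when a pipeline step fails. This is a sketch covering only the rows with a single unambiguous fix. `recoveryHint` and the fallback message are hypothetical and not part of PostGen.

```javascript
// Hypothetical lookup built from rows of the failure table. Scripts with
// several possible causes are deliberately left out.
const RECOVERY_HINTS = new Map([
  ["render-slides.mjs", "Run `npx playwright install --with-deps chromium` in the workspace"],
  ["generate-video.mjs", "Install ffmpeg, or use --skip-video"],
  ["compress-backgrounds.mjs", "Install ffmpeg, or use --skip-compress"],
  ["generate-tts.mjs", "Add openai_api_key, elevenlabs_api_key, or gemini_api_key to config or env"],
  ["composite-video.mjs", "Check that ai-video/manifest.json and voiceover/manifest.json exist, and that ffmpeg works"],
]);

function recoveryHint(failedScript) {
  return RECOVERY_HINTS.get(failedScript) ?? "No single fix listed; consult the failure table";
}
```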

node <skill-path>/scripts/generate-backgrounds.mjs <post-dir>
node <skill-path>/scripts/compress-backgrounds.mjs <post-dir>
node <skill-path>/scripts/build-slides.mjs <post-dir> [--format instagram|tiktok] [--template bold|minimal|magazine|neon|stack|clean|caption]
node <skill-path>/scripts/render-slides.mjs <post-dir> [--format instagram|tiktok]
node <skill-path>/scripts/generate-video.mjs <post-dir> [--format instagram|tiktok]
node <skill-path>/scripts/generate-tts.mjs <post-dir> [--provider openai|elevenlabs|gemini] [--voice <voice-id>]
node <skill-path>/scripts/generate-video-references.mjs <post-dir>
node <skill-path>/scripts/generate-ai-video.mjs <post-dir> [--provider gemini|kling|grok] [--mode std|pro] [--no-refs]
  # --no-refs: Skip loading reference images, use text-only video generation even if reference images exist.
  #            Overrides video.json reference_images setting. Use for faster iteration or if refs are corrupt.
node <skill-path>/scripts/composite-video.mjs <post-dir> [--format tiktok] [--subtitle-style bold|minimal|karaoke]
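When invoking an individual step programmatically, it can help to validate the optional flags before spawning Node. The sketch below builds the argument list for `build-slides.mjs` using the `--format` and `--template` values listed above. `buildSlidesArgs` is a hypothetical helper, and rejecting unknown values up front is a design choice of this sketch, not documented script behavior.

```javascript
// Allowed values, taken from the build-slides.mjs usage line above.
const FORMATS = ["instagram", "tiktok"];
const TEMPLATES = ["bold", "minimal", "magazine", "neon", "stack", "clean", "caption"];

// Hypothetical helper: returns the argv array to pass to `node`.
function buildSlidesArgs(skillPath, postDir, { format, template } = {}) {
  if (format !== undefined && !FORMATS.includes(format)) {
    throw new Error(`Unknown format "${format}"; expected one of ${FORMATS.join(", ")}`);
  }
  if (template !== undefined && !TEMPLATES.includes(template)) {
    throw new Error(`Unknown template "${template}"; expected one of ${TEMPLATES.join(", ")}`);
  }
  const args = [`${skillPath}/scripts/build-slides.mjs`, postDir];
  if (format) args.push("--format", format);
  if (template) args.push("--template", template);
  return args;
}
```

The returned array is suitable for `spawn("node", args)` from `node:child_process`, which avoids shell-quoting problems with paths that contain spaces.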

node <skill-path>/scripts/verify-output.mjs <post-dir>

node <skill-path>/scripts/generate-post.mjs <post-dir> --dry-run

Script	Purpose
`setup.mjs`	Onboarding: create workspace, install deps (pinned versions), write config
`validate-config.mjs`	Pre-flight: check config, API keys, system deps — run before first generation
`next-post-dir.mjs`	Utility: prints and creates the next available post directory path
`status.mjs`	Utility: check pipeline progress for a post directory
`generate-backgrounds.mjs`	AI image generation with retry + timeout (Google GenAI or OpenAI)
`compress-backgrounds.mjs`	ffmpeg PNG-to-JPG compression (preserves full resolution)
`build-slides.mjs`	HTML slide generation from slides.json
`render-slides.mjs`	Playwright HTML-to-PNG rendering with font-loading wait
`generate-video.mjs`	ffmpeg PNG-to-MP4 basic carousel video (CRF 18, web-optimized)
`generate-tts.mjs`	TTS voiceover generation from video.json or slides.json (OpenAI, ElevenLabs, or Gemini Live API)
`generate-video-references.mjs`	Generate reference images for image-guided video (character refs + first-frames)
`generate-ai-video.mjs`	AI video dispatcher: resolves provider, loads reference images, delegates to providers/
`providers/gemini-video.mjs`	Gemini Veo 3.1 video (8s clips, supports first-frame + character reference images)
`providers/grok-video.mjs`	Grok Imagine Video (8s clips, supports first-frame OR character reference images)
`providers/kling-video.mjs`	Kling text-to-video (10s clips, JWT auth)
`composite-video.mjs`	Final video: stitch AI clips + voiceover audio + burned-in subtitles
`kling-client.mjs`	Kling API client: JWT auth, createTextToVideo, waitForTextToVideo, polling, download
`generate-post.mjs`	Full pipeline orchestrator with preflight, timeouts, and progress reporting
`verify-output.mjs`	Post-generation quality check: backgrounds, renders, content rules
`resolve-key.mjs`	API key resolution (image: OpenClaw→env→config; video: config→env→OpenClaw)
`workspace.mjs`	Shared utility: finds workspace root, resolves npm packages
`normalize-slides.mjs`	Shared utility: normalizes slides.json field names and structure
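The `resolve-key.mjs` row describes two different lookup orders: image keys are resolved OpenClaw, then env, then config, while video keys are resolved config, then env, then OpenClaw. A minimal sketch of that precedence follows. `resolveKey` and the shape of the `sources` object are hypothetical, and the real script's interface may differ.

```javascript
// Lookup order per key kind, as stated in the script reference table.
const RESOLUTION_ORDER = {
  image: ["openclaw", "env", "config"],
  video: ["config", "env", "openclaw"],
};

// Hypothetical helper: `sources` maps a source name to the key found there
// (or undefined). Returns the first non-empty key and where it came from.
function resolveKey(kind, sources) {
  for (const name of RESOLUTION_ORDER[kind]) {
    const value = sources[name];
    if (value) return { value, source: name };
  }
  return null; // no key available from any source
}
```

With the same key present in both env and config, this sketch picks the env value for image generation but the config value for video.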

Platform	Format name	Aspect Ratio	Resolution
TikTok	`tiktok`	9:16	1080x1920
Instagram Reels	`tiktok`	9:16	1080x1920
YouTube Shorts	`tiktok`	9:16	1080x1920
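Since all three video platforms in the table share the `tiktok` format name, a lookup makes that mapping explicit. This is a sketch: `videoFormatFor` is a hypothetical helper, and the case-insensitive matching is an added convenience.

```javascript
// Rows from the video platform table. All three map to the same format.
const VIDEO_PLATFORMS = new Map([
  ["tiktok", { format: "tiktok", aspectRatio: "9:16", resolution: "1080x1920" }],
  ["instagram reels", { format: "tiktok", aspectRatio: "9:16", resolution: "1080x1920" }],
  ["youtube shorts", { format: "tiktok", aspectRatio: "9:16", resolution: "1080x1920" }],
]);

// Hypothetical helper: returns the table row for a platform name, or null.
function videoFormatFor(platform) {
  return VIDEO_PLATFORMS.get(platform.trim().toLowerCase()) ?? null;
}
```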

PostGen

Runtime

Output Types and Platforms

Image Posts (carousel / slideshow of static PNGs)

Video Posts (MP4 slideshow)

Important

Important: How to Run Scripts

First-Time Setup (Onboarding)

Claw HQ Detection

Running Setup

Post-Setup: Validate Configuration

Post-Setup: Collect Brand Details

Generating a Post

MANDATORY: Ask These Questions First

MANDATORY: Scan Assets Before Every Post

Step 1: Determine the output folder

Step 2: Create the content file

Option A: Carousel Flow → create slides.json

Option B: AI Video Flow → create video.json

Step 2b: Generate caption & hashtags

Directory Structure (Strict — agents MUST follow these paths)

Step 3: Run the pipeline

Flow A: Carousel Pipeline (slides.json)

Flow B: AI Video Pipeline (video.json)

Managing Long-Running Generation

Step 3b: Checking Progress (if needed)

Step 4: Error Diagnosis and Recovery

Step 5: Verify output

Step 6: Deliver results

Dry Run (Pre-validation)

Script Reference

API Key Resolution

Image Generation Keys

Video Keys

TTS Keys

Providers

Image Generation

AI Video (Text-to-Video)

TTS (Voiceover)
