Render a spoken-word MP3 podcast from wiki pages — single-host by default or two-voice dialogue. Piper TTS default (local, free); falls back to ElevenLabs / OpenAI TTS when their API keys are present. Used by /generate podcast. Not user-invocable directly — go through /generate.
Produce a 3–10 minute MP3 explainer from wiki pages. The LLM writes a spoken-word narrative, TTS renders each line, ffmpeg concatenates into a single MP3.
Artifact-first — output lands in vaults/<vault>/artifacts/podcast/.
/generate podcast <topic> [--vault <name>] [--length short|medium|long] [--two-voice] [--voice <name>]
- `--length` — short (~3 min), medium (~6 min, default), long (~10 min).
- `--two-voice` — dialogue between two hosts instead of a monologue.
- `--voice` — override the default Piper voice. Ignored when the ElevenLabs / OpenAI fallback kicks in.

Same topic resolution as sibling handlers — reuses `.claude/skills/generate/lib/select-pages.sh`.
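For example, a short two-voice episode (the topic name here is illustrative):

```
/generate podcast embeddings --length short --two-voice
```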
wiki pages → LLM script writer → script.md → TTS per line → ffmpeg concat → podcast.mp3
Keep script.md alongside the MP3 — it's diffable, re-renderable, and the honest primary artifact.
HAS_FFMPEG=0; HAS_PIPER=0
command -v ffmpeg >/dev/null 2>&1 && HAS_FFMPEG=1
command -v piper  >/dev/null 2>&1 && HAS_PIPER=1
if [ "$HAS_FFMPEG" = "0" ]; then
echo "ffmpeg missing. Installing via Homebrew…"
brew install ffmpeg
fi
# Piper is optional if ELEVENLABS_API_KEY or OPENAI_API_KEY is set.
if [ "$HAS_PIPER" = "0" ] && [ -z "$ELEVENLABS_API_KEY" ] && [ -z "$OPENAI_API_KEY" ]; then
echo "Piper not found and no cloud TTS key present."
echo "Installing Piper (local, free, robotic-but-serviceable)…"
brew install piper-tts 2>/dev/null || {
echo "Homebrew install failed. See https://github.com/rhasspy/piper for manual install."
exit 1
}
fi
# mapfile over a process substitution would discard the helper's exit status,
# so run the helper directly and check it:
PAGE_LIST=$(.claude/skills/generate/lib/select-pages.sh "$VAULT_DIR" "$TOPIC") || exit 1
mapfile -t PAGES <<< "$PAGE_LIST"

Exit 1 from the helper means no pages matched; its error message is surfaced verbatim.
HASH=$(.claude/skills/generate/lib/source-hash.sh "${PAGES[@]}")
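The hash makes a skip-if-unchanged gate possible. A minimal sketch, assuming the sidecar records a `source_hash:` field (a schema assumption, not confirmed above):

```shell
# Demo fallbacks so the sketch runs standalone; the real handler has both set.
VAULT_DIR=${VAULT_DIR:-$(mktemp -d)}
HASH=${HASH:-abc123}
mkdir -p "$VAULT_DIR/artifacts/podcast"

# Newest sidecar for this artifact type, if any.
LATEST=$(ls "$VAULT_DIR/artifacts/podcast/"*.meta.yaml 2>/dev/null | sort | tail -1)
if [ -n "$LATEST" ] && grep -q "^source_hash: $HASH$" "$LATEST"; then
  SKIP=1   # in the real handler: echo a notice and exit 0
else
  SKIP=0
fi
echo "skip=$SKIP"
```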
The invoking LLM reads the selected pages and writes a narrative script.md. Two shapes supported:
# Podcast: {{topic}}
_Length target: {{length}} (~{{minutes}} min)._
[HOST]: Welcome. Today we're talking about {{topic}}. Here's why that matters…
[HOST]: First, the basics. According to {{cite: wiki/concepts/attention.md}}, attention is…
[HOST]: …
With `--two-voice`:

# Podcast: {{topic}}
[A]: Alright, let's get into {{topic}}.
[B]: Why this, why now?
[A]: Because {{cite: wiki/concepts/rag.md}}…
[B]: Huh. I thought…
[A]: Right, but here's the nuance…
Script-writing rules the LLM follows:
- `{{cite: path}}` markers are preprocessed to *pagename* before TTS sees them.
- Templates live at `.claude/skills/generate-podcast/templates/{single-host,two-voice}.md` and give the LLM a starting shape.
Priority order:
| Priority | Backend | Trigger | Cost | Quality |
|---|---|---|---|---|
| 1 | ElevenLabs | ELEVENLABS_API_KEY set | ~$0.30 per 1k chars | Studio-grade |
| 2 | OpenAI TTS | OPENAI_API_KEY set | ~$0.015 per 1k chars | Very good |
| 3 | Piper (local) | always available once installed | free | Robotic but clean |
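The per-1k-character rates in the table make a rough pre-flight cost estimate cheap to compute. A sketch (the `SCRIPT` fallback is a demo stand-in for the real script.md):

```shell
# Count characters in the script; an empty temp file stands in for demo runs.
SCRIPT=${SCRIPT:-$(mktemp)}
CHARS=$(wc -c < "$SCRIPT")

# Rates from the table above: ElevenLabs ~$0.30/1k chars, OpenAI ~$0.015/1k.
awk -v c="$CHARS" 'BEGIN{printf "ElevenLabs ≈ $%.2f, OpenAI TTS ≈ $%.3f\n", c/1000*0.30, c/1000*0.015}'
```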
if [ -n "$ELEVENLABS_API_KEY" ]; then
TTS_BACKEND="elevenlabs"
elif [ -n "$OPENAI_API_KEY" ]; then
TTS_BACKEND="openai"
else
TTS_BACKEND="piper"
fi
- Piper: `en_US-lessac-medium` for [HOST] / [A]; `en_GB-alan-medium` for [B]. Override with `--voice <model>`.
- OpenAI TTS: `alloy` for HOST/A, `onyx` for B.
- ElevenLabs: `Rachel` for HOST/A, `Adam` for B (override with `ELEVENLABS_VOICE_A` / `ELEVENLABS_VOICE_B` env vars).

Walk the script and split it by the [HOST] / [A] / [B] tags. For each line:
# Piper example
echo "$LINE_TEXT" | piper \
--model "$VOICE_MODEL" \
--output_file "/tmp/podcast_${i}.wav"
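The tag split itself needs no external tools — plain parameter expansion covers it. A sketch over a two-line demo script:

```shell
# Demo script; the real handler reads the generated script.md.
SCRIPT=$(mktemp)
printf '[A]: Alright, into it.\n[B]: Why now?\n' > "$SCRIPT"

while IFS= read -r line; do
  # Only speaker-tagged lines reach TTS; headers and blanks are skipped.
  case "$line" in \[HOST\]:*|\[A\]:*|\[B\]:*) ;; *) continue ;; esac
  SPEAKER=${line%%]:*}; SPEAKER=${SPEAKER#\[}   # HOST, A, or B
  TEXT=${line#*]: }                             # spoken text after the tag
  echo "$SPEAKER|$TEXT"
done < "$SCRIPT"
```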
Replace {{cite: path}} with the page's title (or filename stem) before TTS — the listener hears "as attention explains", not the raw path.
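The stem replacement is one substitution. A sketch assuming GNU sed's `-E` mode (the example line is illustrative):

```shell
LINE_TEXT='According to {{cite: wiki/concepts/attention.md}}, attention is the core idea.'
# Capture the filename stem of the cited page and drop the rest of the marker.
LINE_TEXT=$(printf '%s' "$LINE_TEXT" | sed -E 's/\{\{cite: ([^}]*\/)?([^}\/]+)\.md\}\}/\2/g')
echo "$LINE_TEXT"
# → According to attention, attention is the core idea.
```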
Insert a short 250 ms silence between lines, and a longer 600 ms silence when the speaker changes in two-voice mode.
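The gap files can be generated once up front with ffmpeg's `anullsrc` silence source. A sketch — 22050 Hz mono matches Piper's default output, an assumption worth checking per voice model:

```shell
# Guarded so the sketch is a no-op where ffmpeg isn't installed.
if command -v ffmpeg >/dev/null 2>&1; then
  # 250 ms gap between lines; 600 ms gap on a speaker change.
  ffmpeg -loglevel error -y -f lavfi -i anullsrc=r=22050:cl=mono -t 0.25 /tmp/gap_line.wav
  ffmpeg -loglevel error -y -f lavfi -i anullsrc=r=22050:cl=mono -t 0.60 /tmp/gap_speaker.wav
fi
```

Interleave `/tmp/gap_line.wav` (or `gap_speaker.wav` on a speaker change) between entries in the concat list.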
# build a concat list
for w in /tmp/podcast_*.wav; do echo "file '$w'" >> /tmp/podcast_list.txt; done
# render MP3
ffmpeg -f concat -safe 0 -i /tmp/podcast_list.txt \
-codec:a libmp3lame -qscale:a 2 \
"$VAULT_DIR/artifacts/podcast/<slug>-<date>.mp3"
LAME VBR `-qscale:a 2` is the right quality for voice — bigger files aren't audibly better, smaller ones are noticeably worse.
Before writing the sidecar, check for an existing artifact of the same type and topic:
ARTIFACT_TYPE="podcast"
EXISTING=$(ls "$VAULT_DIR/artifacts/$ARTIFACT_TYPE/"*"$TOPIC_SLUG"*.meta.yaml 2>/dev/null | sort | tail -1)
if [ -n "$EXISTING" ]; then
PREV_VERSION=$(grep '^version:' "$EXISTING" | awk '{print $2}')
PREV_VERSION=${PREV_VERSION:-1}
VERSION=$((PREV_VERSION + 1))
PREV_SLUG=$(basename "$EXISTING" .meta.yaml)
else
VERSION=1
PREV_SLUG=""
fi
The old artifact stays in place — not deleted, not overwritten. Multiple files of the same type + topic = version history. The portal discovers and displays these automatically.
Small fixes (CSS tweaks, typo corrections) should update the file in-place without incrementing the version — use judgement based on whether the content meaningfully changed.
META="${MP3_OUT%.mp3}.meta.yaml"
cat > "$META" <<EOF