Extract content from YouTube videos, web articles, PDFs, local files, podcasts, and tweets to structured Markdown
Extracts content from each supported source type and saves it as structured Markdown files.

Script: `/workspace/content_extractor.py`

Output is written to `./content/output/YYYY-MM-DD_title-slug/`, containing one of the following files depending on the source type: `transcript.md`, `article.md`, `paper.md`, `document.md`, `presentation.md`, `tweet.md`, or `podcast.md`. With `--summarize`, a `summary.md` is added alongside.

| Flag | Purpose |
|---|---|
| `--cookies-from-browser chrome` | Auth for member-only content |
| `--cookies FILE` | Auth via Netscape cookies.txt file |
| `--lang CODE` | Force subtitle language |
| `--prefer-auto` | Prefer auto-generated subs |
| `--no-chapters` | Flat transcript, no sections |
| `--include-description` | Add video/article description |
| `--dry-run` | Preview without downloading |
| `--overwrite` | Replace existing files |
| `--polish` | Clean up the extracted text with Claude |
| `--summarize` | Generate a Pyramid/SCQA summary |
| `--no-whisper` | Disable the Whisper audio fallback |
| `--whisper-model MODEL` | Whisper model size (default: `base`) |
| `--max-episodes N` | Max podcast episodes to extract |
| `--nitter-instance HOST` | Nitter instance for tweet extraction |
| `--no-speaker-notes` | Exclude PowerPoint speaker notes |
| `-o DIR` | Custom output directory |
| `-f FILE` | File containing a list of URLs |
When `--polish` is used:

- the raw extraction is kept as `{basename}.unpolished.md` in the output folder
- the polished version is saved as `{basename}.md`

When `--summarize` is used:

- the summary is written to `summary.md` in the same folder