Fetch captions from Instagram video URLs (Reels/posts) using yt-dlp, with fallback to audio transcription when subtitles are missing. Use when the user needs timestamps, transcript extraction, or caption files from Instagram videos for analysis, summarization, or translation.
Prefer a two-step pipeline: try embedded captions first, then fall back to audio transcription if no subtitle track is available.
python3 scripts/instagram_caption_pipeline.py "<instagram-url>" \
--mode original \
--format srt \
--output-dir ./captions \
--timed-text
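The caption-first strategy can be sketched as a small control-flow helper. This is only an illustration of the idea, not the actual code of instagram_caption_pipeline.py; `fetch_subtitles` and `transcribe_audio` are hypothetical callables standing in for the yt-dlp subtitle step and the transcription step.

```python
from pathlib import Path
from typing import Callable, Optional


def captions_or_transcript(
    url: str,
    fetch_subtitles: Callable[[str], Optional[Path]],
    transcribe_audio: Callable[[str], Path],
    force_transcribe: bool = False,
) -> dict:
    """Try subtitle tracks first; fall back to audio transcription.

    Both callables are hypothetical placeholders: fetch_subtitles returns a
    caption file path, or None when no track exists; transcribe_audio returns
    the path of a transcript produced from the downloaded audio.
    """
    if not force_transcribe:
        caption_file = fetch_subtitles(url)
        if caption_file is not None:
            return {"source": "captions", "file": caption_file}
    # No subtitle track was found, or the caller asked to skip captions.
    return {"source": "transcription", "file": transcribe_audio(url)}
```

Passing the two steps in as callables keeps the fallback decision separate from the download and transcription details, which makes it easy to exercise without network access.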
Flags:
--force-transcribe: ignore available subtitle tracks and transcribe the audio.
--cookies-from-browser: authenticated fetch using browser cookies.
--mode original: keep the source language.
--mode pt-br: use when the user asks for Portuguese-first subtitles.
--transcribe-engine faster-whisper: uses the small model by default; switch to medium or large-v3 for harder audio.
Output files:
caption_file: primary caption file (.srt or .vtt) when subtitle tracks exist.
timed_text_file: plain text in "[start --> end] text" form, generated from the caption file.
timed_text_clean_file: deduplicated timed text (helps with auto-caption partials).
audio_file: downloaded audio used for the transcription fallback.
transcript_file: plain transcript generated from the audio.
transcript_timed_file: timed transcript in "[start --> end] text" form, built from transcription segments.
Caption-first with auto transcription fallback:
python3 scripts/instagram_caption_pipeline.py "<instagram-url>" \
--mode original \
--format srt \
--output-dir ./captions \
--timed-text
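To make the timed-text output more concrete, here is a minimal sketch that converts SRT content into "[start --> end] text" lines and drops consecutive duplicate cues. The function name and the deduplication rule (skip a cue whose text equals the previous cue's text) are assumptions for illustration; the script's own logic may differ.

```python
import re
from typing import List

# Matches an SRT/VTT timing line such as "00:00:01,500 --> 00:00:03,000".
_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}[,.]\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}[,.]\d{3})"
)


def srt_to_timed_text(srt: str, dedupe: bool = True) -> List[str]:
    """Convert SRT content to '[start --> end] text' lines."""
    lines_out: List[str] = []
    previous_text = None
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        rows = [row.strip() for row in block.splitlines() if row.strip()]
        for i, row in enumerate(rows):
            match = _TIMING.search(row)
            if match:
                # Everything after the timing line is the cue text.
                text = " ".join(rows[i + 1:])
                break
        else:
            continue  # no timing line in this block
        if not text:
            continue
        if dedupe and text == previous_text:
            continue  # assumed rule: skip a repeat of the previous cue
        previous_text = text
        lines_out.append(f"[{match.group(1)} --> {match.group(2)}] {text}")
    return lines_out
```

Calling `srt_to_timed_text(open("captions/video.srt").read())` on a hypothetical caption file would return one line per unique cue, ready to be written to a text file.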
Force transcription (ignore available subtitle tracks):
python3 scripts/instagram_caption_pipeline.py "<instagram-url>" \
--force-transcribe \
--transcribe-engine faster-whisper \
--faster-whisper-model small \
--output-dir ./captions
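For reference, a forced transcription with faster-whisper might look roughly like the sketch below. It uses the library's `WhisperModel` and `transcribe` calls; the helper names, the CPU/int8 settings, and the `HH:MM:SS.mmm` timestamp format are assumptions chosen for illustration, not the script's confirmed behavior.

```python
from typing import Iterable, List


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm (an assumed display format)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segments_to_timed_lines(segments: Iterable) -> List[str]:
    """Turn segments with .start, .end and .text into timed transcript lines."""
    return [
        f"[{format_timestamp(s.start)} --> {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    ]


def transcribe(audio_path: str, model_size: str = "small") -> List[str]:
    """Transcribe an audio file; requires `pip install faster-whisper`."""
    # Imported lazily so the formatting helpers work without the package.
    from faster_whisper import WhisperModel

    # device and compute_type are assumed defaults for a CPU-only machine.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path)
    return segments_to_timed_lines(segments)
```

Switching `model_size` to `"medium"` or `"large-v3"` trades speed for accuracy on harder audio.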
Authenticated fetch using browser cookies:
python3 scripts/instagram_caption_pipeline.py "<instagram-url>" \
--cookies-from-browser chrome \
--output-dir ./captions
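The same kind of authenticated subtitle fetch can be approximated directly with yt-dlp's Python API. The sketch below is an assumption about how such a step could be written, not a description of the script's internals; `build_ydl_opts` and `fetch_subtitles` are hypothetical helper names, while the option keys are standard `YoutubeDL` parameters.

```python
from typing import Optional


def build_ydl_opts(
    output_dir: str,
    sub_format: str = "srt/best",
    browser: Optional[str] = None,
) -> dict:
    """Build YoutubeDL options that fetch subtitles without the video."""
    opts = {
        "skip_download": True,        # subtitles only, no media download
        "writesubtitles": True,       # uploaded subtitle tracks
        "writeautomaticsub": True,    # auto-generated captions, if any
        "subtitlesformat": sub_format,
        "outtmpl": f"{output_dir}/%(id)s.%(ext)s",
    }
    if browser:
        # Same idea as --cookies-from-browser: reuse the browser's session.
        opts["cookiesfrombrowser"] = (browser,)
    return opts


def fetch_subtitles(url: str, output_dir: str, browser: Optional[str] = None) -> None:
    """Download subtitle files for a URL; requires `pip install yt-dlp`."""
    import yt_dlp  # imported lazily so build_ydl_opts works without it

    with yt_dlp.YoutubeDL(build_ydl_opts(output_dir, browser=browser)) as ydl:
        ydl.download([url])
```

Whether any subtitle track is returned depends on the post; when none is, the transcription fallback is the remaining option.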
Install base dependency:
python3 -m pip install -U yt-dlp
For local transcription fallback:
python3 -m pip install -U faster-whisper
Optional system dependency (recommended for broader audio format support):
ffmpeg
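Before running the pipeline, it can help to confirm that these dependencies are present. The following is a small, hypothetical preflight check (not part of the script) that looks for the two Python packages and for an ffmpeg binary on PATH.

```python
import importlib.util
import shutil


def check_dependencies() -> dict:
    """Report which dependencies are available in this environment."""
    return {
        "yt-dlp": importlib.util.find_spec("yt_dlp") is not None,
        "faster-whisper": importlib.util.find_spec("faster_whisper") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, available in check_dependencies().items():
        print(f"{name}: {'ok' if available else 'missing'}")
```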