Transcribe
Use when the user wants to transcribe audio or video, extract speech to text, or label speakers with optional diarization; prefer the bundled `scripts/transcribe_diarize.py` and require `OPENAI_API_KEY`.
hebertzhu · 0 stars · Mar 27, 2026
- Occupation
- Categories
- Media
Intent
- Use for converting recordings into text, optional speaker diarization, and structured transcript output for meetings, interviews, or media assets.
Default operating pattern
- Confirm the audio source, expected output format, and whether the user needs plain text or diarized output.
- Collect any hints that materially improve recognition quality: language, known speaker names, or reference audio.
- Prefer the bundled `scripts/transcribe_diarize.py` so the workflow remains deterministic and reusable.
- Start with the simplest successful output, then add diarization or richer structure only when the user actually needs it.
- Validate transcript quality, speaker labels, and segment boundaries before calling it done.
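The final validation step can be partly automated. The sketch below is one possible check, not part of the bundled script: it assumes diarized output has already been loaded as a list of dicts with hypothetical keys `speaker`, `start`, `end` (seconds), and `text`. The actual output schema of `scripts/transcribe_diarize.py` may differ, so adapt the key names accordingly.

```python
from typing import Dict, List


def validate_segments(segments: List[Dict]) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed.

    Assumes each segment is a dict with the hypothetical keys
    'speaker', 'start', 'end', and 'text'.
    """
    problems: List[str] = []
    prev_end = 0.0
    for i, seg in enumerate(segments):
        start, end = seg["start"], seg["end"]
        # A segment must have positive duration.
        if end <= start:
            problems.append(f"segment {i}: end {end} is not after start {start}")
        # Flag segments that begin before the previous one finished.
        if start < prev_end:
            problems.append(f"segment {i}: starts at {start}, before previous end {prev_end}")
        # Diarized output should carry a non-empty speaker label.
        if not seg.get("speaker"):
            problems.append(f"segment {i}: missing speaker label")
        # Empty text usually signals a recognition gap worth reviewing.
        if not seg.get("text", "").strip():
            problems.append(f"segment {i}: empty text")
        prev_end = max(prev_end, end)
    return problems
```

Overlapping segments are not always errors, since speakers can talk over each other, so treat that particular message as a prompt for manual review rather than a hard failure. Checks like these cover structure only; transcript accuracy still needs a spot check against the audio.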
Bundled helper
`scripts/transcribe_diarize.py` supports transcription plus optional diarization with OpenAI audio models.
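Because the helper depends on `OPENAI_API_KEY`, a quick preflight check before invoking it can save a failed run. The following is a minimal sketch under stated assumptions: the function name `preflight` and its return format are hypothetical, and it does not reproduce or call the bundled script, whose command-line interface is not described here.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def preflight(audio_path: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of blocking problems; an empty list means ready to run.

    `env` defaults to the process environment and can be overridden in tests.
    """
    env = os.environ if env is None else env
    problems: List[str] = []
    # The skill description requires this variable to be set.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # Confirm the audio source exists and is a regular file.
    if not Path(audio_path).is_file():
        problems.append(f"audio file not found: {audio_path}")
    return problems
```

If the returned list is non-empty, report the problems to the user instead of starting the transcription.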
Pack fit