Speech-to-text transcription using the Groq Whisper API. Fast and accurate, with support for multiple input formats, several output formats (including subtitles), and batch processing.
# Save API key (once)
./transcribe.mjs --config "your-groq-api-key"
# Or use environment variable
export GROQ_API_KEY="your-key"
# Simple transcription
./transcribe.mjs meeting.m4a
# Save to file
./transcribe.mjs podcast.mp3 --output transcript.txt
# JSON output with timestamps
./transcribe.mjs interview.wav --format json
# Generate subtitles
./transcribe.mjs video.mp4 --format srt --output video.srt
./transcribe.mjs video.mp4 --format vtt --output video.vtt
| Option | Description |
|---|---|
| `--format <format>` | Output format: `text` (default), `json`, `srt`, `vtt` |
| `--output <file>` | Output file (default: stdout) |
| `--model <model>` | `whisper-large-v3-turbo` (default) or `whisper-large-v3` |
| `--language <code>` | Language code (auto-detected if not set) |
| `--translate` | Translate to English |
| `--batch <file>` | Process multiple files (list file with one path per line) |
| `--parallel <n>` | Number of parallel batch jobs (default: 2) |
| `--verbose` | Show progress |
| `--config <key>` | Save API key to config |
Create a file list:
cat > files.txt << 'EOF'
meeting_2024_01.m4a
meeting_2024_02.m4a
meeting_2024_03.m4a
EOF
Process all files:
./transcribe.mjs --batch files.txt --parallel 3 --verbose
./transcribe.mjs --batch files.txt --format srt --output-dir ./subtitles/
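The batch options above combine a file list (one path per line) with a cap on parallel jobs. As a rough illustration of that idea, and not the actual code in `transcribe.mjs`, the sketch below parses such a list and runs a worker over it with a fixed concurrency limit. The names `parseFileList` and `runWithLimit` are hypothetical, and skipping blank lines is an assumption rather than documented behavior.

```javascript
// Hypothetical helpers for illustration; not taken from transcribe.mjs.

// Split a file list (one path per line) into an array.
// Assumption: blank lines and surrounding whitespace are ignored.
function parseFileList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as the input items.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each lane pulls the next unclaimed item until none remain.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  const lanes = Array.from({ length: laneCount }, () => lane());
  await Promise.all(lanes);
  return results;
}
```

With a limit of 2 (the documented default for `--parallel`), two files would be in flight at any time, and a new one starts as soon as either finishes. In this sketch a rejected worker promise fails the whole batch; a real tool may prefer to record the error per file and continue.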
Common language codes: en, de, fr, es, it, pt, nl, ja, ko, zh
# German audio
./transcribe.mjs aufnahme.m4a --language de
# Auto-detect (usually accurate)
./transcribe.mjs unknown.mp3
# Transcribe and translate to English
./transcribe.mjs german.m4a --translate
./transcribe.mjs japanese.mp3 --translate --format json
Text format: plain text with punctuation and capitalization.
Hello, this is a test of the transcription system.
It handles multiple sentences and proper formatting.
JSON format: full response with segments, timestamps, and metadata.
{
"text": "Hello world",
"segments": [
{
"start": 0.0,
"end": 2.5,
"text": "Hello world"
}
]
}
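The `segments` array in the JSON output carries everything needed to build subtitles. As a minimal sketch, assuming each segment has numeric `start` and `end` values in seconds and a `text` string as in the example above, the following converts segments into SRT cues. The function names `toSrtTime` and `segmentsToSrt` are hypothetical and are not part of `transcribe.mjs`; the tool's own `--format srt` may differ in details such as rounding or line wrapping.

```javascript
// Hypothetical helpers for illustration; not taken from transcribe.mjs.

// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Turn an array of { start, end, text } segments into SRT text.
// Each cue is: a 1-based index, a time range, the text, then a blank line.
function segmentsToSrt(segments) {
  return segments
    .map((seg, i) => {
      const range = `${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}`;
      return `${i + 1}\n${range}\n${seg.text.trim()}\n`;
    })
    .join("\n");
}
```

Working in whole milliseconds before splitting into hours, minutes, and seconds avoids floating-point drift in the timestamp fields.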
SRT format: standard subtitle format for video players.
1