When to Use

User wants to transcribe an audio file to text
User provides an audio file path and asks for transcription
User says "转录", "识别", "transcribe", "语音转文字"

When NOT to Use

User wants to synthesize speech from text (use /tts)
User wants to create a podcast or explainer (use /podcast or /explainer)

Purpose

Transcribe audio files to text using coli asr, which runs fully offline via local speech recognition models. No API key required. Supports Chinese, English, Japanese, Korean, and Cantonese (sensevoice model) or English-only (whisper model).

Run coli asr --help for current CLI options and supported flags.

Hard Constraints

No shell scripts. Use direct commands only.
Always read the config before any interaction.

<HARD-GATE> Use the AskUserQuestion tool for every multiple-choice step — do NOT print options as plain text. Ask one question at a time. Wait for the user's answer before proceeding. After all parameters are collected, summarize and ask the user to confirm before running any transcription. </HARD-GATE>

| Issue | Action |
| --- | --- |
| `coli` not found | Block. Tell user to run `npm install -g @marswave/coli` first |
| `ffmpeg` not found | Warn (WAV files still work). Suggest `brew install ffmpeg` / `sudo apt install ffmpeg` |
| Models not downloaded | Inform user: first transcription will auto-download models (~60MB) to `~/.coli/models/` |
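
The decision logic in the table above can be sketched in Python as follows. This is a minimal illustration only, not part of the skill itself, which runs direct commands. The function name `check_prerequisites`, the severity labels, and the early return on a blocking issue are assumptions made for the example. Treating a missing or empty `~/.coli/models/` directory as "models not downloaded" is also an assumption; the source does not say how the check is performed.

```python
import shutil
from pathlib import Path
from typing import Callable, List, Optional, Tuple

# (severity, message) pairs; severity is "block", "warn", or "info".
Finding = Tuple[str, str]


def check_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
    models_dir: Path = Path.home() / ".coli" / "models",
) -> List[Finding]:
    """Return the issues from the prerequisites table that currently apply."""
    findings: List[Finding] = []

    # `coli` missing is blocking, so stop here (assumed behavior).
    if which("coli") is None:
        findings.append(
            ("block", "coli not found. Run: npm install -g @marswave/coli")
        )
        return findings

    # `ffmpeg` missing is only a warning: WAV files still work.
    if which("ffmpeg") is None:
        findings.append(
            (
                "warn",
                "ffmpeg not found (WAV files still work). "
                "Try: brew install ffmpeg / sudo apt install ffmpeg",
            )
        )

    # Assumption: a missing or empty directory means no models yet.
    if not models_dir.is_dir() or not any(models_dir.iterdir()):
        findings.append(
            (
                "info",
                "First transcription will auto-download models (~60MB) "
                "to ~/.coli/models/",
            )
        )

    return findings


if __name__ == "__main__":
    for severity, message in check_prerequisites():
        print(f"[{severity}] {message}")
```

The `which` and `models_dir` parameters exist only so the checks can be exercised without touching the real environment.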

Interaction Flow

Step 0: Prerequisites Check

Step 0: Config Setup

Setup Flow (first run or reconfigure)

Step 1: Get Audio File

Step 2: Confirm

Step 3: Transcribe

Step 4: Polish (if enabled)

Step 5: Present Result

Step 6: Export as Markdown (optional)
