Yore Vocabulary LLM Filter (Whisper Vocabulary)

Use this skill when you want to enrich Whisper's vocabulary with only the narrow subset of tokens it is unlikely to recognize reliably from generic software knowledge.

The goal is Whisper transcription accuracy: keep terms that a speech-to-text model would misspell, skip, or hallucinate — even if those terms look "unspeakable" as text (acronyms like bm25, project names like yore). Drop terms that are phonetically clear English words regardless of how technical they are.

Prerequisites

You have candidate terms from yore-vocabulary-harvest (JSON or newline list).
gptengage is installed:

command -v ~/.local/bin/gptengage

Optional stopword path you want the model to consider as already-known vocabulary.

Input

Yore Vocabulary LLM Filter (Whisper Vocabulary)

Use this skill when you want to enrich Whisper's vocabulary with only the narrow subset of tokens it is unlikely to recognize reliably from generic software knowledge.

Prerequisites

You have candidate terms from yore-vocabulary-harvest (JSON or newline list).
gptengage is installed:

command -v ~/.local/bin/gptengage

Optional stopword path you want the model to consider as already-known vocabulary.

Input

Whisper is a speech-to-text model. You are classifying corpus-derived terms to decide which ones belong in a custom Whisper vocabulary file that biases transcription toward correct spelling. IMPORTANT CONTEXT: The Whisper output is consumed by an LLM, not a human. An LLM will easily resolve minor transcription variations like missing hyphens, word splits, or casing ("fix links" vs "fix-links", "change log" vs "changelog", "stop word" vs "stopword"). Therefore: - Drop any compound term where both component words are recognizable English. The LLM reader will understand "canonical orphans" just as well as "canonical-orphans". - Only keep terms where the PHONEMES THEMSELVES produce a wrong word — where Whisper would output a completely different word or nonsense, not just a formatting variant. For each term, return a JSON object with: term, verdict, category, reason. Verdicts: - "keep": Whisper would produce a WRONG WORD or nonsense from the audio (e.g., "yore" → "your", "toml" → "tommel", "bm25" → "be em twenty five"). - "drop": Whisper would produce recognizable output that an LLM can interpret, even if formatting differs (hyphens, spacing, casing). - "review": Genuinely borderline — the user should decide. - "artifact": Stemming fragment or index noise, not a real term. Categories (pick the most specific one): - "acronym": e.g., BM25, ADR, LLM, CLI, TOML, YML - "project-name": e.g., yore, gptengage - "proper-noun": e.g., GitHub, Anthropic, Kubernetes - "jargon": domain term with non-obvious spelling - "phonetically-clear": sounds like normal English when spoken aloud - "compound-clear": compound of recognizable English words (drop these) - "stemming-artifact": truncated stem from indexing, not a real word - "other": none of the above Return strict JSON only — an object with a single key "terms" containing an array: { "terms": [ {"term": "bm25", "verdict": "keep", "category": "acronym", "reason": "Whisper would produce 'be em twenty five' — wrong word entirely"}, {"term": "fix-links", "verdict": "drop", "category": "compound-clear", "reason": "Both words are clear English; LLM handles 'fix links' fine"} ] } No markdown fences, no commentary outside the JSON. If both parts of a compound are recognizable English words, verdict is "drop". If a term is a truncated stem (e.g., enforc, handl, rul, statu), verdict is "artifact". Optimise for precision: only "keep" terms where the phonemes produce a genuinely wrong word.

Yore Vocabulary Llm Filter

Yore Vocabulary LLM Filter (Whisper Vocabulary)

Prerequisites

Input

Yore Vocabulary Llm Filter

Yore Vocabulary LLM Filter (Whisper Vocabulary)

Prerequisites

Input

Workflow

Guardrails

Output

Openai Whisper

Voice Call

Prose

Clawhub

Sherpa Onnx Tts

Openai Whisper Api