Human-to-song interface using deterministic audio analysis + Q&A. Trigger this whenever the user wants to talk to a song/audio file, ask what is happening in a track, detect tempo/key/sections/energy, inspect metadata/codec, compare segments, or run conversational analysis on MP3/FLAC/WAV/OGG/M4A.
Use this skill as the conversation layer between a human and an audio file.
Use when the user asks things like "what's the tempo and key?", "where are the sections?", or "what's happening in this track?".
Do not improvise music claims without evidence: run the CLI first, then answer.
Invoke via the wrapper script:
sh "$SKILLS_DIR/vox-interpres/scripts/vox-interpres.sh" ...
or run the CLI directly:
uv run --project <skill-dir> vox-interpres ...
Force a fresh analysis with --refresh:
sh "$SKILLS_DIR/vox-interpres/scripts/vox-interpres.sh" ask ./song.flac "tempo and key?" --refresh
sh "$SKILLS_DIR/vox-interpres/scripts/vox-interpres.sh" analyze ./song.mp3 --json
sh "$SKILLS_DIR/vox-interpres/scripts/vox-interpres.sh" ask ./song.wav "where are the sections?"
sh "$SKILLS_DIR/vox-interpres/scripts/vox-interpres.sh" chat ./song.ogg
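After running `analyze --json`, the output can be parsed programmatically to answer the user's question. A minimal sketch in Python, assuming a hypothetical payload shape with `tempo_bpm`, `key`, and `sections` fields — the actual schema is defined in references/output-contract.md, so check there before relying on any field name:

```python
import json

# Hypothetical `analyze --json` payload; real field names are specified
# in references/output-contract.md.
raw = """{
  "tempo_bpm": 128.0,
  "key": "A minor",
  "sections": [
    {"label": "intro", "start": 0.0, "end": 15.2},
    {"label": "verse", "start": 15.2, "end": 45.8}
  ]
}"""

analysis = json.loads(raw)

def summarize(a: dict) -> str:
    """Turn a parsed analysis dict into a one-line human answer."""
    parts = [f"{a['tempo_bpm']:.0f} BPM", a["key"]]
    parts.append(f"{len(a['sections'])} sections")
    return ", ".join(parts)

print(summarize(analysis))
```

This keeps the conversational answer grounded in the deterministic analysis output rather than improvised claims.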
- cookbook/basics.md
- cookbook/question-patterns.md
- cookbook/advanced-workflows.md
- references/cheatsheet.md
- references/output-contract.md
- references/troubleshooting.md
- references/roadmap.md
- evals/evals.json
- references/skill-creator-validation.md