Generate audio replies using TTS. Trigger with "read it to me [URL]" to fetch and read content aloud, or "talk to me [topic]" to generate a spoken response. Also responds to "speak", "say it", "voice reply".
Generate spoken audio responses using MLX Audio TTS (chatterbox-turbo model).
User: read it to me https://example.com/article
User: talk to me about the weather today
uv run mlx_audio.tts.generate \
--model mlx-community/chatterbox-turbo-fp16 \
--text "Your text here" \
--play \
--file_prefix /tmp/audio_reply
--model mlx-community/chatterbox-turbo-fp16 - Fast, natural voice
--play - Auto-play the generated audio
--file_prefix - Save to temp location for cleanup
--exaggeration 0.3 - Optional: add expressiveness (0.0-1.0)
--speed 1.0 - Adjust speech rate if needed
For "read it to me" mode:
For "talk to me" mode:
Always delete the audio file after playing - it's already in the chat history.
# Generate with unique filename and play
OUTPUT_FILE="/tmp/audio_reply_$(date +%s)"
uv run mlx_audio.tts.generate \
--model mlx-community/chatterbox-turbo-fp16 \
--text "Your response text" \
--play \
--file_prefix "$OUTPUT_FILE"
# ALWAYS clean up after playing
rm -f "${OUTPUT_FILE}"*.wav 2>/dev/null
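One way to make the cleanup rule harder to skip is to wrap generation and deletion in a single function. The sketch below is a minimal illustration, not part of the skill itself: `speak` is a hypothetical helper name, and it assumes the generated audio is written under the `--file_prefix` path, as the `*.wav` glob above implies. It uses `mktemp -d` rather than `date +%s` so that two replies started in the same second cannot collide.

```shell
# speak TEXT: hypothetical helper that generates, plays, and then always cleans up.
speak() {
  local text="$1"
  local workdir
  local rc=0

  # A private temp directory gives a unique prefix and makes cleanup a single rm -rf.
  workdir="$(mktemp -d /tmp/audio_reply.XXXXXX)" || return 1

  # Same command as above; only the text and the prefix vary.
  uv run mlx_audio.tts.generate \
    --model mlx-community/chatterbox-turbo-fp16 \
    --text "$text" \
    --play \
    --file_prefix "$workdir/reply" || rc=$?

  # Runs whether or not generation succeeded.
  rm -rf "$workdir"
  return "$rc"
}
```

Calling `speak "Your response text"` then plays the reply and leaves nothing behind in /tmp, and the function's exit status reports whether TTS itself succeeded.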
If TTS fails:
Check that uv is installed and in PATH
User: read it to me https://blog.example.com/new-feature
Assistant actions:
1. WebFetch the URL
2. Extract article content
3. Generate TTS:
uv run mlx_audio.tts.generate \
--model mlx-community/chatterbox-turbo-fp16 \
--text "Here's what I found... [article summary]" \
--play --file_prefix /tmp/audio_reply_1706123456
4. Delete: rm -f /tmp/audio_reply_1706123456*.wav
5. Confirm: "Done reading the article to you."
User: talk to me about what you can help with
Assistant actions:
1. Generate conversational response text
2. Generate TTS:
uv run mlx_audio.tts.generate \
--model mlx-community/chatterbox-turbo-fp16 \
--text "Hey! So I can help you with all kinds of things..." \
--play --file_prefix /tmp/audio_reply_1706123789
3. Delete: rm -f /tmp/audio_reply_1706123789*.wav
4. (No text output needed - audio IS the response)
The --play flag uses the system audio output, so make sure the volume is up.
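The troubleshooting check that uv is installed and in PATH can be run as a small preflight before generating audio. This is a sketch only: `check_uv` is a hypothetical helper name and the error message wording is an assumption, not output from uv or mlx_audio.

```shell
# check_uv: hypothetical preflight; succeeds only if uv can be found in PATH.
check_uv() {
  if command -v uv >/dev/null 2>&1; then
    return 0
  fi
  # Report on stderr so the message does not mix with normal output.
  echo "uv not found in PATH; install it or fix PATH before running TTS" >&2
  return 1
}
```

Running `check_uv` before the generate command lets a missing uv be reported clearly instead of surfacing as a generic "command not found" partway through a reply.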