Generate speech audio for multiple lines with per-line emotion control.
Args:
    segments: List of dicts, each with required fields "text", "emotion", and
        "emotion_sample". Each dict may optionally include "ref_emotion_voice",
        a local audio file path or remote URL; when omitted or empty, it
        defaults to ref_voice.
    ref_voice: Local audio file path or remote URL used as the reference for
        voice characteristics and timbre.
Returns:
    List of generated speech audio segments as HTTP URLs
    (http://host:port/file/path) in WAV format.
Example:
    segments = [
        {
            "text": "Hi there! It is great to meet you.",
            "emotion": "happy",
            "emotion_sample": "I am so glad we finally get to meet in person!",
            "ref_emotion_voice": "/path/to/emotion_voice.wav"
        },
        {
            "text": "This is serious, so please pay attention.",
            "emotion": "serious",
            "emotion_sample": "This is serious, so please pay attention.",
            "ref_emotion_voice": "/path/to/emotion_voice.wav"
        }
    ]
    audio = await text_segments_to_speech(segments, ref_voice="/path/to/reference_voice.wav")
Call the local MCP bridge shell wrapper:
    core/bin/tool-cli request '{"server_id": "media", "tool_name": "text_segments_to_speech", "arguments": {}}'
Do not use Python helper code to invoke core/bin/tool-cli; run it directly as a shell command with the arguments passed inline.
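As a sketch, a populated request could look like the following (the segment text and all file paths are placeholder values taken from the example above, not real files); the payload is sanity-checked as JSON before the wrapper is invoked:

```shell
# Hypothetical request payload; paths are placeholders, not real files.
REQUEST='{"server_id": "media", "tool_name": "text_segments_to_speech", "arguments": {"segments": [{"text": "Hi there! It is great to meet you.", "emotion": "happy", "emotion_sample": "I am so glad we finally get to meet in person!"}], "ref_voice": "/path/to/reference_voice.wav"}}'

# Confirm the payload is well-formed JSON before calling the bridge.
echo "$REQUEST" | python3 -c "import json, sys; json.load(sys.stdin); print('ok')"

# Invoke the wrapper directly as a shell command (no Python helper),
# guarded so the snippet is safe to run where the bridge is absent.
if [ -x core/bin/tool-cli ]; then
    core/bin/tool-cli request "$REQUEST"
fi
```

Because the top-level schema sets additionalProperties to false, any key other than "segments" and "ref_voice" inside "arguments" will be rejected.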
{
"additionalProperties": false,
"properties": {
"segments": {
"items": {
"additionalProperties": true,
"type": "object"
},
"type": "array"
},
"ref_voice": {
"default": "",
"type": "string"
}
},
"required": [
"segments"
],
"type": "object"
}
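For illustration, a minimal pre-flight check of an arguments object against the schema above might look like this. It is a hand-rolled sketch, not a full JSON Schema validator, and `validate_arguments` is a hypothetical helper, not part of the tool:

```python
def validate_arguments(args: dict) -> list:
    """Check an arguments dict against the tool's input schema above.

    Returns a list of error strings; an empty list means the payload
    passed every check. A minimal sketch, not a full JSON Schema
    validator.
    """
    errors = []
    if not isinstance(args, dict):
        return ["arguments must be an object"]
    # "segments" is the only required top-level property.
    if "segments" not in args:
        errors.append("missing required field: segments")
    elif not isinstance(args["segments"], list):
        errors.append("segments must be an array")
    else:
        for i, seg in enumerate(args["segments"]):
            if not isinstance(seg, dict):
                errors.append(f"segments[{i}] must be an object")
                continue
            # text, emotion, and emotion_sample are required per segment.
            for field in ("text", "emotion", "emotion_sample"):
                if field not in seg:
                    errors.append(f"segments[{i}] missing field: {field}")
    # "ref_voice" is optional but must be a string when present.
    if "ref_voice" in args and not isinstance(args["ref_voice"], str):
        errors.append("ref_voice must be a string")
    # Extra per-segment keys are allowed (additionalProperties: true),
    # but extra top-level keys are not (additionalProperties: false).
    for key in args:
        if key not in ("segments", "ref_voice"):
            errors.append(f"unexpected top-level field: {key}")
    return errors
```

Running the check before calling the bridge surfaces schema problems locally instead of as a remote tool error.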