Name: ElevenLabs Speech-to-Text
Author: pwoprurho



Basic Transcription

infsh app run elevenlabs/stt --input '{"audio": "https://meeting-recording.mp3"}'

With Speaker Identification

infsh app run elevenlabs/stt --input '{
  "audio": "https://meeting.mp3",
  "diarize": true
}'

Audio Event Tagging

infsh app run elevenlabs/stt --input '{
  "audio": "https://podcast.mp3",
  "tag_audio_events": true
}'

Specify Language

infsh app run elevenlabs/stt --input '{
  "audio": "https://spanish-audio.mp3",
  "language_code": "spa"
}'

Full Options

infsh app run elevenlabs/stt --input '{
  "audio": "https://conference.mp3",
  "model": "scribe_v2",
  "diarize": true,
  "tag_audio_events": true,
  "language_code": "eng"
}'
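When the option set grows, hand-writing the quoted JSON for `--input` gets error-prone. The following is a minimal Python sketch, not part of the skill itself, that assembles the same command line from keyword arguments. The helper name `build_stt_command` is hypothetical, and the option names are simply the ones used in the examples above; nothing here is validated against the app's actual schema.

```python
import json
import shlex


def build_stt_command(audio_url, model=None, diarize=False,
                      tag_audio_events=False, language_code=None):
    """Return an `infsh app run elevenlabs/stt` command line as a string.

    Hypothetical helper: option names are copied from the examples in this
    document and are not checked against the real app schema.
    """
    payload = {"audio": audio_url}
    if model is not None:
        payload["model"] = model
    if diarize:
        payload["diarize"] = True
    if tag_audio_events:
        payload["tag_audio_events"] = True
    if language_code is not None:
        payload["language_code"] = language_code
    # shlex.quote wraps the JSON so the shell passes it as a single argument.
    return "infsh app run elevenlabs/stt --input " + shlex.quote(json.dumps(payload))


command = build_stt_command(
    "https://conference.mp3",
    model="scribe_v2",
    diarize=True,
    tag_audio_events=True,
    language_code="eng",
)
print(command)
```

Options left at their defaults are omitted from the payload, so the app's own defaults apply.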

Forced Alignment

infsh app run elevenlabs/forced-alignment --input '{
  "audio": "https://narration.mp3",
  "text": "This is the exact text spoken in the audio file."
}'

Output Format

{
  "words": [
    {"text": "This", "start": 0.0, "end": 0.3},
    {"text": "is", "start": 0.35, "end": 0.5},
    {"text": "the", "start": 0.55, "end": 0.65}
  ],
  "text": "This is the exact text spoken in the audio file."
}
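The per-word `start` and `end` times in the JSON above are enough to build subtitle cues. As a rough sketch, assuming the output has exactly the shape shown (a `words` list with `text`, `start`, and `end` in seconds), the Python below groups words into SRT cues. The function names and the `max_words` grouping rule are illustrative choices, not something the skill provides.

```python
import json


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group word timings into numbered SRT cues of at most max_words words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])   # first word opens the cue
        end = srt_timestamp(chunk[-1]["end"])      # last word closes it
        text = " ".join(w["text"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Sample data copied from the output shown above.
alignment = json.loads("""
{
  "words": [
    {"text": "This", "start": 0.0, "end": 0.3},
    {"text": "is", "start": 0.35, "end": 0.5},
    {"text": "the", "start": 0.55, "end": 0.65}
  ],
  "text": "This is the exact text spoken in the audio file."
}
""")

print(words_to_srt(alignment["words"]))
```

Grouping by a fixed word count keeps the sketch short; a real captioning pass would more likely break cues on pauses or punctuation.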

Workflow: Video Subtitles

# 1. Transcribe video audio
infsh app run elevenlabs/stt --input '{
  "audio": "https://video.mp4",
  "diarize": true
}' > transcript.json

# 2. Use transcript for captions
infsh app run infsh/caption-videos --input '{
  "video_url": "https://video.mp4",
  "captions": "<transcript-from-step-1>"
}'

Related Skills

# ElevenLabs TTS (reverse direction)
npx skills add inference-sh/skills@elevenlabs-tts

# ElevenLabs dubbing (translate audio)
npx skills add inference-sh/skills@elevenlabs-dubbing

# Other STT models (Whisper)
npx skills add inference-sh/skills@speech-to-text

# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@infsh-cli

Available Models

Model	ID	Best For
Scribe v2	`scribe_v2`	Latest, highest accuracy (default)
Scribe v1	`scribe_v1`	Stable, proven

ElevenLabs Speech-to-Text

Quick Start

Available Models

Examples

Basic Transcription

With Speaker Identification

Audio Event Tagging

Specify Language

Full Options

Forced Alignment

Output Format

Forced Alignment Use Cases

Workflow: Video Subtitles

Supported Languages

Use Cases
