Generate subtitles with automatic time alignment using the Volcengine ATA API. Use when the user wants to: (1) add time-aligned subtitles to videos, (2) convert text + audio to SRT/ASS format, or (3) automate a subtitle creation workflow.
Generate subtitles with automatic time alignment using Volcengine's ATA (Automatic Time Alignment) API.
Set the following environment variables or create a config file:
export VOLC_ATA_APP_ID="your-app-id"
export VOLC_ATA_TOKEN="your-access-token"
export VOLC_ATA_API_BASE="https://openspeech.bytedance.com"
Create ~/.volcengine_ata.conf:
[credentials]
appid = your-app-id
access_token = your-access-token
secret_key = your-secret-key
[api]
base_url = https://openspeech.bytedance.com
submit_path = /api/v1/vc/ata/submit
query_path = /api/v1/vc/ata/query
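As an illustration of how the two configuration sources above could be combined, here is a minimal Python sketch. It is not taken from `volc_ata.py`: the function name `load_ata_config` is hypothetical, and the precedence (environment variables override the config file) is an assumption, since the text does not say which source wins.

```python
import configparser
import os
from pathlib import Path


def load_ata_config(env=None, conf_path=None):
    """Merge ATA settings from the config file and environment variables.

    Assumption: environment variables take precedence over the file.
    """
    env = os.environ if env is None else env
    conf_path = Path(conf_path) if conf_path else Path.home() / ".volcengine_ata.conf"

    parser = configparser.ConfigParser()
    parser.read(conf_path)  # a missing file is skipped without raising

    def from_file(section, key, default=None):
        # Returns `default` when the section or the key is absent.
        return parser.get(section, key, fallback=default)

    return {
        "appid": env.get("VOLC_ATA_APP_ID") or from_file("credentials", "appid"),
        "access_token": env.get("VOLC_ATA_TOKEN") or from_file("credentials", "access_token"),
        "base_url": env.get("VOLC_ATA_API_BASE")
        or from_file("api", "base_url", "https://openspeech.bytedance.com"),
        "submit_path": from_file("api", "submit_path", "/api/v1/vc/ata/submit"),
        "query_path": from_file("api", "query_path", "/api/v1/vc/ata/query"),
    }
```

Calling `load_ata_config()` with no arguments reads the real environment and `~/.volcengine_ata.conf`; passing `env` and `conf_path` explicitly makes the function easy to test.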
A Python CLI tool is provided at ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py.
# Basic usage: audio + text → SRT subtitle
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
--audio storage/audio.wav \
--text storage/subtitle.txt \
--output storage/subtitles/final.srt
# Specify output format (srt or ass)
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
--audio storage/audio.wav \
--text storage/subtitle.txt \
--output storage/subtitles/final.ass \
--format ass
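To make the SRT output concrete, the sketch below shows how aligned cues could be rendered as SRT text. It is an assumption-laden illustration, not the code in `volc_ata.py`: the helper names `srt_timestamp` and `to_srt` are hypothetical, and the `(start_ms, end_ms, text)` tuples stand in for whatever timing data the ATA API actually returns, which this document does not describe.

```python
def srt_timestamp(ms):
    """Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(int(ms), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues):
    """Render (start_ms, end_ms, text) tuples as SRT, numbering cues from 1."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    print(to_srt([(0, 1500, "first line"), (1500, 3000, "second line")]))
```

SRT uses a comma before the milliseconds, whereas ASS uses a different timestamp layout, so an ASS writer would need its own formatter.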
Audio: WAV (pcm_s16le)
Extract from video:
ffmpeg -i input.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav
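Before submitting audio, it can help to confirm that the extracted file has the parameters the ffmpeg command above produces. The following sketch uses Python's standard `wave` module; the function name `check_wav` is hypothetical, and treating 16 kHz, mono, 16-bit as the expected values is an assumption inferred from the ffmpeg flags rather than a documented API requirement.

```python
import wave


def check_wav(path, rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches against the expected WAV parameters.

    Defaults mirror `-ar 16000 -ac 1 -acodec pcm_s16le` (2 bytes per sample).
    An empty list means the file matches.
    """
    with wave.open(path, "rb") as wav:
        actual = {
            "sample rate": (wav.getframerate(), rate),
            "channels": (wav.getnchannels(), channels),
            "sample width (bytes)": (wav.getsampwidth(), sample_width),
        }
    return [
        f"{name}: got {got}, expected {want}"
        for name, (got, want) in actual.items()
        if got != want
    ]
```

For example, `check_wav("audio.wav")` returns an empty list for a file produced by the command above, and a list of human-readable mismatches otherwise.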
Example:
主人闹钟没响睡过头了
我们俩轮流用鼻子拱他脸
他以为地震了抱着枕头就跑
1