Skill File

Ltx Video

Name: Ltx Video
Author: LeoYeAI

Generate videos via LTX-2.3 API (ltx.video). Supports text-to-video, image-to-video, audio-to-video (lip-sync from audio + image), extend, and retake. Use when: generating AI video from text/image/audio, animating a portrait, creating lip-sync video from an existing image + audio recording.

LeoYeAI · 1,891 stars · March 9, 2026

Occupation
Category: Media

Skill Content

LTX-2.3 Video API

API Reference

Base URL: https://api.ltx.video/v1
Auth: Authorization: Bearer <API_KEY>
Response: MP4 binary (direct download, no polling)

Endpoints

Endpoint	Input	Use
`/v1/text-to-video`	prompt	Generate video from text
`/v1/image-to-video`	image_uri + prompt	Animate a still image
`/v1/audio-to-video`	audio_uri + image_uri + prompt	Lip-sync video from audio + image
`/v1/extend`	video_uri + prompt	Extend a video at start or end
`/v1/retake`	video_uri + time range
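
The table above can be turned into a small pre-flight check that catches missing inputs before a request is sent. This is a minimal sketch: the field names come from the table, while `REQUIRED_FIELDS`, `missing_fields`, and `endpoint_url` are hypothetical helper names. `/v1/retake` is left out because the table does not name its time-range fields.

```python
# Required JSON fields per endpoint, copied from the endpoint table.
# /v1/retake is omitted: its time-range field names are not given in the source.
REQUIRED_FIELDS = {
    "text-to-video": ("prompt",),
    "image-to-video": ("image_uri", "prompt"),
    "audio-to-video": ("audio_uri", "image_uri", "prompt"),
    "extend": ("video_uri", "prompt"),
}

BASE_URL = "https://api.ltx.video/v1"


def missing_fields(endpoint, payload):
    """Return the required fields that are absent or empty in payload."""
    return [name for name in REQUIRED_FIELDS[endpoint] if not payload.get(name)]


def endpoint_url(endpoint):
    """Join the base URL and an endpoint name such as 'extend'."""
    return f"{BASE_URL}/{endpoint}"
```

For example, `missing_fields("audio-to-video", {"prompt": "A man speaks"})` reports that `audio_uri` and `image_uri` still need to be supplied.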

curl -X POST "https://api.ltx.video/v1/text-to-video" \
  -H "Authorization: Bearer $LTX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A man in a navy blue suit sits at a luxury restaurant table...",
    "model": "ltx-2-3-pro",
    "duration": 8,
    "resolution": "1920x1080"
  }' -o output.mp4
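
The same text-to-video call can be sketched in Python with only the standard library. The URL, headers, and JSON fields mirror the curl command above; the function names `build_text_to_video_request` and `download` are hypothetical, and the sketch assumes the response body is the MP4 itself, as stated in the API reference.

```python
import json
import shutil
import urllib.request


def build_text_to_video_request(prompt, api_key, model="ltx-2-3-pro",
                                duration=8, resolution="1920x1080"):
    """Build (but do not send) the POST request for /v1/text-to-video."""
    body = json.dumps({
        "prompt": prompt,
        "model": model,
        "duration": duration,
        "resolution": resolution,
    }).encode("utf-8")
    return urllib.request.Request(
        "https://api.ltx.video/v1/text-to-video",
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def download(request, output_path="output.mp4", timeout=300):
    """Send the request and copy the binary response body to output_path."""
    with urllib.request.urlopen(request, timeout=timeout) as resp, \
            open(output_path, "wb") as f:
        shutil.copyfileobj(resp, f)
    return output_path
```

Separating request construction from the download step makes it possible to inspect or log the request without spending API credits.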

curl -X POST "https://api.ltx.video/v1/audio-to-video" \
  -H "Authorization: Bearer $LTX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_uri": "https://example.com/voice.mp3",
    "image_uri": "https://example.com/portrait.jpg",
    "prompt": "A man speaks directly to camera...",
    "model": "ltx-2-3-pro",
    "resolution": "1920x1080"
  }' -o output.mp4

import requests

def ltx_audio_to_video(audio_url, image_url, prompt, api_key,
                        model="ltx-2-3-pro", resolution="1920x1080",
                        output_path="output.mp4"):
    r = requests.post(
        "https://api.ltx.video/v1/audio-to-video",
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        json={"audio_uri": audio_url, "image_uri": image_url,
              "prompt": prompt, "model": model, "resolution": resolution},
        timeout=300, stream=True
    )
    if r.status_code != 200:
        raise RuntimeError(f"LTX error {r.status_code}: {r.text}")
    # Stream the MP4 body to disk in 8 KiB chunks
    with open(output_path, "wb") as f:
        for chunk in r.iter_content(8192):
            f.write(chunk)
    return output_path

# Upload MP3 to uguu.se (voice.mp3 is a placeholder for your local file)
AUDIO_URL=$(curl -s -F "files[]=@voice.mp3" "https://uguu.se/upload" | \
  python3 -c "import sys,json; print(json.load(sys.stdin)['files'][0]['url'])")

# Upload image (portrait.jpg is a placeholder for your local file)
IMAGE_URL=$(curl -s -F "files[]=@portrait.jpg" "https://uguu.se/upload" | \
  python3 -c "import sys,json; print(json.load(sys.stdin)['files'][0]['url'])")
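
The inline `python3 -c` one-liner fails with a bare traceback when the upload is rejected. A slightly more defensive parser might look like the sketch below. It assumes only the response shape the one-liner already relies on (`files[0].url`); the function name `first_uploaded_url` is hypothetical.

```python
import json


def first_uploaded_url(response_text):
    """Extract files[0].url from an upload response, or raise ValueError."""
    data = json.loads(response_text)
    files = data.get("files") if isinstance(data, dict) else None
    if not files or "url" not in files[0]:
        raise ValueError("upload response has no files[0].url")
    return files[0]["url"]
```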

# Resize large images before upload
ffmpeg -y -i input_4k.png -vf "scale=1920:1080" output_1080.jpg
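
The fixed `scale=1920:1080` filter stretches any input that is not 16:9. One way to avoid that is to compute target dimensions that fit inside the box while keeping the aspect ratio. This is a sketch only: the 1920x1080 bounding box is taken from the command above, not from a documented API limit, and `fit_within` and `ffmpeg_resize_args` are hypothetical names.

```python
def fit_within(width, height, max_w=1920, max_h=1080):
    """Shrink (never enlarge) width x height to fit the box, keeping aspect ratio."""
    if width <= max_w and height <= max_h:
        return width, height  # already small enough, leave untouched
    # Integer cross-multiplication avoids floating-point rounding surprises.
    if width * max_h >= height * max_w:
        new_w, new_h = max_w, height * max_w // width   # width is the limit
    else:
        new_w, new_h = width * max_h // height, max_h   # height is the limit
    # Round scaled sizes down to even numbers, which many video encoders expect.
    return new_w - new_w % 2, new_h - new_h % 2


def ffmpeg_resize_args(src, dst, width, height):
    """Build an ffmpeg argument list that resizes src to the fitted size."""
    w, h = fit_within(width, height)
    return ["ffmpeg", "-y", "-i", src, "-vf", f"scale={w}:{h}", dst]
```

The returned list can be passed to `subprocess.run`. For portrait (9:16) sources, calling `fit_within(w, h, max_w=1080, max_h=1920)` swaps the bounding box.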

A [description of person] sits/stands [location]. He/she speaks directly 
to camera, lips moving in perfect sync with his/her voice. [Gesture details]. 
Head stays level and gaze remains locked on camera throughout. 
[Environment description softly blurred in background]. 
[Lighting]. [Camera: holds steady at eye level, front-on].
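
The template above can be filled programmatically when generating many lip-sync clips. The sketch below keeps the fixed sentences from the template verbatim and substitutes the bracketed slots. The function name `lip_sync_prompt` and its parameter names are hypothetical.

```python
LIP_SYNC_TEMPLATE = (
    "{person} {pose} {location}. {pronoun} speaks directly to camera, "
    "lips moving in perfect sync with {possessive} voice. {gesture}. "
    "Head stays level and gaze remains locked on camera throughout. "
    "{environment} softly blurred in background. "
    "{lighting}. Camera holds steady at eye level, front-on."
)

POSSESSIVES = {"He": "his", "She": "her"}


def lip_sync_prompt(person, pose, location, gesture, environment, lighting,
                    pronoun="He"):
    """Fill the lip-sync template; pronoun must be 'He' or 'She'."""
    return LIP_SYNC_TEMPLATE.format(
        person=person, pose=pose, location=location,
        pronoun=pronoun, possessive=POSSESSIVES[pronoun],
        gesture=gesture, environment=environment, lighting=lighting,
    )
```

Slot values should be passed without a trailing period, since the template supplies the sentence punctuation.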

cd ComfyUI/custom_nodes
git clone https://github.com/PauldeLavallaz/comfyui-ltx-node

ltxv_RfSU5hdKJb_g5dwbECZWnilE1P8dJzbavz6niP_0LQJ942ARHIVhrBCfebcytEL1efLVx_63S_PJyWTzicrBcWEkOXfCbGTl8JSzlJJk329MwRViEgOoE2KnE9LIA5t6QSFeBy7DLnTIcX0AZNbV9Jv0TuC7qcq2gV33G6ROhUVUDCuN

Model	Speed	Quality
`ltx-2-3-fast`	~17s	Good (use for tests)
`ltx-2-3-pro`	~30-60s	Best (use for final)
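
The table's guidance (fast for tests, pro for final renders) can be encoded so that scripts switch models with a single flag. This is a minimal sketch; the stage names `"test"` and `"final"` and the function `pick_model` are hypothetical.

```python
# Model IDs from the table: fast for quick iterations, pro for final output.
MODEL_BY_STAGE = {
    "test": "ltx-2-3-fast",
    "final": "ltx-2-3-pro",
}


def pick_model(stage):
    """Return the model ID for a stage, rejecting unknown stage names."""
    try:
        return MODEL_BY_STAGE[stage]
    except KeyError:
        raise ValueError(f"unknown stage {stage!r}; use 'test' or 'final'") from None
```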

Ltx Video

LTX-2.3 Video API

API Reference

Endpoints

Models

Supported Resolutions

Quick Examples

Text to Video

Audio to Video (Lip-sync)

Python Wrapper

⚠️ Critical Rules (learned from experience)

File Hosting

Image Size Limit

Face Consistency

Last Frame

Prompting Guide (LTX-2.3)

1. Use Verbs, Not Nouns

2. Block the Scene Like a Director

3. Describe Audio Explicitly (for text-to-video)

4. Avoid Static Photo-Like Prompts

5. Describe Texture and Material

6. Portrait (9:16) Native

7. Complex Shots Work Now

Lip-Sync Prompt Template

ComfyUI Node

API Key