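Video editing using the Volcengine Track structure. Supports cutting, trimming, adding text, stickers, audio, filters, effects, and transitions, as well as multi-clip composition, speed adjustment, and watermark removal. Keywords: video editing, crop video, add text, add watermark, add audio, video filter, video effects, video transitions, multi-clip stitching, adjust speed, remove watermark.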
Performs video editing using the Volcengine cloud editing API with native Track structure.
ALL PosX/PosY values in this API are TOP-LEFT corner coordinates, NOT center point!
This applies to ALL filters: transform, crop, delogo, etc.
For full-screen video display, you MUST use:
{ "Type": "transform", "PosX": 0, "PosY": 0, "Width": <canvas_width>, "Height": <canvas_height> }
NEVER use canvas center coordinates (like 640, 360) for PosX/PosY!
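A minimal Python sketch of the rule above, assuming transform filters are built as plain dictionaries before being serialized. The helper names `full_screen_transform` and `covers_canvas` are hypothetical and not part of the Volcengine API; they only illustrate why center coordinates fail to cover the canvas.

```python
def full_screen_transform(canvas_width, canvas_height):
    """Build a transform filter that makes an element cover the whole canvas."""
    return {
        "Type": "transform",
        "PosX": 0,
        "PosY": 0,
        "Width": canvas_width,
        "Height": canvas_height,
    }


def covers_canvas(transform, canvas_width, canvas_height):
    """Return True if the unrotated element rectangle covers the full canvas."""
    left, top = transform["PosX"], transform["PosY"]
    right = left + transform["Width"]
    bottom = top + transform["Height"]
    return left <= 0 and top <= 0 and right >= canvas_width and bottom >= canvas_height


good = full_screen_transform(1280, 720)
# Common mistake: the canvas center (640, 360) used as if it were the top-left corner.
bad = {**good, "PosX": 640, "PosY": 360}

print(covers_canvas(good, 1280, 720))  # True
print(covers_canvas(bad, 1280, 720))   # False
```

With the mistaken values the element is shifted right and down by half the canvas, so only its top-left quadrant stays visible.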
Use this skill when:
ALWAYS call uploadAndGetVid first to obtain the video ID (vid://xxx).
If you only need metadata without editing (e.g., for display purposes), use probeVideoMetadata instead.
Construct your EditParam following the structure below.
Call submitDirectEditTask with Canvas, Output (optional), and Track.
Use getVideoEditTaskStatus to check progress.
On completion, return the output video URL to the user.
{
"Canvas": { "Width": 1920, "Height": 1080 },
"Output": { "Fps": 30, "Codec": { "VideoCodec": "h264" } },
"Track": [
[{ "Type": "video", "Source": "vid://xxx", "TargetTime": [0, 5000] }],
[{ "Type": "text", "Text": "Hello", "TargetTime": [0, 5000], "Extra": [] }]
]
}
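The payload above could be assembled programmatically. The following is a sketch in Python, assuming the result is passed to submitDirectEditTask as JSON. The helper name `build_edit_param` is hypothetical and not part of the Volcengine API; it only mirrors the Canvas, Output (optional), and Track fields described in this document.

```python
import json


def build_edit_param(canvas_width, canvas_height, track, output=None):
    """Assemble an EditParam dictionary; Output is included only when given."""
    param = {
        "Canvas": {"Width": canvas_width, "Height": canvas_height},
        "Track": track,
    }
    if output is not None:
        param["Output"] = output
    return param


# Two layers: a video on the bottom layer and a text element above it.
track = [
    [{"Type": "video", "Source": "vid://xxx", "TargetTime": [0, 5000]}],
    [{"Type": "text", "Text": "Hello", "TargetTime": [0, 5000], "Extra": []}],
]
param = build_edit_param(
    1920, 1080, track, output={"Fps": 30, "Codec": {"VideoCodec": "h264"}}
)
print(json.dumps(param, indent=2))
```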
Canvas (optional):
**Time unit**: All time values are in **MILLISECONDS** (1 second = 1000 ms)
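Because every time value is in milliseconds, converting from seconds in one place helps avoid unit mix-ups. A small sketch follows; the helper names `to_ms` and `target_time` are hypothetical.

```python
def to_ms(seconds):
    """Convert seconds to whole milliseconds."""
    return round(seconds * 1000)


def target_time(start_seconds, end_seconds):
    """Build a TargetTime pair [start_ms, end_ms] from values given in seconds."""
    if end_seconds <= start_seconds:
        raise ValueError("end must be greater than start")
    return [to_ms(start_seconds), to_ms(end_seconds)]


print(target_time(0, 5))     # [0, 5000]
print(target_time(1.5, 12))  # [1500, 12000]
```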
## Track Structure
Track is a **2D array** `Track[layerIndex][elementIndex]`:
- Outer array: Rendering layers (higher index = higher layer, renders on top)
- Inner array: Elements on the same track
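To make the layer ordering concrete, the sketch below walks a Track and lists which elements are visible at a given time, from the bottom layer to the top. It assumes that TargetTime is a half-open interval `[start, end)`, which this document does not state; the function name `active_elements` is hypothetical.

```python
track = [
    # Layer 0 (bottom): the base video.
    [{"Type": "video", "Source": "vid://xxx", "TargetTime": [0, 5000]}],
    # Layer 1 (renders on top of layer 0): a text overlay.
    [{"Type": "text", "Text": "Hello", "TargetTime": [1000, 4000]}],
]


def active_elements(track, time_ms):
    """Return (layer_index, element_type) pairs active at time_ms, bottom layer first."""
    active = []
    for layer_index, layer in enumerate(track):
        for element in layer:
            start, end = element["TargetTime"]
            if start <= time_ms < end:  # assumption: end time is exclusive
                active.append((layer_index, element["Type"]))
    return active


print(active_elements(track, 2000))  # [(0, 'video'), (1, 'text')]
print(active_elements(track, 4500))  # [(0, 'video')]
```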
## Element Types
### video
```json
{
  "Type": "video",
  "Source": "vid://your_vid",
  "TargetTime": [0, 5000],
  "Extra": [
    { "Type": "trim", "StartTime": 10000, "EndTime": 30000 }
  ]
}
```
### audio
{
"Type": "audio",
"Source": "vid://your_audio_vid",
"TargetTime": [0, 10000],
"Extra": [
{ "Type": "a_volume", "Volume": 0.5 }
]
}
When to use audio elements:
When NOT to use audio elements:
### image
{
"Type": "image",
"Source": "mid://your_image_mid",
"TargetTime": [0, 3000],
"Extra": [
{ "Type": "transform", "PosX": 100, "PosY": 100, "Width": 200, "Height": 200 }
]
}
### text
{
"Type": "text",
"Text": "Your Text Here",
"TargetTime": [0, 5000],
"FontType": "SY_Black",
"FontSize": 120,
"FontColor": "#FFFFFFFF",
"AlignType": 1,
"Extra": [
{ "Type": "transform", "PosX": 60, "PosY": 440, "Width": 1800, "Height": 200 }
]
}
### subtitle
{
"Type": "subtitle",
"Text": "http://example.com/subtitle.srt",
"TargetTime": [0, 20000],
"FontType": "ALi_PuHui",
"FontSize": 60,
"Extra": [
{ "Type": "transform", "PosX": 40, "PosY": 550, "Width": 1200, "Height": 150 }
]
}
Controls element position, size, rotation, and opacity on the canvas.
IMPORTANT:
PosX and PosY are the TOP-LEFT corner coordinates of the element, relative to the canvas top-left corner.
Parameters:
- PosX: X coordinate of the element's top-left corner (pixels)
- PosY: Y coordinate of the element's top-left corner (pixels)
- Width: Element width on canvas (pixels)
- Height: Element height on canvas (pixels)
- Rotation: Rotation angle [-360, 360], clockwise is positive (optional)
- Alpha: Opacity [0, 1], 0 is transparent (optional)

Basic Example (fill canvas):
{ "Type": "transform", "PosX": 0, "PosY": 0, "Width": 1920, "Height": 1080 }
Rotation Example (rotate 1280x720 video 90° to fit 720x1280 canvas):
When rotating, the element rotates around its center. To fill a 720x1280 canvas with a 1280x720 video rotated 90°:
{ "Type": "transform", "PosX": -280, "PosY": 280, "Width": 1280, "Height": 720, "Rotation": 90 }
Centering Formula:
- PosX = (CanvasWidth - ElementWidth) / 2
- PosY = (CanvasHeight - ElementHeight) / 2

Example: Center a 600x60 subtitle on a 720x1280 canvas:
{ "Type": "transform", "PosX": 60, "PosY": 610, "Width": 600, "Height": 60 }
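The centering formula can be wrapped in a small helper. This is a sketch with a hypothetical function name, `centered_transform`; it uses integer division so the result stays in whole pixels, which is an assumption about what the API expects.

```python
def centered_transform(canvas_width, canvas_height, width, height):
    """Build a transform filter that centers a width x height element on the canvas."""
    return {
        "Type": "transform",
        # Top-left corner that puts the element's center on the canvas center.
        "PosX": (canvas_width - width) // 2,
        "PosY": (canvas_height - height) // 2,
        "Width": width,
        "Height": height,
    }


subtitle = centered_transform(720, 1280, 600, 60)
print(subtitle["PosX"], subtitle["PosY"])  # 60 610
```

Negative results are valid: an element wider than the canvas gets a negative PosX and is cropped equally on both sides.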
Common Mistakes:
{ "Type": "trim", "StartTime": 5000, "EndTime": 15000 }
{ "Type": "crop", "PosX": 100, "PosY": 100, "Width": 800, "Height": 600 }
{ "Type": "speed", "Speed": 2.0 }
{ "Type": "transition", "Source": "1182376", "Duration": 1000 }
{ "Type": "lut_filter", "TargetTime": [0, 5000], "Source": "1184003", "Intensity": 0.8 }
{ "Type": "video_animation", "AnimRes": "1180337", "AnimStartTime": 0, "AnimEndTime": 1000 }
{ "Type": "a_volume", "Volume": 0.5 }
{ "Type": "a_fade", "FadeIn": 2000, "FadeOut": 2000 }
{ "Type": "delogo", "TargetTime": [0, 10000], "PosX": 1700, "PosY": 50, "Width": 200, "Height": 80, "Sigma": 30, "Radius": 30 }
LUT filters (lut_filter Source):

| ID | Name | Description |
|---|---|---|
| 1184003 | Clear | Enhanced clarity |
| 1184004 | Afternoon | Warm afternoon tone |
| 1183995 | Vintage | Retro film look |
| 1183993 | Friends | Friends TV show style |

Transitions (transition Source):

| ID | Name | Description |
|---|---|---|
| 1182376 | CircleOpen | Circle wipe open |
| 1182360 | RotateZoom | Rotate and zoom |
| 1182370 | DoorOpen | Door opening reveal |
| 1182379 | ClockWipe | Clock sweep wipe |

Video animations (video_animation AnimRes):

| ID | Name | Type |
|---|---|---|
| 1180337 | FadeIn | In |
| 1180382 | FadeOut | Out |
| 1180335 | Shrink | In |
| 1180338 | ZoomIn | In |

Fonts (FontType):

| ID | Name |
|---|---|
| SY_Black | Source Han Sans Black |
| ALi_PuHui | Alibaba PuHuiTi |
| PM_ZhengDao | Pangmen Zhengdao Title |

{
"Canvas": { "Width": 1920, "Height": 1080 },
"Track": [[
{
"Type": "video",
"Source": "vid://video1",
"TargetTime": [0, 5000],
"Extra": [
{ "Type": "transform", "PosX": 0, "PosY": 0, "Width": 1920, "Height": 1080 }
]
},
{
"Type": "video",
"Source": "vid://video2",
"TargetTime": [5000, 12000],
"Extra": [
{ "Type": "transform", "PosX": 0, "PosY": 0, "Width": 1920, "Height": 1080 }
]
},
{
"Type": "video",
"Source": "vid://video3",
"TargetTime": [12000, 20000],
"Extra": [
{ "Type": "transform", "PosX": 0, "PosY": 0, "Width": 1920, "Height": 1080 }
]
}
]]
}
Key points for video concatenation:
- PosX: 0, PosY: 0 is REQUIRED for full-screen display

{
"Canvas": { "Width": 1920, "Height": 1080 },
"Track": [[
{
"Type": "video",
"Source": "vid://video1",
"TargetTime": [0, 5000],
"Extra": [
{ "Type": "transform", "PosX": 0, "PosY": 0, "Width": 1920, "Height": 1080 },
{ "Type": "transition", "Source": "1182376", "Duration": 1000 }
]
},
{
"Type": "video",
"Source": "vid://video2",
"TargetTime": [4000, 9000],
"Extra": [
{ "Type": "transform", "PosX": 0, "PosY": 0, "Width": 1920, "Height": 1080 }
]
}
]]
}
Note: With transitions, TargetTime values overlap (video2 starts at 4000, video1 ends at 5000).
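The TargetTime values for back-to-back clips, with or without a transition overlap, can be derived from the clip durations. The sketch below uses a hypothetical helper, `sequential_target_times`, and assumes the same transition duration between every pair of adjacent clips.

```python
def sequential_target_times(durations_ms, transition_ms=0):
    """Return TargetTime pairs for clips played in order.

    Each clip after the first starts transition_ms before the previous one ends,
    so adjacent clips overlap for the length of the transition.
    """
    if any(transition_ms >= duration for duration in durations_ms):
        raise ValueError("transition must be shorter than every clip")
    times = []
    start = 0
    for duration in durations_ms:
        times.append([start, start + duration])
        start += duration - transition_ms
    return times


# Plain concatenation: no overlap between clips.
print(sequential_target_times([5000, 7000, 8000]))
# [[0, 5000], [5000, 12000], [12000, 20000]]

# Two 5-second clips joined by a 1-second transition.
print(sequential_target_times([5000, 5000], transition_ms=1000))
# [[0, 5000], [4000, 9000]]
```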
{
"Canvas": { "Width": 1920, "Height": 1080 },
"Track": [[
{
"Type": "video",
"Source": "vid://xxx",
"TargetTime": [0, 20000],
"Extra": [
{ "Type": "trim", "StartTime": 10000, "EndTime": 30000 }
]
}
]]
}
{
"Canvas": { "Width": 1920, "Height": 1080 },
"Track": [
[{ "Type": "video", "Source": "vid://xxx", "TargetTime": [0, 10000] }],
[{
"Type": "text",
"Text": "@MyChannel",
"TargetTime": [0, 10000],
"FontType": "SY_Black",
"FontSize": 60,
"FontColor": "#FFFFFFFF",
"Extra": [
{ "Type": "transform", "PosX": 1700, "PosY": 50, "Width": 200, "Height": 60 }
]
}]
]
}
{
"Canvas": { "Width": 1920, "Height": 1080 },
"Track": [
[{ "Type": "video", "Source": "vid://video", "TargetTime": [0, 30000] }],
[{
"Type": "audio",
"Source": "vid://music",
"TargetTime": [0, 30000],
"Extra": [
{ "Type": "a_volume", "Volume": 0.3 }
]
}]
]
}
{
"Canvas": { "Width": 1920, "Height": 1080 },
"Track": [[
{
"Type": "video",
"Source": "vid://xxx",
"TargetTime": [0, 10000],
"Extra": [
{ "Type": "lut_filter", "TargetTime": [0, 10000], "Source": "1183993", "Intensity": 0.8 }
]
}
]]
}
{
"Canvas": { "Width": 1920, "Height": 1080 },
"Track": [[
{
"Type": "video",
"Source": "vid://xxx",
"TargetTime": [0, 10000],
"Extra": [
{ "Type": "delogo", "TargetTime": [0, 10000], "PosX": 1700, "PosY": 50, "Width": 200, "Height": 80, "Sigma": 30, "Radius": 30 }
]
}
]]
}
Convert a 1280x720 (landscape) video to 720x1280 (portrait) canvas:
{
"Canvas": { "Width": 720, "Height": 1280 },
"Track": [[
{
"Type": "video",
"Source": "vid://xxx",
"TargetTime": [0, 10000],
"Extra": [
{
"Type": "transform",
"PosX": -280,
"PosY": 280,
"Width": 1280,
"Height": 720,
"Rotation": 90
}
]
}
]]
}
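Calculation:
- PosX = (720 - 1280) / 2 = -280
- PosY = (1280 - 720) / 2 = 280
- The element rotates around its center, so after the 90° rotation the 1280x720 box covers the full 720x1280 canvas.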
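The offsets used in the rotation example can be checked numerically. This sketch assumes, as stated earlier in this document, that rotation happens around the element's center; the function names `fill_rotated_90` and `rotated_bounds` are hypothetical, and only multiples of 90° are handled.

```python
def fill_rotated_90(canvas_width, canvas_height):
    """Transform that fills the canvas with a source rotated 90 degrees clockwise."""
    # Before rotation the element has the canvas dimensions swapped.
    width, height = canvas_height, canvas_width
    return {
        "Type": "transform",
        "PosX": (canvas_width - width) // 2,
        "PosY": (canvas_height - height) // 2,
        "Width": width,
        "Height": height,
        "Rotation": 90,
    }


def rotated_bounds(transform):
    """Axis-aligned box (left, top, right, bottom) covered after rotating about the center."""
    rotation = transform.get("Rotation", 0)
    if rotation % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported in this sketch")
    center_x = transform["PosX"] + transform["Width"] / 2
    center_y = transform["PosY"] + transform["Height"] / 2
    width, height = transform["Width"], transform["Height"]
    if rotation % 180 == 90:
        width, height = height, width  # a quarter turn swaps the two sides
    return (center_x - width / 2, center_y - height / 2,
            center_x + width / 2, center_y + height / 2)


portrait = fill_rotated_90(720, 1280)
print(portrait["PosX"], portrait["PosY"])  # -280 280
print(rotated_bounds(portrait))            # (0.0, 0.0, 720.0, 1280.0)
```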