4 specialized modes in one model
Generate AI videos with LTX-2 19B via API in One. 4 modes: text-to-video, image-to-video, lipsync, and control. Up to 1080p, 5–20s. 20–512 credits per video.
These capabilities come from the current model config and backend route. Unreleased APIs are not listed.
4 specialized modes in one model
Lipsync from audio input
Video control with pose/depth/canny
Up to 1080p and 20 seconds
Clients call same-origin API routes; the server-side BFF (backend-for-frontend) forwards each request to the matching Worker.
POST /api/v1/videos/generations · 24 credits
LTX-2 19B: advanced AI video generation with text-to-video, image-to-video, lipsync, and control modes. Up to 1080p, 5–20 seconds.
prompt (string, optional): Text description (required for text-to-video and image-to-video)
type (string, optional): Mode: "text-to-video", "image-to-video", "lipsync", or "control"
image (string, optional): Image URL for image-to-video mode
audio (string, optional): Audio URL for lipsync mode
audio_duration (number, optional): Audio duration in seconds for lipsync (5–20)
video (string, optional): Video URL for control mode
video_duration (number, optional): Video duration in seconds for control mode (5–20)
mode (string, optional): Control mode: "pose", "depth", or "canny"
audio_mode (string, optional): Audio mode for control: "preserve", "generate", or "none"
resolution (string, optional, default 720p): Output resolution: "480p", "720p", or "1080p"
aspect_ratio (string, optional, default 16:9): Aspect ratio: "16:9" or "9:16" (text-to-video only)
duration (number, optional, default 5): Video duration in seconds (5–20)
seed (number, optional, default -1): Random seed for reproducibility (-1 for random)
{
  "endpoint": "/api/v1/videos/generations",
  "headers": {
    "Authorization": "Bearer <API_KEY>",
    "Content-Type": "application/json"
  },
  "body": {
    "model": "ltx-2",
    "prompt": "A cat playing piano in a jazz bar, cinematic lighting",
    "resolution": "720p",
    "aspect_ratio": "16:9",
    "duration": 5
  }
}
After creating image, video, audio, or tool tasks, poll the real task endpoint for results.
GET /api/v1/tasks/{task_id}
Check generation task status and result.
20–512 credits per video (~$0.20–$5.12)
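The create-then-poll flow described here can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions, not the official client: the page does not document the response schema, so the "status" field and its terminal values are hypothetical, as are the helper names, the base_url argument, and the use of the same Bearer header on the GET route.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumption: task objects carry a "status" field with these terminal values.
# The page does not document the response schema; adjust to the real one.
TERMINAL_STATUSES = {"succeeded", "failed"}


def build_generation_request(base_url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a new video generation."""
    return urllib.request.Request(
        f"{base_url}/api/v1/videos/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_task(base_url: str, api_key: str, task_id: str) -> dict:
    """GET the task endpoint once and decode the JSON response."""
    # Assumption: the task route accepts the same Bearer token as the POST route.
    request = urllib.request.Request(
        f"{base_url}/api/v1/tasks/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_task(
    get_task: Callable[[str], dict],
    task_id: str,
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call get_task(task_id) until the task reaches a terminal status."""
    for _ in range(max_attempts):
        task = get_task(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

In use, get_task would typically be a small wrapper such as lambda tid: fetch_task(base_url, api_key, tid). Passing the fetcher and the sleep function in as arguments keeps the polling loop testable without network access.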
These workflows are supported by the current model and backend node.
Generate high-quality videos from text prompts
Animate images into video with image-to-video mode
Create lipsync videos by combining audio with AI video
Apply pose/depth/edge control to existing video
Credits depend on mode, resolution, and duration. Text/image-to-video: 20–76 credits. Lipsync: 32–256 credits. Control: 64–512 credits.
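As a rough budgeting aid, the per-mode ranges above can be turned into a lookup. The sketch below copies the ranges from this page; the dollar rate is an assumption inferred from the page's "20–512 credits (~$0.20–$5.12)" figure, and the function name is hypothetical. The exact price within a range depends on resolution and duration, which the page does not break down, so only the bounds are modeled.

```python
# Credit ranges per mode, copied from the pricing text on this page.
CREDIT_RANGES = {
    "text-to-video": (20, 76),
    "image-to-video": (20, 76),
    "lipsync": (32, 256),
    "control": (64, 512),
}

# Assumption: 1 credit is about $0.01, inferred from "20–512 credits (~$0.20–$5.12)".
USD_PER_CREDIT = 0.01


def credit_range_usd(generation_type: str) -> tuple[float, float]:
    """Return the approximate (min, max) cost in USD for one video of this type."""
    low, high = CREDIT_RANGES[generation_type]
    return (round(low * USD_PER_CREDIT, 2), round(high * USD_PER_CREDIT, 2))
```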
Four modes: text-to-video, image-to-video, lipsync (audio+video), and control (pose/depth/canny). Select one with the "type" parameter; the separate "mode" parameter only chooses the control signal.
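A request body can be checked client-side before it is sent. The sketch below takes the allowed values and the 5–20 second range from the parameter list on this page. Two parts are assumptions rather than documented behavior: the per-type required inputs (only the prompt requirement is stated outright) and text-to-video as the default when "type" is omitted (suggested by the example request, which has no "type"). The function name is hypothetical.

```python
# Allowed values, copied from the parameter descriptions on this page.
ALLOWED_VALUES = {
    "type": {"text-to-video", "image-to-video", "lipsync", "control"},
    "resolution": {"480p", "720p", "1080p"},
    "aspect_ratio": {"16:9", "9:16"},
    "mode": {"pose", "depth", "canny"},
    "audio_mode": {"preserve", "generate", "none"},
}
DURATION_FIELDS = ("duration", "audio_duration", "video_duration")

# Assumption: inputs each type needs, inferred from the parameter descriptions.
REQUIRED_BY_TYPE = {
    "text-to-video": ("prompt",),
    "image-to-video": ("prompt", "image"),
    "lipsync": ("audio",),
    "control": ("video",),
}


def validate_body(body: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    for field, allowed in ALLOWED_VALUES.items():
        if field in body and body[field] not in allowed:
            errors.append(f"{field}: {body[field]!r} is not one of {sorted(allowed)}")
    for field in DURATION_FIELDS:
        if field in body:
            value = body[field]
            if not (isinstance(value, (int, float)) and 5 <= value <= 20):
                errors.append(f"{field}: must be a number between 5 and 20 seconds")
    # Assumption: a request without "type" is treated as text-to-video.
    generation_type = body.get("type", "text-to-video")
    for field in REQUIRED_BY_TYPE.get(generation_type, ()):
        if not body.get(field):
            errors.append(f"{field}: required for {generation_type}")
    return errors
```

For example, a control request with "mode": "edges", "duration": 30, and no "video" URL yields three messages, one per problem, while the example request body from this page passes with an empty list.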
Supported output resolutions are 480p, 720p, and 1080p. Higher resolutions cost more credits.