Runflow

alibaba/happy-horse/text-to-video

Generate 1080p video with synchronized native audio from a text prompt. Aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4. Duration: 3–15s.

$0.14/second 2026-04-28

Pricing

Criteria                                        Price   Unit                  What $1 buys
Per second of generated video (720p baseline)   $0.14   per second of video   7s of video
1080p                                           $0.28   per second of video   3s of video
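
Because billing is per second, the cost of a clip is the per-second rate multiplied by its duration. A minimal sketch in Python of that arithmetic, assuming only the two rates and the 3-15s duration range listed on this page; the function and constant names are illustrative and are not part of any Runflow API:

```python
from decimal import Decimal

# Per-second rates in USD, as listed on this page.
RATE_PER_SECOND = {"720p": Decimal("0.14"), "1080p": Decimal("0.28")}


def estimate_cost(duration_s: int, resolution: str = "720p") -> Decimal:
    """Return the estimated cost in USD for one clip."""
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution}")
    if not 3 <= duration_s <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return RATE_PER_SECOND[resolution] * duration_s


def whole_seconds_per_dollar(resolution: str) -> int:
    """Whole seconds of video that one dollar buys, rounded down."""
    return int(Decimal("1") // RATE_PER_SECOND[resolution])
```

Under these assumptions, a 10-second clip comes to $1.40 at 720p and $2.80 at 1080p, and one dollar buys 7 whole seconds at 720p or 3 at 1080p.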

Overview

Happy Horse Text-to-Video is Alibaba's flagship 1080p video generator with synchronized native audio built in: no separate audio model, no lip-sync rig, no post-production. Send a single prompt and get back a clip with its audio already in place.

Key capabilities

  • Native audio: ambient sound, music, voice, foley — generated in lock-step with the visuals so they actually match (no overlay tricks)
  • Multilingual: prompts and any embedded dialogue work across major languages
  • Five aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4 — covers landscape ads, vertical short-form, square social, and portrait formats from one endpoint
  • 3-15 second clips at 720p or 1080p
  • Cinematic motion: handles prompts for complex camera moves (dolly, push-in, aerial), shallow depth of field, and golden-hour lighting well

Family

Part of the Happy Horse family — pair with the variants when you need a different starting modality:

Variant              Input                        Use it for
Text-to-Video        text prompt                  one-shot clips from a brief
Image-to-Video       image + optional prompt      animating a still or hero shot
Video Edit           source video + edit prompt   transforming an existing clip (style, scene swap)
Reference-to-Video   text + 1-9 references        multi-character scenes, brand-consistent subjects
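
The choice of variant follows from which inputs are on hand. A rough sketch of that decision in Python; the function name, its parameters, and the precedence order when several inputs are supplied are all assumptions made for illustration, and only the variant names and the 1-9 reference limit come from the table above:

```python
from typing import Optional, Sequence


def pick_variant(
    image: Optional[str] = None,
    source_video: Optional[str] = None,
    references: Sequence[str] = (),
) -> str:
    """Pick a Happy Horse variant name from the inputs that are available."""
    # Assumed precedence: an existing clip wins, then references, then a still.
    if source_video is not None:
        return "Video Edit"
    if references:
        if len(references) > 9:
            raise ValueError("Reference-to-Video takes 1-9 references")
        return "Reference-to-Video"
    if image is not None:
        return "Image-to-Video"
    # With nothing but a prompt, fall back to the text-only variant.
    return "Text-to-Video"
```

With no media inputs the function returns "Text-to-Video"; passing a single still image returns "Image-to-Video".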

Tech specs

  • Resolutions: 720p, 1080p
  • Duration: 3-15s
  • Audio: native, in-sync, prompt-controlled
  • Latency: 60-180s typical for a 5s clip; queue depth varies during peak hours
  • Pricing: $0.14/s at 720p, $0.28/s at 1080p — simple per-second billing, no minimums
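
The limits above can be checked client-side before a request is sent, so out-of-range values never reach the queue. A minimal sketch in Python, assuming a hypothetical request shape: the field names (`prompt`, `aspect_ratio`, `resolution`, `duration_s`) and the default values are illustrative and may not match the real API schema. Only the allowed values come from this page.

```python
from dataclasses import asdict, dataclass

# Allowed values as listed on this page.
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
RESOLUTIONS = {"720p", "1080p"}
MIN_DURATION_S, MAX_DURATION_S = 3, 15


@dataclass(frozen=True)
class TextToVideoRequest:
    # Hypothetical field names and defaults; check the API reference
    # for the actual schema.
    prompt: str
    aspect_ratio: str = "16:9"
    resolution: str = "720p"
    duration_s: int = 5

    def __post_init__(self) -> None:
        # Reject values outside the documented limits at construction time.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if not MIN_DURATION_S <= self.duration_s <= MAX_DURATION_S:
            raise ValueError("duration must be between 3 and 15 seconds")

    def to_payload(self) -> dict:
        """Return the request as a plain dict, ready to serialize as JSON."""
        return asdict(self)
```

Constructing `TextToVideoRequest(prompt="a horse at dawn", aspect_ratio="9:16", duration_s=8)` succeeds, while a 20-second duration or a 21:9 aspect ratio raises `ValueError` before any network call is made.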

Start generating with Happy Horse Text-to-Video

Get API access in minutes. No GPU setup, no infrastructure to manage.