Kuaishou Tech
Sep 16, 2025 · Artificial Intelligence
How Kling-Avatar Generates Long, Emotionally Rich Digital Human Videos with Multimodal LLMs
Kuaishou's Kling-Avatar uses a two-stage generation framework driven by a multimodal large language model to produce minute-long digital-human videos in which lip movements, facial expressions, and body gestures are synchronized with the audio, while maintaining high visual quality, identity consistency, and controllable storytelling across diverse scenarios.
AI Avatar · digital human · long video synthesis
