JD Retail Technology
Jul 1, 2025 · Artificial Intelligence
JoyGen: Audio‑Driven 3D Depth‑Aware Talking‑Face Video Editing Explained
JoyGen is a two‑stage framework that generates high‑quality talking‑face videos with lip movements synchronized to the input audio. It uses 3DMM‑based identity and expression coefficients, depth‑aware supervision, and a newly built high‑resolution Chinese talking‑face dataset, and it reports state‑of‑the‑art performance on multiple benchmarks.
3DMM · AIGC · audio-driven video
13 min read
