Tagged articles
8 articles
AI Explorer
Apr 24, 2026 · Artificial Intelligence

Open Generative AI: 200+ Open‑Source Models for Image, Video, and Lip‑Sync Creation

Open Generative AI is an open‑source, MIT‑licensed desktop suite that bundles more than 200 image, video, and lip‑sync models into four dedicated studios. It generates media without content filters, subscription fees, or a closed ecosystem, and supports online, desktop, and self‑hosted deployment.

AI media generation · MIT license · Open Generative AI
6 min read
Bilibili Tech
Aug 5, 2025 · Artificial Intelligence

How Bilibili’s IndexTTS2 Achieves Real‑Time, Emotion‑Rich Voice Translation

IndexTTS2 is a cross‑modal, multi‑language voice translation system that preserves speaker identity, acoustic space, and multi‑source timbre. It addresses loss of vocal personality, the cognitive load of subtitles, localization costs, multi‑speaker diarization, and cultural adaptation through time‑coding, adversarial reinforcement learning, and diffusion‑based lip‑sync techniques.

Multimodal AI · Speech synthesis · adversarial reinforcement learning
20 min read
DaTaobao Tech
Jun 30, 2025 · Artificial Intelligence

One‑Click AI Digital Human for Live Commerce: LLM, Lip Sync & Real‑Time Tech

This article outlines the end‑to‑end architecture and practical solutions behind creating intelligent digital humans for live commerce, covering LLM‑driven content generation, real‑time lip‑sync, image‑driven avatar creation, automated material review, lightweight model training, and a roadmap toward fully automated, high‑performance virtual presenters.

AI · Digital Human · LLM
19 min read
DaTaobao Tech
Feb 24, 2025 · Artificial Intelligence

AIGC Video Generation Techniques for E‑commerce: Lip‑Sync, Head/Body Driving, and Business Applications

The article surveys recent AIGC video generation advances for Taobao e‑commerce, detailing lip‑sync models like Wav2Lip and MuseTalk, head‑driven systems such as Hallo and EchoMimic, body‑driven pipelines including AnimateAnyone and Tango, and a four‑stage production workflow that boosts click‑through rates and enables virtual try‑on.

AIGC · Deep Learning · Multimodal AI
21 min read
Network Intelligence Research Center (NIRC)
Nov 9, 2023 · Artificial Intelligence

How Wav2Lip Achieves Accurate Speech‑Driven Lip Sync with Expert Discriminators

The article analyzes the limitations of traditional speech‑driven lip‑sync methods and explains how Wav2Lip introduces a pretrained multi‑frame expert sync discriminator, a two‑stage GAN training pipeline, and a specialized generator architecture to produce high‑quality, audio‑aligned facial videos.

Computer Vision · Deep Learning · GAN
7 min read
DataFunTalk
Oct 6, 2023 · Artificial Intelligence

Music‑Driven Digital Human: Algorithms, System Architecture, and Practical Applications

This article presents a comprehensive overview of the Music XR Maker framework, detailing how music‑driven AI techniques enable digital human creation, dance generation, lip‑sync, and expressive performance, and discusses data pipelines, model architectures, 3D rendering, product integration, and real‑time deployment within Tencent Music’s Tianqin Lab.

AI Algorithms · Dance Generation · Digital Human
15 min read
DataFunSummit
May 15, 2023 · Artificial Intelligence

Music-Driven Digital Human: Algorithms and Practices

This article presents the Music XR Maker framework and its four core components—music-driven system architecture, dance generation, lip-sync driven by singing voice, and expressive singing facial animation—detailing data sources, AI generation pipelines, 3D rendering, product applications, and future research directions.

3D rendering · AI · Dance Generation
15 min read