How Alibaba’s TaoAvatar Brings Real‑Time 3D Digital Humans to Your Phone
TaoAvatar, Alibaba’s new 3D digital‑human platform, renders lifelike avatars in real time on mobile and XR devices. It combines 3D Gaussian splatting, on‑device AI dialogue, and the lightweight MNN inference engine. The full source code has been open‑sourced as MNN‑TaoAvatar.
TaoAvatar is a 3D real‑time digital‑human technology developed by Alibaba’s Taotian Meta team, capable of rendering realistic avatars and supporting AI‑driven dialogue on smartphones and XR devices.
Open‑Source Release
The application MNN‑TaoAvatar has been open‑sourced on the MNN GitHub repository, allowing developers to download, build, and experiment with the full system.
Key Research Paper
Title: TaoAvatar: Real‑Time Lifelike Full‑Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting
arXiv: https://arxiv.org/abs/2503.17032v1
Code: MNN‑TaoAvatar README on GitHub (linked under Resources)
Technical Highlights
TaoAvatar uses advanced 3D Gaussian splatting to generate full‑body avatars from multi‑view video, capturing fine facial expressions, hand gestures, clothing folds, and hair motion with high realism.
Unlike cloud‑based solutions, MNN‑TaoAvatar runs entirely on‑device, which eliminates the need for powerful GPUs and reduces latency.
Core Advantages
On‑device Real‑time Dialogue: Optimized ASR (real‑time factor, RTF, of 0.18), LLM (pre‑fill 165 tokens/s, decode 41 tokens/s), and TTS (RTF 0.58) models enable sub‑second response.
On‑device Real‑time Rendering: An efficient pipeline converts speech to facial blend‑shape coefficients (RTF 0.34) and renders 250k‑point clouds at 60 FPS using the NNR renderer.
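The figures above can be turned into rough per-stage costs. The following is a minimal sketch, assuming RTF means processing time divided by audio duration and that token rates are constant; the function names and the example workload (2 s of user speech, a 33-token prompt, a 41-token reply, 1 s of synthesized audio) are hypothetical and not taken from the TaoAvatar code.

```python
# Rough per-stage cost estimates from the published figures.
# Assumption: RTF = processing_time / audio_duration, so time = RTF * duration.

ASR_RTF = 0.18        # speech recognition, from the article
TTS_RTF = 0.58        # speech synthesis, from the article
A2BS_RTF = 0.34       # speech to blend-shape coefficients, from the article
PREFILL_TPS = 165.0   # LLM pre-fill tokens per second, from the article
DECODE_TPS = 41.0     # LLM decode tokens per second, from the article


def rtf_seconds(rtf: float, audio_seconds: float) -> float:
    """Processing time for a clip of the given length at the given RTF."""
    return rtf * audio_seconds


def llm_seconds(prompt_tokens: int, reply_tokens: int) -> float:
    """Time to pre-fill the prompt plus time to decode the reply."""
    return prompt_tokens / PREFILL_TPS + reply_tokens / DECODE_TPS


def frame_budget_ms(fps: float) -> float:
    """Time available to render one frame at the target frame rate."""
    return 1000.0 / fps


def stage_costs(user_audio_s: float, prompt_tokens: int,
                reply_tokens: int, reply_audio_s: float) -> dict:
    """Per-stage processing time in seconds for one hypothetical turn."""
    return {
        "asr": rtf_seconds(ASR_RTF, user_audio_s),
        "llm": llm_seconds(prompt_tokens, reply_tokens),
        "tts": rtf_seconds(TTS_RTF, reply_audio_s),
        "a2bs": rtf_seconds(A2BS_RTF, reply_audio_s),
    }


if __name__ == "__main__":
    # Hypothetical workload, chosen only for illustration.
    costs = stage_costs(user_audio_s=2.0, prompt_tokens=33,
                        reply_tokens=41, reply_audio_s=1.0)
    for stage, seconds in costs.items():
        print(f"{stage}: {seconds:.2f} s")
    print(f"frame budget at 60 FPS: {frame_budget_ms(60):.2f} ms")
```

Because every RTF is below 1, each audio stage finishes faster than the audio it handles plays back, so the stages can in principle be streamed and overlapped. The sketch prices each stage separately and does not model that overlap.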
MNN Engine Modules
MNN‑LLM: Mobile deployment of large language models with model export, quantization (e.g., Qwen2.5‑1.5B reduced to 1.2 GB), KV cache, and LoRA support. Benchmarks on Snapdragon 8 Gen 3 show an 8.6× to 20.5× speedup over llama.cpp and fastllm.
Sherpa‑MNN: Optimized ASR/TTS framework delivering up to 2× faster inference and a 5× smaller binary than onnxruntime.
MNN‑NNR: Neural‑network‑based renderer with "Dirty" scheduling, GPU‑direct data sharing, and radix sort, achieving 60 FPS with a runtime size of only 200 KB.
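Gaussian-splatting renderers generally need their points ordered by depth before blending, which is where a radix sort helps. The following is a minimal CPU sketch of a stable least-significant-digit radix sort over depth keys. It is not the NNR implementation, which the article does not show; the function names and the 8-bit digit width are assumptions made for illustration.

```python
import struct


def depth_to_key(depth: float) -> int:
    """Reinterpret a non-negative float32 depth as a uint32 sort key.

    For non-negative IEEE-754 floats, the unsigned bit pattern increases
    with the value, so integer order matches depth order.
    """
    if depth < 0:
        raise ValueError("this sketch assumes non-negative depths")
    return struct.unpack("<I", struct.pack("<f", depth))[0]


def radix_sort_indices(keys, key_bits=32, radix_bits=8):
    """Stable LSD radix sort; returns indices that order keys ascending."""
    order = list(range(len(keys)))
    buckets = 1 << radix_bits
    mask = buckets - 1
    for shift in range(0, key_bits, radix_bits):
        # Histogram of the current digit.
        counts = [0] * buckets
        for i in order:
            counts[(keys[i] >> shift) & mask] += 1
        # Exclusive prefix sum gives each bucket's start offset.
        offsets = [0] * buckets
        total = 0
        for digit in range(buckets):
            offsets[digit] = total
            total += counts[digit]
        # Scatter in current order, which keeps the sort stable.
        out = [0] * len(order)
        for i in order:
            digit = (keys[i] >> shift) & mask
            out[offsets[digit]] = i
            offsets[digit] += 1
        order = out
    return order


def sort_by_depth(depths):
    """Indices of points ordered from nearest to farthest."""
    return radix_sort_indices([depth_to_key(d) for d in depths])
```

Calling `sort_by_depth` on a list of per-point depths returns point indices from nearest to farthest; reversing the result gives a back-to-front order. Radix sort suits this job because its cost grows linearly with the number of points and its histogram and scatter passes map well onto GPU hardware.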
Hardware Requirements
Qualcomm Snapdragon 8 Gen 3 or equivalent CPU
≥ 8 GB RAM
≥ 5 GB storage for model files
Devices not meeting these specs may experience stutter or limited functionality.
Quick Start Guide
git clone https://github.com/alibaba/MNN.git
cd MNN/apps/Android/Mnn3dAvatar
./gradlew installDebug
After building, the app can be run on a compatible Android phone to experience real‑time avatar interaction.
Resources
TaoAvatar GitHub: https://github.com/alibaba/MNN/blob/master/apps/Android/MnnTaoAvatar/README_CN.md
Paper: https://arxiv.org/abs/2503.17032v1
MNN‑LLM paper: https://arxiv.org/abs/2506.10443
Model collections, ASR/TTS/NNR models and demo links (see original article for URLs)