Tagged articles
43 articles
AI Explorer
May 4, 2026 · Artificial Intelligence

Fully Automated AI Short‑Video Engine Turns One Sentence Into a Complete Clip

Pixelle-Video lets users input a single topic and, within minutes, automatically generates a short video with script, AI‑generated visuals, synthesized voice‑over, background music, and final rendering, eliminating the need for manual editing or specialized software.

AI video generation · ComfyUI · Digital Human
0 likes · 5 min read
Machine Heart
Mar 31, 2026 · Artificial Intelligence

How JD’s JoyStreamer Achieves Smooth Long‑Form, Free‑Form Digital Human Live Streams

The article details how JD’s JoyStreamer and JoyStreamer‑Flash models overcome text‑control weakness, multimodal conflict, and identity drift to enable long‑duration, free‑state, real‑time interactive digital‑human video generation, surpassing current SOTA models in benchmark scores and reaching 30 FPS inference speed for e‑commerce live streaming.

Digital Human · JoyStreamer · Real-time Streaming
0 likes · 12 min read
Baidu Tech Salon
Nov 6, 2025 · Artificial Intelligence

How Baidu’s Script‑Driven Multimodal Digital Human Won the 2025 WIC Leading Tech Award

Baidu’s award‑winning script‑driven multimodal digital‑human technology, recognized at the 2025 World Internet Conference, showcases breakthroughs in real‑time multimodal coordination, high‑fidelity video generation, and cost‑effective live streaming across e‑commerce, education, and legal sectors.

Baidu · Digital Human · Multimodal AI
0 likes · 4 min read
Kuaishou Tech
Sep 17, 2025 · Artificial Intelligence

How MIDAS Achieves Real‑Time Multimodal Digital‑Human Video Generation

The MIDAS framework introduced by the Kling Team combines autoregressive video generation with a lightweight diffusion denoising head to deliver real‑time, high‑quality digital‑human synthesis under multimodal control, achieving sub‑500 ms latency, 64× compression, and robust performance across multilingual dialogue, singing, and interactive world modeling tasks.

AI · Digital Human · Real-time Video
0 likes · 6 min read
Kuaishou Tech
Sep 16, 2025 · Artificial Intelligence

How Kling-Avatar Generates Long, Emotionally Rich Digital Human Videos with Multimodal LLMs

Kuaishou's Kling-Avatar leverages a multimodal large‑language‑model‑driven two‑stage generation framework to produce minute‑long digital‑human videos that synchronize lip movements, facial expressions, and body gestures with audio, achieving high visual quality, identity consistency, and controllable storytelling across diverse scenarios.

AI Avatar · Digital Human · Video Generation
0 likes · 9 min read
DataFunTalk
Sep 11, 2025 · Artificial Intelligence

How AI Dressing and Multimodal Models Transform Home Service Experiences

During a pre-conference interview, AI expert Wang Mingzhong details how multimodal AI dressing, video résumé creation, short‑video templates, and interactive digital‑human live streams are technically realized for 58 Home Services, highlighting model training, workflow optimization, and future fusion of template‑based and agent‑driven video generation.

AI · Digital Human · Domestic Service
0 likes · 11 min read
DaTaobao Tech
Jul 4, 2025 · Artificial Intelligence

How Taobao Live’s AI Digital Humans Transform E‑Commerce: Architecture, Algorithms, and Engineering Insights

This article details the end‑to‑end design of Taobao Live's AI digital human system, covering six core components such as LLM‑driven content creation, interactive dialogue, TTS voice synthesis, visual synchronization, audio‑video engineering, and a scalable backend, while also discussing product evolution, automation challenges, and future roadmap.

AI · Automation · Digital Human
0 likes · 19 min read
DaTaobao Tech
Jul 2, 2025 · Artificial Intelligence

How AI Powers 24/7 Digital Human Live Streams: Architecture, Challenges, and Innovations

This article presents a comprehensive overview of the AI‑driven digital‑human live‑streaming solution used by Taobao, detailing six core components—including LLM‑based content generation and interaction, TTS, visual driving, audio‑video engineering, and backend services—while sharing architectural diagrams, cost‑reduction strategies, productization insights, and future directions.

AI · Digital Human · LLM
0 likes · 8 min read
Cognitive Technology Team
Jul 1, 2025 · Artificial Intelligence

How We Built a Live‑Streaming TTS Engine: From Data Pipelines to AI Voice Generation

This article presents a comprehensive practice summary of building an intelligent digital‑human system, covering six core modules—LLM content generation, LLM interaction, TTS synthesis, visual driving, audio‑video engineering, and backend services—while detailing data collection, signal processing, ASR annotation, speaker clustering, model optimization (V1‑V4), evaluation metrics, and future research directions.

AI voice · Audio Processing · Digital Human
0 likes · 23 min read
DaTaobao Tech
Jun 30, 2025 · Artificial Intelligence

One‑Click AI Digital Human for Live Commerce: LLM, Lip Sync & Real‑Time Tech

This article outlines the end‑to‑end architecture and practical solutions behind creating intelligent digital humans for live commerce, covering LLM‑driven content generation, real‑time lip‑sync, image‑driven avatar creation, automated material review, lightweight model training, and a roadmap toward fully automated, high‑performance virtual presenters.

AI · Digital Human · LLM
0 likes · 19 min read
DaTaobao Tech
Jun 27, 2025 · Artificial Intelligence

Building a High‑Quality Live‑Streaming Digital Human: TTS Pipeline, Data Processing, and Model Optimizations

This article details the end‑to‑end workflow for creating intelligent digital humans for live streaming, covering large‑language‑model‑driven content generation, multi‑stage TTS architecture, extensive audio‑signal processing, speaker clustering, front‑end text normalization, back‑end acoustic modeling, and quantitative evaluation of model improvements.

AI · Digital Human · Speech synthesis
0 likes · 22 min read
AntTech
Jun 25, 2025 · Artificial Intelligence

CVPR 2025: Semi-Body Digital Humans, Video Upscaling, Mobile Super‑Res

In this CVPR 2025 showcase, Ant Group presents three cutting‑edge papers—EchoMimicV2 introducing an open‑source semi‑body digital human generation framework, RivuletMLP offering an efficient MLP‑based architecture for compressed video quality enhancement, and a quantized super‑resolution model that achieves real‑time 3× upscaling on mobile NPUs.

AI · CVPR · Computer Vision
0 likes · 6 min read
Efficient Ops
Mar 16, 2025 · Artificial Intelligence

How AI Digital Humans Transform Banking Services: Architecture, Capabilities, and Use Cases

This article explains how AI-powered digital humans can modernize banking by offering modular, multi‑modal interaction, personalized multilingual service, 24‑hour availability, and risk‑aware automation, while detailing the underlying AI foundation, decision engine, visual rendering, and deployment strategies.

AI · Banking · Digital Human
0 likes · 7 min read
58UXD
Mar 14, 2025 · Product Management

How 58租房 (58 Rentals) Accelerated Landlord Publishing with LBS, OCR, and AI Guidance

This case study details how 58租房 (58 Rentals) tackled cumbersome landlord publishing by redesigning the workflow with smart location (LBS), AI‑driven shooting assistance, OCR‑based document recognition, and digital‑human guidance, achieving up to 90% faster operations, higher accuracy, and stronger privacy protection.

AI guidance · Digital Human · LBS
0 likes · 7 min read
AntTech
Nov 27, 2024 · Artificial Intelligence

EchoMimicV2: An End-to-End Audio‑Driven Semi‑Body Human Animation Framework

EchoMimicV2, an open‑source project from Ant Group's Alipay AI team, introduces an end‑to‑end audio‑driven framework that generates high‑quality semi‑body portrait videos by jointly coordinating audio, pose, and image inputs, while addressing challenges of condition complexity, model stability, and computational cost.

Digital Human · audio-driven animation · diffusion models
0 likes · 16 min read
Alipay Experience Technology
Nov 27, 2024 · Artificial Intelligence

EchoMimicV2: High‑Quality Audio‑Driven Half‑Body Human Animation with Simple Inputs

EchoMimicV2 is an open‑source digital‑human framework that generates high‑quality half‑body animation videos from a single reference image, an audio clip, and a hand‑gesture sequence, addressing challenges of facial portrait limits, complex condition injection, and inference latency in audio‑driven animation.

AI research · Digital Human · Video Generation
0 likes · 18 min read
DataFunSummit
Oct 10, 2024 · Artificial Intelligence

AIGC‑Assisted Marketing Material Generation at Shujia Technology

This article describes Shujia Technology's use of artificial intelligence to generate marketing images and videos, outlining the background, challenges of high-volume content production, detailed solutions for image and video assets—including layout models, diffusion models, and digital human synthesis—and future research directions.

AIGC · Digital Human · Marketing
0 likes · 12 min read
AntTech
Jul 24, 2024 · Artificial Intelligence

EchoMimic: An Open‑Source AIGC‑Driven Framework for 2D/3D Digital Human Generation

EchoMimic, an open‑source project from Ant Group, presents a flexible, audio‑ and pose‑driven digital human generation pipeline that combines 2D, 3D and AIGC techniques, reduces production costs, achieves real‑time inference, and includes a detailed architecture, related work analysis, and future research directions.

AIGC · Digital Human · audio-driven animation
0 likes · 18 min read
DaTaobao Tech
Jan 12, 2024 · Artificial Intelligence

AI‑Powered Photo‑to‑3D Avatar Generation in Taobao Life 2

Taobao Life 2’s new AI‑driven “photo‑face” feature automatically converts a single portrait into a stylized 3D avatar in under five seconds by using a 3D morphable model, lightweight MLP mapping, and fine‑grained attribute classification, cutting manual sculpting time from half an hour to seconds while preserving user‑specific details.

3D face reconstruction · AI · Digital Human
0 likes · 13 min read
Alibaba Terminal Technology
Dec 6, 2023 · Frontend Development

How Galacean Powers the Asian Games’ Digital Torchbearers with Web3D Front‑End Tech

The article explains how Ant Group’s Galacean engine enables millions of participants to become digital torchbearers for the Asian Games by leveraging Web3D, physics‑based rendering, low‑code animation tools, and extensive cross‑device testing to create a scalable, interactive 3D experience.

3D rendering · Digital Human · Galacean
0 likes · 5 min read
58UXD
Nov 27, 2023 · Game Development

How to Build Your Own Free Digital Human with Metashape and MetaHuman

This step‑by‑step guide shows how to capture facial photos, process them with Metashape, create a 3D model, and import it into Unreal Engine's MetaHuman system to generate a realistic digital human without any cost.

3D scanning · Digital Human · MetaHuman
0 likes · 6 min read
DataFunTalk
Oct 6, 2023 · Artificial Intelligence

Music‑Driven Digital Human: Algorithms, System Architecture, and Practical Applications

This article presents a comprehensive overview of the Music XR Maker framework, detailing how music‑driven AI techniques enable digital human creation, dance generation, lip‑sync, and expressive performance, and discusses data pipelines, model architectures, 3D rendering, product integration, and real‑time deployment within Tencent Music’s Tianqin Lab.

AI Algorithms · Dance Generation · Digital Human
0 likes · 15 min read
DataFunSummit
May 15, 2023 · Artificial Intelligence

Music-Driven Digital Human: Algorithms and Practices

This article presents the Music XR Maker framework and its four core components—music-driven system architecture, dance generation, lip-sync driven by singing voice, and expressive singing facial animation—detailing data sources, AI generation pipelines, 3D rendering, product applications, and future research directions.

3D rendering · AI · Dance Generation
0 likes · 15 min read
Kuaishou Large Model
Mar 31, 2023 · Artificial Intelligence

How Kuaishou Elevates Video Quality and AI Performance at NVIDIA GTC 2023

At NVIDIA GTC 2023, Kuaishou engineers unveiled cutting‑edge solutions ranging from video quality assessment and enhancement, 3D digital‑human live streaming, a custom TensorRT‑based performance framework, large‑scale recommendation model acceleration, to multimodal massive‑model deployment for short‑video scenarios.

AI Optimization · Digital Human · Recommendation Systems
0 likes · 9 min read
Kuaishou Audio & Video Technology
Mar 30, 2023 · Artificial Intelligence

How Kuaishou Elevates Short‑Video Quality and AI Performance at NVIDIA GTC 2023

At NVIDIA GTC 2023, Kuaishou engineers presented cutting‑edge solutions ranging from video quality assessment and enhancement to digital‑human live streaming, custom performance‑optimization frameworks, large‑scale recommendation model acceleration, and multimodal massive‑model deployment for short‑video applications.

AI Optimization · Digital Human · Multimodal Large Models
0 likes · 9 min read
Alipay Experience Technology
Mar 2, 2023 · Game Development

How Oasis Engine Created a Multi‑User Metaverse‑Style Fortune Park

This article details how the Oasis Engine team tackled the technical and artistic challenges of building a large‑scale, multiplayer, third‑person virtual park for the Alipay Fortune event, covering architecture, scene splitting, asset loading, custom materials, digital human integration, networking, behavior management, and extensive performance optimizations.

Digital Human · Game Development · Oasis Engine
0 likes · 24 min read
Alibaba Cloud Developer
Feb 28, 2023 · Artificial Intelligence

How a Dual‑Way Sign Language Digital Human Transforms Communication for the Deaf

This article describes the severe shortage of sign‑language teachers worldwide, presents user demographics, outlines the challenges of bidirectional sign‑language translation, and details the cloud‑native AI architecture, data pipeline, and real‑time recognition and synthesis techniques behind the virtual digital human "Sign Language Translator".

AI · Digital Human · Real-time Processing
0 likes · 17 min read
DataFunSummit
Jan 13, 2023 · Artificial Intelligence

2022 Digital Human System Basic Capability Evaluation and Observations

This report presents the background, methodology, evaluation model, results, and key observations of the 2022 digital human system basic capability assessment, highlighting technical, engineering, and security challenges, industry standards development, and future work to advance digital human technologies.

Capability Evaluation · Digital Human · Industry Report
0 likes · 12 min read
DataFunTalk
Dec 20, 2022 · Artificial Intelligence

Baidu Smart Cloud Digital Human Platform: Development, Architecture, and Solution Overview

This article provides a comprehensive overview of Baidu's Smart Cloud Digital Human platform, detailing its evolution since 2019, core AI-driven architecture, platform components such as persona management and business orchestration, various industry solutions, and technical Q&A on rendering, latency, and deployment.

AI Platform · Baidu · Digital Human
0 likes · 13 min read
Baidu Geek Talk
Sep 7, 2022 · Artificial Intelligence

Design and Architecture of AI Digital Human Live Streaming System

The paper presents a cloud‑native architecture for AI‑driven digital‑human live‑streaming, detailing three‑layer asset, interaction, and media modules, real‑time script and Q&A scheduling, fault‑tolerant rendering and control services, and demonstrates how virtual anchors can deliver continuous, lifelike 24/7 e‑commerce streams.

AI · Digital Human · System Architecture
0 likes · 21 min read
DataFunSummit
Aug 3, 2022 · Artificial Intelligence

AliMe MKG: Multimodal Knowledge Graph for Live E‑commerce and Its Technical Exploration

This report presents AliMe MKG, a multimodal knowledge graph designed for live e‑commerce, detailing its business background, construction and application, the three types of multimodal knowledge (triples, sentences, and visual media), the underlying extraction techniques, and its deployment in digital‑human anchors and intelligent live‑room assistants.

AI · Digital Human · e‑commerce
0 likes · 19 min read
DataFunSummit
Apr 14, 2022 · Artificial Intelligence

Advances in Alibaba's Digital Human Technology: Construction, Performance, Interaction, and the MMTK Multimodal Algorithm Library

This article reviews Alibaba's digital‑human (virtual avatar) research over the past few years, covering the product’s evolution, a six‑stage pipeline for building digital humans, solutions to key challenges in realism, multimodal interaction, and the open‑source MMTK algorithm library.

Digital Human · Emotion Modeling · Multimodal AI
0 likes · 12 min read
DataFunTalk
Mar 26, 2022 · Artificial Intelligence

Advances in Alibaba's Digital Human (XiaoMi) Technology: Development, Construction, and Interaction

This article reviews Alibaba's XiaoMi digital human technology, covering its evolution since 2019, a six‑stage pipeline for building avatars, methods to enhance emotional, textual, vocal, and motion expressiveness, and approaches for improving long‑term interactive capabilities such as controllable script generation, multimodal QA, sign‑language translation, and intelligent behavior decision, culminating in the release of the MMTK multimodal algorithm library.

Digital Human · Emotion Modeling · Multimodal AI
0 likes · 11 min read
DaTaobao Tech
Mar 25, 2022 · Artificial Intelligence

Digital Human Technology: Design, Production, and Future Directions

The article surveys digital‑human technology—covering visual form, motion and AI‑driven intelligence—its fast‑growing market, a three‑layer solution stack (hardware/software base, AI platform, application services), end‑to‑end avatar creation workflow, rendering and animation techniques, web‑deployment challenges, and future prospects such as deeper AI, XR/6G and metaverse integration.

3D rendering · AI · Digital Human
0 likes · 20 min read
Tencent Mobility Industry Design Center
Feb 25, 2022 · Artificial Intelligence

Designing AI-Powered Digital Humans for Bank Customer Service: A Practical Guide

This article explores the classification of digital humans, examines how they transform user experience, and presents a detailed design practice for AI‑driven virtual bank tellers, covering visual identity, interaction flow, ergonomic screen layout, and future considerations for natural multimodal interfaces.

AI · Banking · Digital Human
0 likes · 19 min read
Tencent Advertising Technology
Oct 26, 2021 · Artificial Intelligence

Seventh China International 'Internet+' College Student Innovation and Entrepreneurship Competition: Virtual IP Innovation Track and Tencent Advertising's Virtual Human Creative Service

The article reports on the seventh China International 'Internet+' College Student Innovation and Entrepreneurship Competition, highlighting its new industry challenge track focused on virtual IP, the award-winning solution from Beijing Institute of Technology using AI-driven digital human production, and Tencent Advertising's virtual human creative service for marketing.

AI · Digital Human · Innovation Competition
0 likes · 6 min read