Taobao Audio‑Video Technology for High‑Quality Live Streaming and Video Processing

Taobao’s audio‑video technology team delivers end‑to‑end high‑definition solutions for live streaming and short video. These include AI‑enhanced editing, a three‑stage processing pipeline built on the proprietary S265 and S266 codecs, sub‑second transmission over the GRTN network, advanced player features, real‑time audio enhancement, and AI‑based quality assessment. The work has won industry awards, and planned work includes further optimization of VVC (S266) for mobile devices.

DaTaobao Tech

In recent years, content‑driven e‑commerce has become an integral part of daily life, with users ordering products via live streams and short‑video links on mobile devices.

High‑definition video quality and smooth playback are essential for a good shopping experience. Taobao’s audio‑video technology team has built a suite of advanced solutions to address these requirements.

Video Production : The team provides editing tools such as portrait beautification, noise reduction, color enhancement, super‑resolution, and AI‑driven filters, supporting both live‑stream and short‑video creation on iOS, Android, and PC platforms.

Video Processing (TMPS) : After upload, content passes through a three‑stage pipeline: decoding, enhancement, and re‑encoding. The enhancement stage (STaoVideo) applies denoising, color and brightness boosting, super‑resolution, HDR, and deep‑learning operators. Re‑encoding uses the proprietary S265 and S266 codecs, which achieve a bitrate reduction of more than 40% while preserving visual quality.
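
The shape of such a pipeline, and what a 40% bitrate saving means in numbers, can be sketched as follows. This is only an illustration under stated assumptions: frames are modelled as flat lists of 8‑bit samples, `brighten` is a hypothetical stand‑in for a real enhancement operator, and none of these names come from TMPS or STaoVideo.

```python
from typing import Callable, Iterable, List

Frame = List[int]                  # toy stand-in: one frame as 8-bit luma samples
Stage = Callable[[Frame], Frame]   # an enhancement operator maps frame -> frame


def brighten(gain: float) -> Stage:
    """Hypothetical enhancement operator: scale samples and clip to 0..255."""
    def stage(frame: Frame) -> Frame:
        return [min(255, max(0, round(s * gain))) for s in frame]
    return stage


def run_enhancement(frames: Iterable[Frame], stages: List[Stage]) -> List[Frame]:
    """Apply every enhancement stage, in order, to each decoded frame."""
    enhanced = []
    for frame in frames:
        for stage in stages:
            frame = stage(frame)
        enhanced.append(frame)
    return enhanced


def bitrate_after_saving(baseline_kbps: float, saving: float) -> float:
    """Bitrate needed after a fractional saving (0.40 means 40% lower)."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in [0, 1)")
    return baseline_kbps * (1.0 - saving)


if __name__ == "__main__":
    decoded = [[100, 200], [10, 20]]               # pretend decoder output
    print(run_enhancement(decoded, [brighten(1.5)]))
    print(bitrate_after_saving(2000, 0.40))        # a 2000 kbps stream at 40% saving
```

Under these assumptions, a stream that needed 2000 kbps with the baseline encoder would need about 1200 kbps at a 40% saving. The baseline the article's figure is measured against is not stated in the text.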

Video Transmission : To meet low‑latency demands, Taobao and Alibaba Cloud co‑developed the GRTN network, which enables sub‑second end‑to‑end latency for live streams and supports features such as multi‑anchor PK (head‑to‑head) events.

Video Presentation : The player architecture has been upgraded to improve hardware‑decoding coverage and adaptive bitrate selection, to support VR/AR and HDR playback, and to add interactive features.
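
Adaptive bitrate selection can be illustrated with a simple throughput‑based rule: pick the highest rendition that fits within a safety fraction of the measured bandwidth. This is a generic sketch, not the Taobao player's algorithm; the ladder values, the `safety` margin, and the function name are all assumptions made for the example.

```python
from typing import Sequence


def pick_rendition(ladder_kbps: Sequence[int],
                   throughput_kbps: float,
                   safety: float = 0.8) -> int:
    """Return the highest bitrate that fits within safety * throughput.

    Falls back to the lowest rendition when nothing fits, so playback
    can continue at reduced quality instead of stalling.
    """
    if not ladder_kbps:
        raise ValueError("ladder must not be empty")
    budget = throughput_kbps * safety
    affordable = [b for b in sorted(ladder_kbps) if b <= budget]
    return affordable[-1] if affordable else min(ladder_kbps)


if __name__ == "__main__":
    ladder = [800, 1500, 3000, 6000]        # hypothetical renditions in kbps
    print(pick_rendition(ladder, 4000))     # budget is 3200 kbps
    print(pick_rendition(ladder, 500))      # nothing fits, lowest is chosen
```

Production players usually combine a rule like this with buffer occupancy and switch‑frequency limits; the safety margin here only guards against overestimating bandwidth.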

Audio End‑to‑End : A 3A SDK (Acoustic Echo Cancellation, Adaptive Noise Reduction, Automatic Gain Control) and a deep‑learning‑based MD‑AQA model provide real‑time audio quality monitoring and improvement.
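
Of the three 3A components, automatic gain control is the easiest to show in a few lines. The sketch below is a textbook RMS‑based AGC with gain smoothing, assuming floating‑point samples in the range -1.0 to 1.0; it is not the SDK's implementation, and every name and default value is hypothetical.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def desired_gain(frame: Sequence[float],
                 target_rms: float = 0.1,
                 max_gain: float = 10.0) -> float:
    """Gain that would bring this frame to target_rms, capped at max_gain."""
    level = rms(frame)
    if level == 0.0:
        return 1.0  # silence: leave unchanged rather than divide by zero
    return min(max_gain, target_rms / level)


def apply_agc(frames: Sequence[Sequence[float]],
              target_rms: float = 0.1,
              max_gain: float = 10.0,
              smoothing: float = 0.9) -> List[List[float]]:
    """Apply a smoothed gain per frame and clip the result to [-1.0, 1.0]."""
    gain = 1.0
    out = []
    for frame in frames:
        target = desired_gain(frame, target_rms, max_gain)
        # Exponential smoothing avoids audible jumps between frames.
        gain = smoothing * gain + (1.0 - smoothing) * target
        out.append([max(-1.0, min(1.0, s * gain)) for s in frame])
    return out


if __name__ == "__main__":
    quiet = [[0.05, -0.05, 0.05, -0.05]] * 3
    for frame in apply_agc(quiet):
        print([round(s, 4) for s in frame])
```

The cap on gain keeps near‑silent frames from amplifying background noise, which is one reason AGC is normally paired with noise reduction.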

AI‑Based Quality Assessment : The team developed MD‑VQA, a no‑reference video quality assessment model that achieves state‑of‑the‑art performance on the public LIVE‑WC and YT‑UGC+ datasets and on internal TaoLive data. MD‑VQA is deployed across Taobao Live, information‑flow feeds, and other Alibaba services for continuous quality monitoring. Building on MD‑VQA, TB‑VQA won the CVPR NTIRE 2023 video quality competition, and FACE‑VQA evaluates portrait beautification quality.
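
The article does not say which metric underlies the state‑of‑the‑art claim. A common way to score a no‑reference quality model is the Spearman rank correlation (SRCC) between its predictions and human mean opinion scores (MOS), so the sketch below assumes that metric. It also assumes no tied values; the sample numbers are made up.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank positions starting at 1, smallest value first (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def srcc(predicted: Sequence[float], mos: Sequence[float]) -> float:
    """Spearman rank correlation via the no-ties formula 1 - 6*sum(d^2)/(n(n^2-1))."""
    n = len(predicted)
    if n != len(mos) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(predicted), ranks(mos)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    model_scores = [0.61, 0.35, 0.90, 0.48]   # illustrative model outputs
    human_mos = [3.8, 2.1, 4.6, 3.9]          # illustrative opinion scores
    print(srcc(model_scores, human_mos))
```

A value near 1.0 means the model orders videos the same way viewers do, even if its raw scores are on a different scale.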

Research Achievements : Taobao’s codecs (S265, S266) have won multiple awards in the MSU world video encoder contests, outperforming x265 and VTM on PSNR, SSIM, and other metrics. The team also contributes to open‑source standards and publishes papers at top conferences such as CVPR 2023.
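
For readers unfamiliar with PSNR, the metric compares a decoded frame with its reference through the mean squared error: PSNR = 10 * log10(MAX^2 / MSE), in decibels. The sketch below computes it for 8‑bit samples held in flat lists, which is a simplification; codec comparisons compute it per plane over whole sequences.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], decoded: Sequence[int],
         max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    compressed = [101, 99, 101, 99]   # every sample off by one, so MSE is 1
    print(round(psnr(original, compressed), 2))
```

Higher is better. At equal bitrate, an encoder with a higher PSNR has reproduced the source more faithfully in the mean‑squared‑error sense, which is why the metric is usually reported alongside perceptual measures such as SSIM.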

Future work focuses on fine‑grained video enhancement for specific scenarios, further optimization of VVC (S266) for mobile devices, and expanding AI‑driven quality assessment across the Alibaba ecosystem.
