Shopee Video Technology: Backend Services, High‑Definition Low‑Bitrate Optimization, and Performance Enhancements
Shopee’s video platform combines live‑stream and on‑demand transcoding, link‑mic, multi‑party mixing, and backend editing services with an in‑house high‑definition low‑bitrate (HD‑LB) pipeline. The pipeline uses GPU and CPU encoders, AI‑enhanced pre‑processing, hierarchical B‑frames, and SIMD‑optimized sharpening to deliver high‑quality video on low‑end devices while cutting compute costs. The company is also recruiting engineers for further development.
Background
Shopee’s rapid expansion in Southeast Asian markets has created a strong demand for video‑enabled e‑commerce experiences. Many users operate low‑end smartphones and face highly variable network conditions, making stable, high‑quality video delivery a major challenge.
The company therefore built a suite of video‑related products and a set of backend services to handle massive live‑streaming and on‑demand video workloads while keeping compute costs under control.
Shopee Video Product Landscape
The Shopee App provides feeds, live‑shopping, and on‑demand video services. A short‑video offering (Shopee Video) is already live in several markets. SeaBank uses video for online account opening, and the internal communication tool SeaTalk is planning to add video‑conference capabilities.
Backend Services Overview
3.1 Live / VOD Transcoding
Shopee operates two transcoding platforms, one for live and one for VOD. To meet the strict 33 ms per‑frame budget for live streams (one frame interval at roughly 30 fps), the pipeline is split into short‑latency nodes, which run serially within a single region, and long‑latency nodes, which are placed in separate regions and run in parallel. Pipelining the work this way keeps even the slowest node within one frame duration.
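The timing argument can be illustrated with a small sketch. It assumes a 30 fps stream and uses made‑up stage names and latencies (not Shopee measurements): when stages overlap, the interval between output frames is bounded by the slowest stage rather than by the sum of all stages.

```python
FRAME_BUDGET_MS = 1000 / 30  # about 33.3 ms; assumes a 30 fps live stream

def serial_latency_ms(latencies):
    """Time one frame spends passing through every stage back to back."""
    return sum(latencies)

def pipelined_interval_ms(latencies):
    """Steady-state time between output frames when stages overlap."""
    return max(latencies)

def keeps_up_with_live(latencies, budget_ms=FRAME_BUDGET_MS):
    """True if the slowest stage finishes within one frame interval."""
    return pipelined_interval_ms(latencies) <= budget_ms

# Hypothetical per-stage latencies in milliseconds (illustrative only)
STAGES_MS = {"decode": 6.0, "preprocess": 12.0, "encode": 28.0}
latencies = list(STAGES_MS.values())

print(serial_latency_ms(latencies))   # 46.0, over budget if run strictly serially
print(keeps_up_with_live(latencies))  # True, because the slowest stage takes 28 ms
```

With these numbers a strictly serial design would miss the budget, while the pipelined design sustains real time at the cost of a few frames of added end‑to‑end delay.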
The architecture includes internal Prado container clusters and cloud‑hosted clusters, with the upstream MMS VOD platform dynamically dispatching jobs to either cluster based on load.
3.2 Live Link‑Mic
Live link‑mic (host‑guest interaction) uses an RTC‑SFU service for bidirectional video streams, while viewers receive the stream via HTTP‑FLV. When only a single host is present, Shopee remuxes the H.264 video directly; when a guest joins, the system switches to mixed‑stream transcoding. The MCU caches GOPs to enable seamless transitions between single‑host and mixed‑host modes. This design dramatically reduces CPU load: a single machine can support up to 20 concurrent hosts, or more than 200 hosts when transcoding is disabled.
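The article does not describe how the MCU uses its cached GOPs, so the following is only one plausible reading, with hypothetical names (`Frame`, `GopCache`, `choose_mode`). The cache keeps every frame since the most recent keyframe, so that a newly started output path can begin decoding from a keyframe instead of waiting for the next one.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Frame:
    pts: int            # presentation timestamp
    is_keyframe: bool   # True for an IDR/I frame that starts a GOP

@dataclass
class GopCache:
    """Holds the frames of the current GOP, starting at its keyframe."""
    frames: List[Frame] = field(default_factory=list)

    def push(self, frame: Frame) -> None:
        if frame.is_keyframe:
            self.frames.clear()          # a new GOP replaces the old one
        if self.frames or frame.is_keyframe:
            self.frames.append(frame)    # ignore frames seen before any keyframe

    def snapshot(self) -> List[Frame]:
        """Frames a new output path would replay to start cleanly."""
        return list(self.frames)

def choose_mode(guest_count: int) -> str:
    """Remux when the host is alone; mix and transcode once a guest joins."""
    return "remux" if guest_count == 0 else "mix_transcode"

cache = GopCache()
for pts, key in [(0, True), (1, False), (2, False), (3, True), (4, False)]:
    cache.push(Frame(pts, key))
print([f.pts for f in cache.snapshot()])  # [3, 4]
print(choose_mode(0), choose_mode(1))     # remux mix_transcode
```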
3.3 Multi‑Party Conference Mixing
Shopee’s internal communication tool SeaTalk will eventually use a multi‑party mixing service built on open‑source OWT and mediasoupclient components, with a three‑frame buffer to smooth mixed video frames. The service supports both RTMP and WebRTC inputs.
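How the three‑frame buffer works internally is not stated. A minimal sketch of the general idea, using a hypothetical `FrameSmoother` class, is a small queue that starts releasing frames only after three have arrived and repeats the last frame if the input briefly underruns, so the mixer always has something to composite on each output tick.

```python
from collections import deque

class FrameSmoother:
    """Small pre-roll buffer that absorbs jitter in frame arrival times."""

    def __init__(self, depth=3):
        self.depth = depth      # frames to accumulate before output starts
        self.queue = deque()
        self.primed = False     # becomes True once `depth` frames have arrived
        self.last = None        # last frame handed to the mixer

    def push(self, frame):
        self.queue.append(frame)
        if len(self.queue) >= self.depth:
            self.primed = True

    def pop_for_tick(self):
        """Called once per output tick; returns the frame to composite."""
        if not self.primed:
            return None                      # still pre-rolling
        if self.queue:
            self.last = self.queue.popleft()
        return self.last                     # repeats the last frame on underrun

smoother = FrameSmoother()
smoother.push("f0")
print(smoother.pop_for_tick())  # None, because only one frame is buffered
smoother.push("f1")
smoother.push("f2")
print([smoother.pop_for_tick() for _ in range(4)])  # ['f0', 'f1', 'f2', 'f2']
```

A production buffer would also cap the queue length and drop stale frames; that is omitted here for brevity.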
3.4 Backend Video Editing
A backend editing service for Shopee Video provides 2D effects such as image‑sequence‑to‑video, background music, clipping, text animation, transitions (via gltransition), and background blur, executed on CPU using Xvfb virtual displays.
High‑Definition Low‑Bitrate (HD‑LB) Strategy
Live‑shopping streams in Southeast Asia often run at 270p–360p with bitrates of 300–500 kbps. To improve visual quality without increasing bandwidth, Shopee developed an in‑house HD‑LB transcoding pipeline that uses NVIDIA T4 GPUs for normal transcoding and a CPU‑based, x264‑derived encoder for HD‑LB streams.
In Shopee’s comparisons, the HD‑LB output surpasses that of two major cloud providers in both overall quality and block‑artifact handling.
General Video Processing Flow
Decode to YUV.
Pre‑processing (ROI background Gaussian blur, sharpening, AI enhancement).
Pre‑encoding steps (down‑sampling, scenecut detection, frame‑type decision, AC energy, MBTree).
Encoding (intra/inter prediction, RDO, deblocking, reference‑frame management).
Quantization and entropy coding to produce NAL units.
Optimization Details
Pre‑Processing
CDEF Algorithm: Implemented as an FFmpeg filter based on AV1’s CDEF to reduce ringing artifacts.
3D Denoising: Reuses motion vectors from the encoder to apply bilateral filtering, achieving high quality with low computational cost.
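The exact denoising kernel is not given in the article. As a rough sketch of motion‑compensated temporal bilateral filtering, the function below (hypothetical name `temporal_bilateral`) blends each pixel with its motion‑compensated match in the previous frame, weighting the match by how similar the two values are. It simplifies by using a single motion vector for the whole block and plain Python lists for luma samples.

```python
import math

def temporal_bilateral(cur, prev, mv, sigma=10.0):
    """Blend `cur` with the motion-compensated pixels of `prev`.

    cur, prev: 2D lists of luma values with the same dimensions.
    mv: (dx, dy) motion vector pointing from cur into prev (one per block here).
    sigma: range parameter; larger values allow stronger smoothing.
    """
    height, width = len(cur), len(cur[0])
    dx, dy = mv
    out = [row[:] for row in cur]
    for y in range(height):
        for x in range(width):
            rx, ry = x + dx, y + dy
            if not (0 <= rx < width and 0 <= ry < height):
                continue  # reference falls outside the frame; keep the pixel
            diff = cur[y][x] - prev[ry][rx]
            # Similar pixels get weight near 1, dissimilar ones near 0
            weight = math.exp(-(diff * diff) / (2 * sigma * sigma))
            out[y][x] = (cur[y][x] + weight * prev[ry][rx]) / (1 + weight)
    return out

noisy = [[100, 100], [100, 100]]
previous = [[104, 104], [104, 104]]
print(temporal_bilateral(noisy, previous, (0, 0))[0][0])  # between 100 and 104
```

Because the weight collapses toward zero when the matched pixels differ strongly, real content changes pass through largely unfiltered, while small noise‑like differences are averaged out.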
Classification Parameters
Shopee groups eight cost‑effective parameter sets (B‑frame count, B‑frame decision, B‑pyramid, hierarchy, QComp, etc.) and uses a trained model to classify videos, gaining up to 2.6 % BD‑rate improvement.
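As an illustration of the idea, the sketch below maps a content class to a parameter set and renders it as a colon‑separated option string. The option names (`bframes`, `b-adapt`, `b-pyramid`, `qcomp`) are standard x264 settings, but the class names, the values, and the threshold‑based `classify` stub are all invented for this example; the article describes eight sets chosen by a trained model, and only two are shown here.

```python
# Hypothetical parameter sets; values are illustrative, not Shopee's.
PARAM_SETS = {
    "static_talking_head": {"bframes": 8, "b-adapt": 2, "b-pyramid": "normal", "qcomp": 0.7},
    "high_motion": {"bframes": 3, "b-adapt": 1, "b-pyramid": "none", "qcomp": 0.5},
}

def classify(motion_score: float) -> str:
    """Stand-in for the trained classifier: a single threshold on motion."""
    return "high_motion" if motion_score > 0.5 else "static_talking_head"

def to_x264_params(params: dict) -> str:
    """Render a parameter set as a key=value list joined by colons."""
    return ":".join(f"{key}={value}" for key, value in params.items())

chosen = PARAM_SETS[classify(0.1)]
print(to_x264_params(chosen))  # bframes=8:b-adapt=2:b-pyramid=normal:qcomp=0.7
```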
Encoder Optimizations
VBV‑Adapt CRF: Dynamically adjusts CRF based on buffer level to keep average quality within VBV limits, yielding a 1.2 % BD‑rate gain.
Hierarchical B‑Frames + Temporal Filtering: Introduces deeper B‑frame hierarchies and a post‑encoding temporal filter, improving BD‑rate by ~2 % and increasing frame‑rate.
ROI with Gaussian Blur: Applies Gaussian blur to non‑ROI regions before encoding, reducing visible block artifacts while preserving ROI quality.
Long‑Term Reference Frames: Stores frames before a scenecut as long‑term references, allowing more efficient encoding of ad‑insertion segments (≈6 % BD‑rate gain).
Hierarchical RDO: The encoder’s mbrd level selects how deep rate‑distortion optimization goes (mbrd == 1: RD mode decision; mbrd == 2: RD refinement, SATD cost; mbrd == 3: QPRD). Enabling QP‑RD selectively achieves ~3 % BD‑rate improvement.
Temporal SVC for RTC: Introduces layered P‑frames that reference lower layers, allowing selective dropping of higher‑layer P‑frames under bandwidth constraints.
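The layering behind temporal SVC can be sketched with a common dyadic pattern. This is an assumed three‑layer structure with hypothetical helper names, not necessarily the one Shopee uses: each frame gets a temporal layer from its index, every P‑frame references a frame in a lower layer (or the previous base‑layer frame), and dropping the top layers leaves a stream that is still decodable.

```python
def temporal_layer(index: int, num_layers: int = 3) -> int:
    """Layer of a frame in a dyadic pattern, e.g. 0,2,1,2,0,2,1,2 for 3 layers."""
    period = 1 << (num_layers - 1)
    pos = index % period
    if pos == 0:
        return 0
    trailing_zeros = (pos & -pos).bit_length() - 1
    return num_layers - 1 - trailing_zeros

def reference_index(index: int, num_layers: int = 3):
    """Index of the frame this P-frame predicts from (None for the first frame)."""
    if index == 0:
        return None
    layer = temporal_layer(index, num_layers)
    if layer == 0:
        return index - (1 << (num_layers - 1))   # previous base-layer frame
    return index - (1 << (num_layers - 1 - layer))  # nearest lower-layer frame

def keep_frames(count: int, max_layer: int, num_layers: int = 3):
    """Frames that survive when layers above `max_layer` are dropped."""
    return [i for i in range(count) if temporal_layer(i, num_layers) <= max_layer]

print([temporal_layer(i) for i in range(8)])  # [0, 2, 1, 2, 0, 2, 1, 2]
print(keep_frames(8, max_layer=1))            # [0, 2, 4, 6]
print(keep_frames(8, max_layer=0))            # [0, 4]
```

Dropping the top layer halves the frame rate, and dropping two layers quarters it, without breaking any reference chain among the frames that remain.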
Performance Optimizations
Encoder‑Side Sharpening
Mobile devices struggled with low‑light text clarity. A USM‑based sharpening filter was optimized with NEON SIMD assembly, boosting frame‑rate by 7× and eliminating overheating.
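For reference, unsharp masking (USM) adds back a scaled copy of the detail that a blur removes: sharpened = source + amount × (source − blurred). The scalar sketch below applies it to one row of pixels with a simple 1‑2‑1 blur. The kernel, the `amount` value, and the function names are assumptions for illustration, and a production version would vectorize the same arithmetic with NEON rather than loop in Python.

```python
def blur_row(row):
    """3-tap 1-2-1 blur with edge pixels replicated."""
    last = len(row) - 1
    out = []
    for i, pixel in enumerate(row):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, last)]
        out.append((left + 2 * pixel + right) / 4)
    return out

def unsharp_row(row, amount=0.5):
    """Unsharp mask: add `amount` times the detail removed by the blur."""
    blurred = blur_row(row)
    return [
        min(255, max(0, round(pixel + amount * (pixel - soft))))  # clamp to 8 bits
        for pixel, soft in zip(row, blurred)
    ]

print(unsharp_row([50, 50, 50]))        # [50, 50, 50], flat areas are unchanged
print(unsharp_row([10, 10, 200, 200]))  # [10, 0, 224, 200], the edge gets steeper
```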
One‑to‑Many Encoding
To produce six output renditions per VOD asset, Shopee reuses pre‑processing results, encoder look‑ahead data, MBTree, and motion vectors across renditions, cutting CPU usage by up to 50 % for the shared portions.
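The saving comes from running source‑level analysis once rather than once per rendition. The toy sketch below makes that structure explicit: a hypothetical `analyze_source` step (here, a crude scenecut detector on per‑frame mean luma) runs a single time, and every rendition reuses its result. The rendition heights, the threshold, and all names are invented for illustration, and a real encoder would also need to rescale shared data such as motion vectors to each output resolution.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SharedAnalysis:
    scenecuts: tuple    # frame indices where a scene change was detected
    frame_types: tuple  # "I" or "P" decision per frame

def analyze_source(mean_luma, stats):
    """Toy look-ahead: flag a scenecut when mean luma jumps by more than 40."""
    stats["analysis_runs"] += 1
    cuts = tuple(
        i for i in range(1, len(mean_luma))
        if abs(mean_luma[i] - mean_luma[i - 1]) > 40
    )
    types = tuple(
        "I" if i == 0 or i in cuts else "P" for i in range(len(mean_luma))
    )
    return SharedAnalysis(cuts, types)

def encode_ladder(mean_luma, heights, stats):
    """Analyze once, then reuse the decisions for every rendition."""
    shared = analyze_source(mean_luma, stats)
    return {h: {"height": h, "frame_types": shared.frame_types} for h in heights}

stats = {"analysis_runs": 0}
ladder = encode_ladder([10, 12, 90, 92], [1080, 720, 540, 360, 270, 180], stats)
print(stats["analysis_runs"])      # 1
print(ladder[360]["frame_types"])  # ('I', 'P', 'I', 'P')
```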
Future work includes extending these techniques to x265 and adding H.265 support for additional video services.
Authors & Recruitment
Zhixing, Shopee Multimedia Center.
Shopee Multimedia Center is actively hiring engineers in video communication, network protocols, transport optimization, codec algorithms, and computer‑vision. Interested candidates can apply via the official recruitment portal or email [email protected].
Shopee Tech Team
How to innovate and solve technical challenges in diverse, complex overseas scenarios? The Shopee Tech Team will explore cutting‑edge technology concepts and applications with you.