
Evolution of Kuaishou's Video Infrastructure: Architecture, Fine‑grained Operations, and Infrastructure Promotion

This talk outlines the evolution of Kuaishou's video infrastructure: from its early service-oriented design, through workflow-engine and FaaS upgrades, to fine-grained, ROI-driven resource management and infrastructure promotion. It highlights parallels with the paths taken at Netflix and Facebook and points to a future direction based on functional declaration.

Kuaishou Tech

Over the past decade, rising device compute and network capacity have created a global video era with increasingly complex business models; large‑scale video companies now rely on robust video infrastructure to support rapid iteration.

At LiveVideoStackCon 2021, Huang Qi, Kuaishou's short‑video architecture lead, presented the core capabilities of video infrastructure, its evolution, and future expectations.

The infrastructure sits above the IaaS layer, consuming storage, compute, and bandwidth, and functions as a PaaS that provides platform support for live streaming, short video, and RTC services.

Three primary goals drive its design: (1) adapt quickly to business changes and enable high-speed iteration, (2) achieve fine-grained cost reduction across its massive resource usage, and (3) push IaaS upgrades to remove performance and cost bottlenecks.

In the post‑pandemic era, video volume continues to grow while overall growth slows, prompting a shift from aggressive expansion to resource consolidation and efficiency improvement.

1. Video Infrastructure Basics

The first generation bundled tools such as FFmpeg, MP4Box, and Shaka into independent services, exposing them via configuration to decouple business logic.

While this reduced iteration friction, scaling the bundled toolchain introduced maintenance overhead, complex dependency management, and blurred module boundaries.

The second generation introduced a workflow engine and a FaaS-based compute platform, turning each media capability into a function within a directed acyclic graph (DAG) pipeline, which improved both decoupling and performance.
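To make the idea concrete, here is a minimal sketch of media capabilities expressed as functions wired into a DAG and executed in dependency order. It is an illustration only, not Kuaishou's engine: the function names (`probe`, `transcode`, `thumbnail`, `package`), the context dictionary, and the `run_pipeline` helper are all hypothetical, and a real FaaS platform would run independent nodes in parallel on separate workers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical media "functions". Real ones would wrap FFmpeg-style tools;
# here each one just adds a field to a shared context dictionary.
def probe(ctx: dict) -> dict:
    return {**ctx, "duration_s": 12}

def transcode(ctx: dict) -> dict:
    return {**ctx, "renditions": ["720p", "360p"]}

def thumbnail(ctx: dict) -> dict:
    return {**ctx, "thumbnail": "frame_0.jpg"}

def package(ctx: dict) -> dict:
    summary = f"{len(ctx['renditions'])} renditions + {ctx['thumbnail']}"
    return {**ctx, "manifest": summary}

FUNCTIONS: Dict[str, Callable[[dict], dict]] = {
    "probe": probe,
    "transcode": transcode,
    "thumbnail": thumbnail,
    "package": package,
}

# Each node maps to the set of nodes it depends on.
DAG: Dict[str, Set[str]] = {
    "probe": set(),
    "transcode": {"probe"},
    "thumbnail": {"probe"},
    "package": {"transcode", "thumbnail"},
}

def run_pipeline(
    dag: Dict[str, Set[str]],
    functions: Dict[str, Callable[[dict], dict]],
    ctx: dict,
) -> Tuple[List[str], dict]:
    """Run every node after its dependencies (sequentially, for simplicity)."""
    order = list(TopologicalSorter(dag).static_order())
    for name in order:
        ctx = functions[name](ctx)
    return order, ctx

order, result = run_pipeline(DAG, FUNCTIONS, {"video_id": "v1"})
```

Adding a new capability in this style means registering one function and one edge, without touching the other nodes, which is the decoupling benefit the workflow-engine design is after.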

These upgrades yielded a >2× boost in algorithm rollout speed and a >2× improvement in developer productivity, while reducing code size to one‑third of its original footprint.

Comparisons with Netflix and Facebook show a convergent evolution path: early service encapsulation followed by workflow‑engine + function‑as‑a‑service integration.

2. Fine‑grained Operations

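Resource optimization follows an ROI model over three resources: compute, storage, and bandwidth. During a video's "climb" phase, when playback is rising, extra compute and storage are spent on high-heat transcoding (re-encoding hot videos at a higher compression rate) in exchange for bandwidth savings; in the "silence" phase, when playback has tapered off, storage is reclaimed at the cost of some bandwidth.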

AI‑driven playback‑volume prediction feeds an ROI calculator that decides whether high‑heat transcoding or storage reclamation is worthwhile. Deploying this logic saved ~30% of compute resources while increasing high‑compression‑rate coverage by 15 percentage points.
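The decision described above can be sketched as a simple ratio of predicted bandwidth savings to the compute and storage spent. This is a hedged illustration under stated assumptions, not the model from the talk: the cost fields, the `gb_saved_per_play` parameter, the retention window, and the threshold of 1.0 are all hypothetical, and the predicted play count is assumed to come from a separate forecasting model.

```python
from dataclasses import dataclass

@dataclass
class TranscodeCosts:
    """Hypothetical unit costs, all in the same currency."""
    compute_cost: float           # one-off cost of the extra high-compression encode
    storage_cost_per_day: float   # cost of keeping the extra rendition for one day
    bandwidth_cost_per_gb: float  # cost of delivering one GB to viewers

def hot_transcode_roi(
    predicted_plays: float,
    gb_saved_per_play: float,
    retention_days: int,
    costs: TranscodeCosts,
) -> float:
    """Return bandwidth savings divided by compute-plus-storage spend."""
    spend = costs.compute_cost + costs.storage_cost_per_day * retention_days
    if spend <= 0:
        raise ValueError("spend must be positive to compute an ROI")
    saving = predicted_plays * gb_saved_per_play * costs.bandwidth_cost_per_gb
    return saving / spend

def should_transcode(
    predicted_plays: float,
    gb_saved_per_play: float,
    retention_days: int,
    costs: TranscodeCosts,
    threshold: float = 1.0,  # assumed break-even point
) -> bool:
    """Transcode only when the predicted return exceeds the threshold."""
    roi = hot_transcode_roi(predicted_plays, gb_saved_per_play, retention_days, costs)
    return roi > threshold

costs = TranscodeCosts(
    compute_cost=0.5, storage_cost_per_day=0.01, bandwidth_cost_per_gb=0.02
)
hot = should_transcode(100_000, 0.002, 30, costs)   # widely played video
cold = should_transcode(1_000, 0.002, 30, costs)    # rarely played video
```

With these made-up numbers the heavily played video clears the threshold and the rarely played one does not, which is how skipping low-return encodes can save compute without reducing coverage where it matters.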

Similarly, a storage‑recovery ROI model identified videos whose silence‑phase storage could be reclaimed, cutting storage costs by ~50%.
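A storage-recovery pass might look like the following sketch, which compares the storage cost saved by deleting a rendition against the extra bandwidth expected from residual plays. The `SilentVideo` fields, unit costs, and `min_roi` cutoff are hypothetical names introduced for illustration; the source does not describe the actual model inputs.

```python
from typing import List, NamedTuple

class SilentVideo(NamedTuple):
    """Hypothetical per-video inputs for a silence-phase reclamation check."""
    video_id: str
    rendition_gb: float       # size of the rendition that could be deleted
    predicted_plays: float    # forecast plays over the evaluation horizon
    extra_gb_per_play: float  # extra bandwidth per play once it is deleted

def reclaim_candidates(
    videos: List[SilentVideo],
    storage_cost_per_gb: float,
    bandwidth_cost_per_gb: float,
    min_roi: float = 1.0,  # assumed break-even point
) -> List[str]:
    """Return IDs of videos where reclaiming storage outweighs added bandwidth."""
    picked = []
    for v in videos:
        saving = v.rendition_gb * storage_cost_per_gb
        penalty = v.predicted_plays * v.extra_gb_per_play * bandwidth_cost_per_gb
        # With no expected plays there is no bandwidth penalty,
        # so the rendition is always worth reclaiming.
        if penalty == 0 or saving / penalty >= min_roi:
            picked.append(v.video_id)
    return picked

videos = [
    SilentVideo("a", rendition_gb=1.0, predicted_plays=0, extra_gb_per_play=0.01),
    SilentVideo("b", rendition_gb=1.0, predicted_plays=10_000, extra_gb_per_play=0.01),
    SilentVideo("c", rendition_gb=0.5, predicted_plays=10, extra_gb_per_play=0.01),
]
to_reclaim = reclaim_candidates(
    videos, storage_cost_per_gb=0.05, bandwidth_cost_per_gb=0.02
)
```

Video "b" is kept because its forecast plays would cost more in bandwidth than the storage it frees, while "a" and "c" are reclaimed.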

3. Infrastructure Promotion

Beyond cost reduction, Kuaishou pushes foundational upgrades: it is expanding network nodes to cut upload latency by 20% and improve success rates, and adopting new technologies such as serverless compute, mixed-resource pools, custom KTP networking protocols, and edge computing for RTC.

Hardware acceleration experiments since 2020 (e.g., custom video‑codec accelerators) mirror similar efforts at Facebook and YouTube.

Looking ahead, the goal is a fully functional‑declaration‑based platform where business only specifies *what* resources are needed, leaving *how* they are produced to the underlying system, further simplifying resource management and enabling richer cross‑domain capabilities.
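One way to picture the *what* versus *how* split is the sketch below, where a caller declares the resource it wants and the platform decides whether to reuse or produce it. Everything here is hypothetical: the `ResourceSpec` fields, the `Platform.ensure` method, and the `store://` URI scheme are invented for illustration and do not describe an existing Kuaishou API.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class ResourceSpec:
    """What the business wants; says nothing about how it is produced."""
    video_id: str
    codec: str
    height: int

class Platform:
    """Toy platform that owns the 'how': reuse if present, else produce."""

    def __init__(self) -> None:
        self._store: Dict[ResourceSpec, str] = {}
        self.log: List[str] = []

    def ensure(self, spec: ResourceSpec) -> str:
        if spec in self._store:
            self.log.append(f"reuse {spec.codec}/{spec.height}p")
            return self._store[spec]
        # A real system would schedule a transcode workflow here.
        self.log.append(f"produce {spec.codec}/{spec.height}p")
        uri = f"store://{spec.video_id}/{spec.codec}_{spec.height}p.mp4"
        self._store[spec] = uri
        return uri

platform = Platform()
spec = ResourceSpec(video_id="v1", codec="hevc", height=720)
first = platform.ensure(spec)   # not present yet, so it is produced
second = platform.ensure(spec)  # same declaration, so it is reused
```

Because callers only state the desired result, the platform is free to change how and when the resource is produced, for example deferring or skipping work based on resource cost.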

In summary, Kuaishou’s video infrastructure evolution demonstrates how systematic architectural decoupling, AI‑guided fine‑grained operations, and proactive infrastructure upgrades can achieve massive efficiency gains and set a roadmap for future media‑centric platforms.
