
How Douyin Handled 70B Red Packet Interactions in 27 Days with Cloud‑Native Magic

In just 27 days, Douyin and Volcano Engine's cloud‑native team built a Kubernetes‑based, elastically scalable infrastructure that supported 70.3 billion red‑packet interactions and 1.221 billion live‑stream views during the 2021 Spring Festival Gala, with zero downtime and a seamless user experience.

Volcano Engine Developer Services

27 Days of Technical Miracle

During the 2021 Spring Festival Gala, Douyin recorded 70.3 billion red‑packet interactions and 1.221 billion live‑stream views. The Volcano Engine cloud‑native team built the infrastructure that kept the service stable and responsive.

Extreme Elastic Cloud‑Native Infrastructure

In 27 days, 120,000 servers were provisioned. Two factors made this possible: (1) Kubernetes and containers had been adopted as the unified runtime for almost all stateless services, making the deployment fully cloud‑native; (2) the team had developed a highly elastic scaling capability, derived from Douyin and Toutiao production practice.

Offline resource borrowing: idle machines used for offline tasks (e.g., model training) were repurposed within five minutes for online workloads.
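The borrowing idea can be illustrated with a small sketch. This is not Volcano Engine's scheduler: the `Node` type, the `role` and `busy` fields, and both function names are hypothetical, and a real system would also have to drain or checkpoint offline jobs before handing a machine over.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str           # "offline" (e.g. model training) or "online"
    busy: bool = False  # True while an offline job is still running


def borrow_for_online(nodes, needed):
    """Relabel up to `needed` idle offline nodes as online and return them."""
    borrowed = []
    for node in nodes:
        if len(borrowed) == needed:
            break
        # Only idle offline machines are eligible; busy ones are left alone.
        if node.role == "offline" and not node.busy:
            node.role = "online"
            borrowed.append(node)
    return borrowed


def return_to_offline(borrowed):
    """Hand borrowed nodes back to the offline pool after the peak."""
    for node in borrowed:
        node.role = "offline"
```

In this sketch the online pool may receive fewer nodes than requested when too few offline machines are idle, so a caller would need to check the length of the returned list.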

Online mixed‑tenant sharing: surplus capacity from other services was allocated to the gala workload using FaaS + Virtual Kubernetes and warm‑up pools.
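A warm‑up pool trades a little idle capacity for faster scale‑out: instances are started ahead of time so that a burst of demand does not wait on cold starts. The sketch below is a simplified illustration only. The `WarmPool` class and its methods are hypothetical and do not represent the FaaS or Virtual Kubernetes interfaces mentioned above.

```python
from collections import deque
from itertools import count


class WarmPool:
    """Keeps `target_size` pre-started instances ready to hand out."""

    def __init__(self, target_size):
        self.target_size = target_size
        self.cold_starts = 0      # how often a request had to wait for a start
        self._ids = count(1)
        self._idle = deque()
        self.refill()

    def _start_instance(self):
        # Stand-in for an expensive operation such as starting a container.
        return f"instance-{next(self._ids)}"

    def refill(self):
        """Top the pool back up to its target size (run in the background)."""
        while len(self._idle) < self.target_size:
            self._idle.append(self._start_instance())

    def acquire(self):
        """Return a warm instance if one is idle, otherwise cold-start one."""
        if self._idle:
            return self._idle.popleft()
        self.cold_starts += 1
        return self._start_instance()

    def idle_count(self):
        return len(self._idle)
```

Sizing the pool is the main design choice: a larger `target_size` absorbs bigger bursts but holds more capacity idle, which is why pairing it with borrowed surplus capacity is attractive.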

Resource utilization diagram

High‑Performance Storage for Billions of Requests

To handle 70.3 billion red‑packet interactions, a self‑developed Redis system provided massive caching, supporting over 2.5 PB of data. A storage‑compute separation architecture placed persistent data in pooled storage while keeping compute nodes stateless.
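To make the storage‑compute separation concrete, here is a minimal sketch in which compute nodes hold no data of their own and every read and write goes to a shared pool. The `StoragePool` and `ComputeNode` classes and the `balance:` key format are assumptions made for illustration; an in‑memory dictionary stands in for the pooled storage, and the read‑modify‑write is not atomic as it would need to be in production.

```python
class StoragePool:
    """Stand-in for pooled persistent storage shared by all compute nodes."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=0):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class ComputeNode:
    """Stateless request handler: it keeps a reference to storage, not data."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage

    def add_reward(self, user_id, amount):
        # Hypothetical key layout. Not atomic: a real system would use an
        # atomic increment or a transaction here.
        key = f"balance:{user_id}"
        new_balance = self.storage.get(key) + amount
        self.storage.put(key, new_balance)
        return new_balance
```

Because no node owns any state, a node can be added, removed, or replaced without migrating data, which is what lets the compute layer scale elastically.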

The team also deployed a self‑built cloud database that combined a multi‑master architecture with RDMA‑accelerated hardware, reaching tens of millions of QPS. Additionally, a distributed graph database (ByteGraph) and an object storage service stored petabytes of short‑video content.

Distributed storage diagram

Edge‑Centric Traffic Engineering

Traffic peaked at three times the normal daily load. The team used a custom edge‑cloud acceleration line, multi‑level traffic grading, automatic disaster recovery, and fine‑grained IP‑based routing to balance load across the edge aggregation sites and the core IDCs.
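Multi‑level traffic grading generally means classifying requests by importance and shedding the least important ones first as load rises. The sketch below shows that general pattern, not the team's actual policy: the three tiers, the 0.80 and 0.95 thresholds, and the function names are all hypothetical.

```python
# Tier 0: core traffic (for example, red-packet claims)
# Tier 1: important but deferrable traffic
# Tier 2: non-essential traffic
ALL_TIERS = frozenset({0, 1, 2})


def admitted_tiers(load_ratio):
    """Return the tiers still admitted at a given load.

    `load_ratio` is current load divided by capacity. The thresholds are
    illustrative assumptions, not figures from the article.
    """
    if load_ratio < 0.80:
        return ALL_TIERS
    if load_ratio < 0.95:
        return frozenset({0, 1})
    return frozenset({0})


def admit(tier, load_ratio):
    """Decide whether a request of the given tier should be served."""
    return tier in admitted_tiers(load_ratio)
```

With this shape, core traffic is never rejected by the grading step itself, so overload protection for tier 0 would have to come from capacity planning or a separate rate limit.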

Conclusion

The 27‑day effort demonstrated that cloud‑native design, elastic resource management, and coordinated multi‑region deployment can meet extreme online‑service demands. Volcano Engine plans to open this platform to external enterprises.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
