How Tencent CDN Handles Tb‑Level Traffic Bursts with a Docker‑Powered Burst Pool
This article explains how Tencent CDN tackles ever‑growing Tb‑scale traffic spikes by virtualizing resources into a shared Docker‑based burst pool, detailing the challenges, architectural solutions, technical optimizations, and the resulting cost savings and rapid scaling capabilities.
Background
With the rapid upgrade of home broadband and mobile networks, traffic bursts in CDN services now frequently reach the terabit-per-second (Tb) range, sometimes up to 10 Tb. Supporting such spikes quickly and cost‑effectively is a major challenge for CDN operators.
Challenges and Problems
CDN traffic bursts are characterized by:
Large volume: Most burst traffic exceeds 1 Tb, with some events reaching 10 Tb.
Diverse scenarios: Includes hot video on demand, news spikes, live game streams (e.g., LOL, DOTA2), sports events, concerts, and large‑scale downloads.
Irregular timing: Many bursts cannot be predicted in advance.
These characteristics demand more resources, flexible provisioning, and fast scaling. Traditional approaches of reserving massive resources lead to high costs and waste. Simple resource reuse faces two problems:
Only partial resources can be reused because different services have distinct requirements (e.g., storage for VOD, CPU for static pages).
Cost reduction is limited, especially for services with predictable off‑peak periods.
Solution
Tencent CDN built a virtualized, Docker‑based “burst pool” that is shared across all platforms. The pool provides a 10 Tb bandwidth reserve and can be expanded within ten minutes via an automated provisioning interface.
Burst Pool Architecture
The architecture consists of:
Burst pool: Docker containers (called VMs in this article) running on top of physical machines, with CPU, memory, and disk usage capped so that they do not affect the host machines.
Automated deployment and monitoring: Predicts demand, expands resources within ten minutes, and distributes hot files for VOD/download services to reduce origin bandwidth.
Scheduling system: Uses a “direct‑pass” (直通车) scheduler that can quickly return a 302 redirect to the appropriate CDN node, achieving minute‑level response times.
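The redirect step of the scheduler can be illustrated with a minimal sketch. The article does not describe the scheduler's internals, so the `Node` fields, the "most spare bandwidth" selection rule, the 503 fallback, and the example hostnames are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    host: str             # hypothetical node hostname
    capacity_gbps: float  # bandwidth the node can serve
    load_gbps: float      # bandwidth currently in use


def pick_node(nodes: List[Node]) -> Optional[Node]:
    """Return the node with the most spare bandwidth, or None if all are full."""
    candidates = [n for n in nodes if n.load_gbps < n.capacity_gbps]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.capacity_gbps - n.load_gbps)


def redirect_response(nodes: List[Node], path: str) -> Tuple[int, Dict[str, str]]:
    """Build a (status, headers) pair that redirects the request to a node."""
    node = pick_node(nodes)
    if node is None:
        # Assumed fallback: no node has spare capacity.
        return 503, {}
    return 302, {"Location": f"http://{node.host}{path}"}
```

Because the client follows the `Location` header itself, the scheduler only has to change which node it names in order to shift traffic. This also explains why domains that cannot handle 302 redirects need separate treatment.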
Each VM and physical host reports load metrics every minute. The monitoring system compares current bandwidth with predicted values; if usage exceeds 150 % of the forecast, the system automatically allocates additional resources from the burst pool.
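The trigger rule above can be sketched as a small function. Only the 150 % threshold comes from the article. The sizing rule (cover the gap between current and forecast bandwidth) and the per-node capacity of 10 Gb are assumptions chosen to keep the example concrete.

```python
import math

BURST_RATIO = 1.5  # 150 % of the forecast, as stated in the article


def burst_nodes_needed(current_gbps: float,
                       forecast_gbps: float,
                       node_capacity_gbps: float = 10.0) -> int:
    """Return how many burst-pool nodes to allocate, or 0 if no burst is detected.

    Assumption: existing capacity is sized to the forecast, so the
    shortfall is the traffic above the forecast.
    """
    if current_gbps <= forecast_gbps * BURST_RATIO:
        return 0
    shortfall = current_gbps - forecast_gbps
    return math.ceil(shortfall / node_capacity_gbps)
```

For example, with a forecast of 1000 Gb, current traffic of 1400 Gb stays below the threshold and triggers nothing, while 1600 Gb exceeds it and, under the assumed sizing rule, requests 60 nodes.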
Technical Optimizations
To ensure isolation and stability, several techniques were applied:
Precise load control: A quota system limits CPU, I/O, and bandwidth per VM; overload requests receive a 302 redirect.
NIC flow control: In extreme cases, the virtual NIC drops packets to protect the host.
Disk size limitation: Quotas on the ext3/ext4 filesystems used by Docker apply per user or per group rather than per VM, so loop devices are used to cap the disk space each VM can consume.
CPU binding: Scripts collect per‑CPU load every minute, compute a 15‑minute average, and bind VMs to less‑loaded cores via cpuset.cpus, minimizing impact on the host.
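The CPU binding step can be sketched as follows. The article only states that per‑CPU load is sampled every minute, averaged over 15 minutes, and used to choose less‑loaded cores for `cpuset.cpus`. The class name, the tie‑breaking rule, and the way samples are collected are assumptions.

```python
from collections import deque
from statistics import mean
from typing import List

WINDOW = 15  # one sample per minute gives a 15-minute average


class CpuLoadTracker:
    """Keeps the last WINDOW per-minute load samples for every CPU."""

    def __init__(self, num_cpus: int) -> None:
        self.samples = [deque(maxlen=WINDOW) for _ in range(num_cpus)]

    def record(self, per_cpu_load: List[float]) -> None:
        """Store one per-minute sample (one load value per CPU)."""
        for history, load in zip(self.samples, per_cpu_load):
            history.append(load)

    def averages(self) -> List[float]:
        """Average load per CPU over the samples seen so far."""
        return [mean(h) if h else 0.0 for h in self.samples]

    def least_loaded(self, count: int) -> List[int]:
        """Return the indices of the `count` least-loaded CPUs, in ascending order."""
        avg = self.averages()
        # Ties are broken by CPU index (an assumption, for determinism).
        ranked = sorted(range(len(avg)), key=lambda cpu: (avg[cpu], cpu))
        return sorted(ranked[:count])


def cpuset_value(cpus: List[int]) -> str:
    """Format CPU indices as a comma-separated list accepted by cpuset.cpus."""
    return ",".join(str(c) for c in cpus)
```

The resulting string could then be written to the container's `cpuset.cpus` cgroup file or passed to Docker's `--cpuset-cpus` option. Which of the two the original scripts use is not stated in the article.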
Results
After launching the burst pool, Tencent CDN supported large‑scale events such as Honor of Kings (王者荣耀) downloads, NBA live streams, and KPL/LPL game broadcasts, saving approximately 20 million CNY. The shared buffer and rapid scaling improved burst handling capability while reducing expenses.
Conclusion
By leveraging Docker virtualization, Tencent CDN created a Tb‑level burst pool that supports live, on‑demand, and static services, automatically detects burst demand, and completes resource expansion within ten minutes. The approach offers fast deployment, low cost, and high scalability, but it depends on real‑time monitoring and strong isolation. Planned improvements include kernel‑level container isolation and support for domains that cannot handle 302 redirects.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
