How Weibo Handles Billion‑Scale Short Video Traffic: High‑Concurrency Architecture Deep Dive

This article explains how Weibo's video team designs a highly available, high‑concurrency architecture for short‑video services, covering team responsibilities, business scenarios, microservice design, caching layers, multi‑data‑center HA, and circuit‑breaker mechanisms to sustain unpredictable traffic spikes.

21CTO

Team Introduction

We are a technical team within Weibo's R&D video platform, responsible for core video services such as video posts, "Weibo Stories", short videos, and live streaming, as well as the underlying video platform infrastructure (file platform, transcoding, scheduling, media library).

Our goal is to let Weibo absorb millions of newly uploaded videos per day and support diverse custom requirements.

Business Scenario

The video service must cope with sudden traffic surges caused by hot events (e.g., celebrity rumors, breaking news). Simple server scaling is insufficient because over‑provisioning wastes resources during low traffic, while under‑provisioning risks crashes during spikes.

These surges are unpredictable, unlike scheduled high‑traffic events such as "Double 11".

"Weibo Stories" Architecture Design

The service is built as a microservice system. The interface layer mixes Web API and internal RPC calls. A façade layer aggregates several vertical microservices, each exposing specific functionality. Services owned by other departments (e.g., the user follow service) are accessed via RPC. The storage layer combines cache and database.

Technical Challenges

Consider a typical scenario: each user follows 500 accounts, and the homepage is refreshed 100,000 times per second. If every refresh has to check each followed account, that is 500 × 100,000 = 50 million lookups per second, before counting expiration checks and ordering. This is far more than a naive design can serve.
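The estimate above is a single multiplication. The short snippet below just makes the arithmetic explicit; the variable names are illustrative, and the one-lookup-per-followed-account assumption is the naive case, not a measured figure.

```python
# Figures taken from the scenario in the text.
followed_accounts = 500          # accounts each user follows
refreshes_per_second = 100_000   # homepage refreshes per second

# Naive pull: one lookup per followed account on every refresh.
lookups_per_second = followed_accounts * refreshes_per_second

print(f"{lookups_per_second:,} lookups per second")
```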

Solution Comparison

We considered two Feed models:

Feed Push Model: each new video is pushed to every follower at publish time, which becomes infeasible when a user has tens of millions of followers.

Feed Pull Model: followers pull the latest videos on demand when they refresh. Given Weibo's massive user base and the need for consistency, we chose the Pull model.
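To make the trade-off concrete, here is a minimal in-memory sketch of both models. The class and method names (`PushFeed`, `PullFeed`, `publish`, `read`) are hypothetical and are not Weibo's implementation; the point is only where the fan-out cost lands.

```python
from collections import defaultdict


class PushFeed:
    """Fan-out on write: every post is copied into each follower's inbox."""

    def __init__(self, followers):
        self.followers = followers            # author -> list of follower ids
        self.inboxes = defaultdict(list)      # follower -> [(timestamp, video_id)]

    def publish(self, author, ts, video_id):
        # Cost grows with the follower count of the author.
        for follower in self.followers.get(author, []):
            self.inboxes[follower].append((ts, video_id))

    def read(self, user, limit=10):
        newest_first = sorted(self.inboxes[user], reverse=True)
        return [video_id for _, video_id in newest_first[:limit]]


class PullFeed:
    """Fan-out on read: a post is stored once and merged at read time."""

    def __init__(self, following):
        self.following = following            # user -> list of followed authors
        self.outboxes = defaultdict(list)     # author -> [(timestamp, video_id)]

    def publish(self, author, ts, video_id):
        # A single write, regardless of how many followers the author has.
        self.outboxes[author].append((ts, video_id))

    def read(self, user, limit=10):
        # Cost grows with the number of accounts the reader follows.
        merged = []
        for author in self.following.get(user, []):
            merged.extend(self.outboxes[author])
        merged.sort(reverse=True)
        return [video_id for _, video_id in merged[:limit]]
```

With push, one post by an account with tens of millions of followers triggers tens of millions of writes. With pull, the same post is one write, and the cost moves to reads, which is why the read path needs the caching described in the following sections.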

Feed Pull Model Implementation

We use a distributed cache: keys are partitioned across shards by hash, and each read is then served from the shard (slice) that owns the key.
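A minimal sketch of hash-based partitioning, assuming a fixed number of shards and simple modulo placement. The article does not say which hash function or placement scheme is used, so `shard_for`, `group_by_shard`, and the key format below are illustrative only.

```python
import zlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a cache key to a shard index with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def group_by_shard(keys, num_shards):
    """Group keys by owning shard so each shard gets one batched request."""
    groups = {}
    for key in keys:
        groups.setdefault(shard_for(key, num_shards), []).append(key)
    return groups


# Hypothetical key format: one key per followed author.
keys = [f"stories:{author_id}" for author_id in range(500)]
batches = group_by_shard(keys, num_shards=8)
print({shard: len(batch) for shard, batch in sorted(batches.items())})
```

Grouping by shard turns 500 single-key lookups into at most one batched request per shard. Modulo placement remaps most keys when the shard count changes; consistent hashing is a common alternative when shards are added or removed often.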

Distributed Cache Architecture

We use a three‑level cache: L1 (hot cache, ~200 MB, LRU eviction), Master (≈4 GB), and Slave (≈6 GB). L1 handles the hottest data and can be horizontally scaled quickly during spikes. Master/Slave provide larger capacity to avoid cold‑data misses.
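The lookup order implied by the three tiers can be sketched as follows. This is an in-process illustration, not the production setup: capacities are counted in entries rather than bytes, all three tiers use the same hypothetical `LRUCache` class, and `load_from_db` stands in for the database read.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache; capacity is a number of entries, not bytes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)            # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)     # evict least recently used


class TieredCache:
    """Read through L1, then Master, then Slave, then the database."""

    def __init__(self, l1, master, slave, load_from_db):
        self.tiers = [l1, master, slave]
        self.load_from_db = load_from_db

    def get(self, key):
        for i, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                # Backfill the faster tiers that missed.
                for upper in self.tiers[:i]:
                    upper.set(key, value)
                return value
        value = self.load_from_db(key)
        if value is not None:
            for tier in self.tiers:
                tier.set(key, value)
        return value
```

Because L1 is small, adding L1 nodes during a spike is cheap, while the larger Master and Slave tiers keep evicted keys from falling through to the database.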

Cache nodes are deployed across two IDC data centers (IDC‑A and IDC‑B) with master‑slave synchronization, forming an HA multi‑data‑center setup. Synchronization ensures consistent hot‑data metrics across sites.

Cache Technology Choice

We selected a customized MC (Memcached) cache over Redis because MC offers higher throughput for simple key‑value access at massive scale, despite its limitations with highly mutable data.

Elastic Scaling Platform (DCP)

Our self‑developed DCP platform provides both scheduled (peak‑hour) and on‑demand elastic scaling. When internal resources are exhausted, we seamlessly integrate Alibaba Cloud resources for hybrid‑cloud scaling.
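The article does not describe DCP's interface, so the following is only a sketch of the "internal first, cloud for the overflow" decision described above. The function `plan_scale_out` and all of its parameters are hypothetical.

```python
import math


def plan_scale_out(expected_qps, qps_per_instance, running, internal_free):
    """Return (instances_from_internal_pool, instances_from_public_cloud)."""
    needed = math.ceil(expected_qps / qps_per_instance)
    shortfall = max(0, needed - running)
    from_internal = min(shortfall, internal_free)   # use own capacity first
    from_cloud = shortfall - from_internal          # overflow goes to the cloud
    return from_internal, from_cloud


# 50,000 QPS expected, 1,000 QPS per instance, 20 running, 10 spare internally.
print(plan_scale_out(50_000, 1_000, running=20, internal_free=10))
```

The same function serves both modes: a scheduled job would call it with the forecast peak, and an on-demand trigger would call it with the currently observed load.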

Microservice Circuit‑Breaker Mechanism

A circuit breaker, similar to an electrical fuse, monitors each service's load against a threshold (e.g., 3000 QPS). If a service exceeds the limit, it is temporarily disabled so that the overload cannot spread to the rest of the system. After load drops or scaling completes, the service is restored.
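A minimal sketch of a load-based breaker matching the description above: it counts requests over the last second, trips when the count reaches the threshold, and recovers after a cooldown. The class name, the fixed cooldown, and the injectable `clock` are assumptions for illustration. Many production breakers trip on error rate or latency rather than raw QPS.

```python
import time
from collections import deque


class QpsCircuitBreaker:
    """Rejects calls for cooldown_s seconds once max_qps is reached."""

    def __init__(self, max_qps=3000, cooldown_s=5.0, clock=time.monotonic):
        self.max_qps = max_qps
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.recent = deque()                 # timestamps within the last second
        self.open_until = float("-inf")       # breaker starts closed

    def allow(self) -> bool:
        now = self.clock()
        if now < self.open_until:
            return False                      # breaker open: fail fast
        # Drop timestamps older than one second.
        while self.recent and now - self.recent[0] >= 1.0:
            self.recent.popleft()
        if len(self.recent) >= self.max_qps:
            # Threshold reached: open the breaker and reject this call.
            self.open_until = now + self.cooldown_s
            self.recent.clear()
            return False
        self.recent.append(now)
        return True
```

A caller checks `allow()` before invoking the protected service and returns a degraded or default response when it is `False`. In a setup like the one described, the cooldown could be replaced by a signal that scaling has completed.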
