How Alibaba Ensured Fair, High‑Performance Live Streaming for Double 11 Cat Night
This article details how Alibaba’s entertainment team designed and implemented a robust, cloud‑native architecture for the 2019 Double 11 “Cat Night” live‑stream, addressing technical goals, fairness, high‑concurrency spikes, dynamic routing, peak‑shaving, downstream protection, and multi‑screen interaction to deliver a seamless consumer experience.
Background and Scale
The 2019 “Cat Night”, the Double 11 entertainment show, attracted viewers from more than 200 countries and regions; 51.44 million users watched the charity live stream, and the event collected more than 100 million likes. Although the system ran for only one day, it had to deliver both visual spectacle and commercial value.
Defining Technical Goals
The product team set three primary objectives: provide a visual feast for consumers, deliver tangible benefits (e.g., product sales and charity rewards), and ensure merchants can successfully promote goods. These business goals were broken down into technical targets such as stability, experience consistency, full reward distribution, no financial loss, team growth, and system knowledge retention.
Goal‑Setting Process
The technical team first examined industry trends, market size, business requirements, and team capabilities. Then they decomposed the goals, identified key metrics and responsible teams, and finally quantified challenging indicators with milestone deadlines.
Ensuring Fair and Consistent Experience
Because the show ran on multiple apps (Taobao, Youku, Tmall), all interactive features had to appear simultaneously on each platform. Fairness was threatened by network latency, signal delay, and production delays. Four mechanisms were introduced:
Synchronize client and server clocks via CSN and wireless RPC gateway polling.
Measure and compensate for on‑site and production latency.
Align operation events through repeated rehearsals with hosts and directors.
Insert SEI timestamps into the live stream so the client can start interactions at the correct moment.
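The clock synchronization and SEI steps above can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: the `SyncSample` fields, the lowest-round-trip heuristic, and the idea of comparing a frame's SEI timestamp against a scheduled start time are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SyncSample:
    """One time-sync round trip (hypothetical structure)."""
    client_send_ms: float  # client clock when the request left
    server_time_ms: float  # server clock reported in the response
    client_recv_ms: float  # client clock when the response arrived


def estimate_offset_ms(samples: list[SyncSample]) -> float:
    """Estimate (server clock - client clock) in milliseconds.

    Uses the sample with the smallest round trip, since it is the one
    least distorted by network delay, and assumes the server stamped
    its time halfway through that round trip.
    """
    best = min(samples, key=lambda s: s.client_recv_ms - s.client_send_ms)
    midpoint = (best.client_send_ms + best.client_recv_ms) / 2
    return best.server_time_ms - midpoint


def interaction_due(frame_sei_ts_ms: float, scheduled_start_ms: float) -> bool:
    """Return True once the frame being rendered carries an SEI timestamp
    at or past the scheduled start of the interaction.

    Because the timestamp travels inside the stream, every viewer starts
    the interaction at the same point in the show regardless of how far
    behind their stream is.
    """
    return frame_sei_ts_ms >= scheduled_start_ms
```

With samples whose round trips are 100 ms and 20 ms, the function trusts the 20 ms sample, so a single slow response does not skew the estimated offset.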
Handling High‑Concurrency Pulse Traffic
Baseline traffic persisted throughout the event, while each interactive round generated a short pulse of traffic on top of it. The team applied three core tactics: full‑chain pressure testing, application warm‑up, and anti‑abuse rate limiting (blocking scripted or bot requests). Additional optimizations included peak‑shaving, dynamic routing, and downstream protection.
Dynamic Routing
All requests normally passed through the wireless RPC gateway, but the system could dynamically shift a portion of traffic to CSN polling when gateway load approached limits, ensuring stability.
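One way to picture this kind of load-based shifting is a ramp that diverts a growing share of requests as gateway load rises. The sketch below is an assumption-laden illustration: the thresholds (`soft_limit`, `hard_limit`), the cap (`max_share`), and the function names are hypothetical and do not come from the source.

```python
import random
from typing import Callable


def polling_share(
    gateway_load: float,
    soft_limit: float = 0.7,   # assumed load at which diversion starts
    hard_limit: float = 0.9,   # assumed load at which diversion peaks
    max_share: float = 0.5,    # assumed upper bound on diverted traffic
) -> float:
    """Fraction of requests to divert from the gateway to polling.

    Ramps linearly from 0 at soft_limit to max_share at hard_limit.
    gateway_load is utilization in the range 0.0 to 1.0.
    """
    if gateway_load <= soft_limit:
        return 0.0
    if gateway_load >= hard_limit:
        return max_share
    ramp = (gateway_load - soft_limit) / (hard_limit - soft_limit)
    return ramp * max_share


def choose_channel(
    gateway_load: float,
    rng: Callable[[], float] = random.random,
) -> str:
    """Pick a channel for one request given the current gateway load."""
    return "polling" if rng() < polling_share(gateway_load) else "gateway"
```

Capping the diverted share keeps most traffic on the primary path, which matches the description of shifting only a portion of requests.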
Peak‑Shaving and Off‑Peak Strategies
Peak‑shaving involved randomizing client submissions within a time window, limiting per‑user request rates during the red‑packet rain, and pre‑fetching final‑prize queries. Staggering (off‑peak) tactics scheduled public‑domain and private‑domain interactions at different times and used middleware to reduce pressure on private‑domain services.
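Two of the peak-shaving ideas, randomized submission and per-user rate limiting, can be sketched as below. This is only an illustration under assumptions: the fixed-window counter, the class and parameter names, and the window sizes are hypothetical choices, not details from the source.

```python
import random


def jittered_delay_ms(window_ms: int, rng: random.Random) -> float:
    """Pick a random delay inside the window so that clients do not all
    submit at the same instant."""
    return rng.uniform(0, window_ms)


class PerUserLimiter:
    """Fixed-window limiter: at most max_requests per user per interval."""

    def __init__(self, max_requests: int, interval_s: float) -> None:
        self.max_requests = max_requests
        self.interval_s = interval_s
        # user_id -> (window start time, requests counted in that window)
        self._windows: dict[str, tuple[float, int]] = {}

    def allow(self, user_id: str, now_s: float) -> bool:
        start, count = self._windows.get(user_id, (now_s, 0))
        if now_s - start >= self.interval_s:
            # The previous window has expired, so start a fresh one.
            start, count = now_s, 0
        if count >= self.max_requests:
            self._windows[user_id] = (start, count)
            return False
        self._windows[user_id] = (start, count + 1)
        return True
```

For example, with `PerUserLimiter(2, 1.0)` a user's third request inside the same second is rejected, while other users are unaffected. A fixed window is the simplest option; a sliding window or token bucket would smooth bursts at window boundaries.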
Downstream Protection
Since the reward platform was a downstream dependency, the team avoided double calls by separating high‑value and low‑value prize pools, guiding users with more treasure boxes toward the high‑value pool, and monitoring treasure‑box distribution in real time to trigger pre‑emptive measures.
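The pool separation and real-time monitoring described above might look something like the following sketch. The threshold, the pool names, and the alerting rule are assumptions made for illustration; the source does not specify how the split or the trigger was defined.

```python
def choose_pool(box_count: int, high_value_threshold: int) -> str:
    """Send each user to exactly one prize pool, so the downstream
    reward platform is called once per user rather than twice."""
    return "high_value" if box_count >= high_value_threshold else "low_value"


def high_value_share(box_counts: list[int], high_value_threshold: int) -> float:
    """Fraction of users who currently qualify for the high-value pool."""
    if not box_counts:
        return 0.0
    eligible = sum(1 for c in box_counts if c >= high_value_threshold)
    return eligible / len(box_counts)


def needs_intervention(share: float, max_share: float) -> bool:
    """Flag the distribution when more users qualify for the high-value
    pool than it was sized for (max_share is an assumed capacity limit)."""
    return share > max_share
```

Watching the share while treasure boxes are still being collected gives operators time to act before the draw itself hits the reward platform.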
On‑Site Large‑Screen and Small‑Screen Interaction
Multiple rehearsals revealed issues such as signal interference and touch‑screen failures. A four‑level contingency plan was established, ranging from pre‑event touch‑screen checks to hot‑standby machines and manual keyboard control.
Key Takeaways
Contingency plans must be executable, rehearsed, and have backup‑of‑backup options.
Technical PM skills are essential: anticipate risks, define clear metrics, and ensure delivery.
Craftsmanship matters: optimize performance and experience without over‑engineering.
Document tools, components, and processes for future teams.
Continuous post‑mortem and knowledge sharing broaden perspective beyond one’s specialty.
Conclusion
Even a system used only once a year can achieve reliable, fair, and high‑performance operation through disciplined goal setting, rigorous testing, dynamic routing, and comprehensive contingency planning.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.