
Technical Practices and Lessons from the 2019 Double‑11 Cat Night Live Event

The article details the technical goals, fairness mechanisms, high‑concurrency handling, dynamic routing, peak‑shaving, downstream protection, and on‑site screen interaction strategies used to deliver a seamless multi‑platform live‑stream experience for the Double‑11 Cat Night, while summarizing key takeaways for future large‑scale events.


Author: Guo Chao; Editor: Hoh Xil; Source: Alibaba Entertainment Technology; Platform: DataFunTalk. Note: Registration for the Double‑11 Cat Night technical salon is open.

Introduction: The 2019 Cat Night extended beyond Youku to integrate with the Taobao and Tmall apps, achieving multi‑screen, multi‑endpoint, bidirectional interaction and pushing internet‑based event interaction into the 3.0 era.

1. How were technical goals defined? The event aimed to deliver visual spectacle, consumer benefits, and merchant sales. These aims were decomposed into concrete technical goals: business support, stability, experience guarantee, full distribution of rights (prizes), and zero financial loss, while also fostering team growth and system consolidation.

The goal‑setting process involved industry analysis, business analysis, team capability assessment, followed by breaking down objectives, identifying key metrics and levers, and finally quantifying challenging targets with milestone countdowns.

2. How does the technology ensure a fair and consistent experience? Since the interaction spans Youku, Taobao, and Tmall apps, all platforms must launch and close interactions, display content, gameplay, and lottery timing synchronously. The solution uses a unified codebase running on all three platforms, with Youku deploying a lightweight proxy for forwarding and adaptation, while core services run in the corporate data center handling all interactive logic and rights distribution.
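The "one codebase, three platforms" idea can be sketched as a shared interaction core behind thin per‑platform adapters. This is a hypothetical illustration, not the actual Alibaba implementation; the function names and request fields are invented for clarity.

```python
# Hypothetical sketch: a single shared interaction core, with each app
# (Youku, Taobao, Tmall) supplying only a small request-normalization layer.

def normalize(platform: str, raw: dict) -> dict:
    """Map each app's request shape onto one canonical form (fields invented)."""
    adapters = {
        "youku":  lambda r: {"user": r["uid"],    "round": r["round_id"]},
        "taobao": lambda r: {"user": r["userId"], "round": r["roundId"]},
        "tmall":  lambda r: {"user": r["userId"], "round": r["roundId"]},
    }
    return adapters[platform](raw)

def handle_interaction(req: dict) -> dict:
    """Shared core: every platform runs identical interaction/lottery logic."""
    return {"user": req["user"], "round": req["round"], "eligible": True}
```

Because the core never sees platform‑specific shapes, a user gets the same result no matter which app they open, which is the consistency property the article describes.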

Four mechanisms guarantee fairness:

Client and server clocks are synchronized via CSN and wireless RPC gateway polling.

Dedicated on‑site devices repeatedly measure network and production delays at the venue.

Operational staff coordinate event‑trigger clicks and host dialogues with the director team through rehearsals.

Based on measured delays, SEI messages are inserted into the live stream; clients parse SEI to trigger interactions at the correct moment.
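The clock‑synchronization mechanism above can be sketched with a standard NTP‑style offset estimate over repeated polls, which is one plausible way the gateway polling could work (the article does not give the exact algorithm):

```python
def estimate_offset(samples):
    """NTP-style clock offset estimate from repeated polls.

    Each sample is (t0, ts, t1): client send time, server timestamp,
    client receive time, all in seconds. The sample with the smallest
    round trip is the least distorted by network delay, and its offset
    estimate is ts - (t0 + t1) / 2 (assumes roughly symmetric delay).
    """
    best = min(samples, key=lambda s: s[2] - s[0])  # minimal round-trip time
    t0, ts, t1 = best
    return ts - (t0 + t1) / 2
```

With the offset known, a client can convert an SEI timestamp parsed from the live stream into its local clock and fire the interaction at the right instant.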

3. How is high‑concurrency pulse traffic mitigated? The event experiences a steady baseline traffic plus spikes from each interactive round. The mitigation strategy includes three pillars: full‑chain load testing, application pre‑warming, and anti‑brush rate limiting. Additional optimizations specific to the event are:

Routing: By default, all requests go through the wireless RPC gateway; when pressure rises, traffic is dynamically rerouted to CSN to maintain stability.

Peak shaving and staggering: Public and private interactions are scheduled at different times to avoid concurrent spikes during key moments (e.g., the red‑packet rain), and private interactions go through middleware message channels to reduce backend pressure.

Downstream protection: Ensuring the rights platform can handle lottery calls by separating high‑value and low‑value prize pools, monitoring treasure‑box distribution in real time, and triggering pre‑emptive plans if distribution deviates from expectations.
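The real‑time monitoring of treasure‑box distribution can be sketched as a pacing check: compare prizes actually issued against a linear budget schedule and flag a pre‑emptive plan when the deviation is too large. The linear schedule and the tolerance threshold are assumptions for illustration; the article does not specify the actual pacing model.

```python
def distribution_ok(issued, elapsed_s, total_budget, window_s, tolerance=0.2):
    """Hypothetical pacing guard for the prize ("treasure-box") budget.

    issued:       prizes distributed so far
    elapsed_s:    seconds since the round started
    total_budget: prizes planned for the whole window
    window_s:     length of the distribution window in seconds
    tolerance:    allowed relative deviation from the linear expectation

    Returns False when distribution drifts far enough from the plan
    that a pre-emptive contingency should be triggered.
    """
    expected = total_budget * (elapsed_s / window_s)
    if expected == 0:
        return issued == 0
    return abs(issued - expected) / expected <= tolerance
```

A guard like this catches both over‑distribution (budget exhausted early, risking financial loss) and under‑distribution (prizes not reaching users as promised).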

4. What happened with the on‑site large‑screen and small‑screen interaction? Anecdotes describe multiple rehearsal failures (touchscreen glitches, signal noise) and the layered contingency plans, from first‑level touch‑screen repair to fourth‑level backup operators, that ultimately allowed a successful bidirectional interaction during the live show.

5. How to achieve technical knowledge consolidation from a one‑day‑a‑year system? The author suggests: (a) learning to define technical goals; (b) developing technical product‑management skills to anticipate risks; (c) pursuing craftsmanship in performance and solution design; (d) building reusable tools, components, and organizational capabilities; (e) continuous post‑mortem and reflection; (f) broadening perspective beyond one’s specialty.

Conclusion: Effective pre‑planning, rapid response, and simple yet thorough contingency measures are essential for large‑scale live events. Technical teams should continuously iterate on goals, risk mitigation, and knowledge sharing to ensure reliability and growth.


Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
