How Alibaba Powered Double 11 with Real‑Time Big Data Processing
Alibaba’s Double 11 live‑data dashboards required highly accurate, low‑latency real‑time processing of billions of events. This article explains the end‑to‑end architecture (DRC, TimeTunnel, Galaxy, OTS, XTool, and OneService) used to achieve million‑plus QPS, fault tolerance, and flexible data collection.
Real‑time Computing Architecture for Double 11 Data Dashboards
In 2016, Alibaba’s Double 11 live‑data screens served three major audiences: media, merchants, and internal operations. Each demanded extremely precise data, throughput of up to 120,000 transactions per second, and low latency with no tolerance for errors.
Each screen required real‑time aggregation of massive traffic data, including visitor counts, add‑to‑cart numbers, hot‑selling items, traffic sources, and per‑product metrics, while supporting cross‑dimensional analysis for dozens of business units.
Overall Real‑time Processing Pipeline
Key components include:
DRC (Data Replication Center): proprietary data‑flow service for heterogeneous database replication and change‑data capture.
TT (TimeTunnel): scalable Pub/Sub messaging platform based on a producer‑consumer‑topic model.
Galaxy: global stream‑processing engine delivering millisecond‑level latency and trillion‑level daily message volume.
OTS (Open Table Service): massive structured and semi‑structured storage with strong consistency and transactional support.
XTool – Real‑time Aggregation Component
XTool wraps Storm/Trident operators into a configurable XML‑driven topology, providing deduplication, sum, count, max/min, average, ranking, multi‑table joins, dimension tables, and window management without writing code. It also supports exactly‑once processing, replay, and automatic checkpoint handling.
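To make the idea of configuration-driven aggregation concrete, here is a minimal sketch in Python. It is not XTool's actual XML schema or its Storm/Trident implementation: the element names (`topology`, `group-by`, `metric`), the `op` values, and the `build_aggregator` helper are all hypothetical, chosen only to show how a declarative config can select operators such as count, sum, max, and deduplicated counts without per-job code.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical config: every element and attribute name here is invented
# for illustration and does not reflect XTool's real schema.
CONFIG = """
<topology name="shop_metrics">
  <group-by field="shop_id"/>
  <metric name="orders" op="count"/>
  <metric name="gmv" op="sum" field="amount"/>
  <metric name="buyers" op="distinct" field="buyer_id"/>
  <metric name="max_order" op="max" field="amount"/>
</topology>
"""


def build_aggregator(config_xml):
    """Parse the config once and return a function that aggregates events."""
    root = ET.fromstring(config_xml)
    key_field = root.find("group-by").get("field")
    metrics = [(m.get("name"), m.get("op"), m.get("field"))
               for m in root.findall("metric")]

    def aggregate(events):
        state = defaultdict(dict)
        for ev in events:
            row = state[ev[key_field]]
            for name, op, field in metrics:
                if op == "count":
                    row[name] = row.get(name, 0) + 1
                elif op == "sum":
                    row[name] = row.get(name, 0) + ev[field]
                elif op == "max":
                    row[name] = max(row.get(name, ev[field]), ev[field])
                elif op == "distinct":
                    # Deduplicate by keeping the set of values seen so far.
                    row.setdefault(name, set()).add(ev[field])
                else:
                    raise ValueError(f"unsupported op: {op}")
        # Report distinct metrics as counts rather than raw sets.
        return {
            key: {n: (len(v) if isinstance(v, set) else v) for n, v in row.items()}
            for key, row in state.items()
        }

    return aggregate


events = [
    {"shop_id": "s1", "buyer_id": "b1", "amount": 10},
    {"shop_id": "s1", "buyer_id": "b1", "amount": 25},
    {"shop_id": "s1", "buyer_id": "b2", "amount": 5},
    {"shop_id": "s2", "buyer_id": "b3", "amount": 7},
]
result = build_aggregator(CONFIG)(events)
```

Adding a metric in this sketch means adding one line of configuration. A production system would also need the windowing, checkpointing, and replay features the article attributes to XTool, none of which are modeled here.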
Performance optimizations such as hash‑bucketed deduplication, Bloom‑filter caching, LRU result caching, and local‑or‑shuffle execution enable throughput ranging from millions to tens of millions of QPS with low latency and high stability.
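The article does not describe how these optimizations are implemented, so the following is only one plausible reading of hash-bucketed deduplication with a Bloom filter in front of an exact store. The class names, bucket count, and filter sizing are assumptions. Keys are routed to a bucket by a stable hash, and each bucket's Bloom filter answers "definitely not seen" cheaply, so the more expensive exact lookup runs only when the filter reports a possible hit.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits          # must be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, key):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


class BucketedDeduper:
    """Routes each key to one bucket; each bucket pairs a Bloom filter
    with an exact set that stands in for a slower external store."""

    def __init__(self, num_buckets=8):
        self.buckets = [(BloomFilter(), set()) for _ in range(num_buckets)]
        self.exact_lookups = 0

    def _bucket(self, key):
        # Use a stable hash: Python's built-in hash() is salted per process.
        h = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")
        return self.buckets[h % len(self.buckets)]

    def is_new(self, key):
        bloom, seen = self._bucket(key)
        if bloom.might_contain(key):
            # Possible duplicate (or a false positive): confirm exactly.
            self.exact_lookups += 1
            if key in seen:
                return False
        bloom.add(key)
        seen.add(key)
        return True


deduper = BucketedDeduper()
visitors = ["u1", "u2", "u1", "u3", "u2"]
unique_visitors = sum(deduper.is_new(v) for v in visitors)
```

Because a Bloom filter never yields false negatives, skipping the exact lookup on a miss is safe. False positives cost only an extra lookup and do not affect the count.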
OneService – Unified Data Service Platform
OneService offers three layers: simple data query (HBase/MySQL/Phoenix/OpenSearch), complex query (OneID, GProfile), and real‑time push (JSONP/WebSocket), with high‑availability features like multi‑datacenter deployment and rapid failover.
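As a rough illustration of the failover behavior mentioned above, the sketch below tries a list of datacenter replicas in order and returns the first successful answer. It is a generic pattern, not OneService's API: `query_with_failover`, `AllReplicasFailed`, and the replica callables are hypothetical names.

```python
class AllReplicasFailed(Exception):
    """Raised when no datacenter could answer the query."""


def query_with_failover(replicas, key):
    """Try each (name, fetch) pair in order and return the first success.

    `replicas` is a list of (datacenter_name, callable) pairs; each callable
    takes a key and either returns a result or raises a connection error.
    """
    errors = []
    for name, fetch in replicas:
        try:
            return name, fetch(key)
        except (ConnectionError, TimeoutError) as exc:
            # Record the failure and move on to the next datacenter.
            errors.append((name, exc))
    raise AllReplicasFailed(errors)


def dc1_down(key):
    raise ConnectionError("dc1 unreachable")


def dc2_up(key):
    return {"key": key, "gmv": 42}


served_by, payload = query_with_failover(
    [("dc1", dc1_down), ("dc2", dc2_up)], "shop:1"
)
```

A real deployment would typically add timeouts, health checks, and load-aware routing rather than a fixed order. Those are omitted to keep the sketch short.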
TimeTunnel Pub/Sub Service
TT stores streams in HBase, which lets it sustain high read‑to‑write ratios, scale elastically, and seek by time. It persists data so that streams can be replayed and decouples front‑end and back‑end systems, handling a read‑write disparity of more than 100× during peak traffic.
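Time-based seeking and replay can be pictured with a small in-memory model. This is a sketch under simplifying assumptions, not TimeTunnel's storage layout or client API: the `TopicLog` class and its methods are hypothetical, and messages are assumed to arrive in non-decreasing timestamp order so a binary search can map a timestamp to an offset.

```python
import bisect


class TopicLog:
    """Append-only log for one topic, indexed by publish timestamp."""

    def __init__(self):
        self._timestamps = []  # non-decreasing, parallel to _messages
        self._messages = []

    def publish(self, ts_ms, payload):
        if self._timestamps and ts_ms < self._timestamps[-1]:
            raise ValueError("timestamps must be non-decreasing")
        self._timestamps.append(ts_ms)
        self._messages.append(payload)
        return len(self._messages) - 1  # offset of the new message

    def seek(self, ts_ms):
        """Return the offset of the first message published at or after ts_ms."""
        return bisect.bisect_left(self._timestamps, ts_ms)

    def read(self, offset, max_count=100):
        """Return (batch, next_offset); each consumer tracks its own offset."""
        batch = self._messages[offset:offset + max_count]
        return batch, offset + len(batch)


log = TopicLog()
for ts, msg in [(1000, "a"), (2000, "b"), (2000, "c"), (3000, "d")]:
    log.publish(ts, msg)

# A consumer that needs to replay everything from t=2000 onward:
start = log.seek(2000)
replayed, next_offset = log.read(start, max_count=10)
```

Because readers hold their own offsets and the log is never mutated by a read, many consumers can read the same stream independently. This is one way to picture how a pub/sub layer absorbs a large read-to-write disparity.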
Data Collection
Real‑time ingestion combines DRC change capture, log agents on hundreds of thousands of servers, OTS integration, and SDK writes. Buffering and compression mitigate long‑distance bandwidth limits, while configurable parameters balance latency, performance, and power consumption.
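The buffering and compression trade-off can be sketched as a small batching writer. The article does not give the actual collection agent's parameters, so `BatchingSender`, `max_records`, and `max_delay_s` are hypothetical. Larger batches compress better and use fewer round trips over long-distance links, while a shorter maximum delay bounds the latency added to each record.

```python
import json
import time
import zlib


class BatchingSender:
    """Buffers records and ships them as one compressed blob per batch."""

    def __init__(self, send, max_records=500, max_delay_s=1.0, clock=time.monotonic):
        self._send = send                # callable that accepts bytes
        self.max_records = max_records   # flush when this many records are buffered
        self.max_delay_s = max_delay_s   # flush when the oldest record is this old
        self._clock = clock
        self._buffer = []
        self._first_at = None

    def write(self, record):
        if not self._buffer:
            self._first_at = self._clock()
        self._buffer.append(record)
        too_many = len(self._buffer) >= self.max_records
        too_old = self._clock() - self._first_at >= self.max_delay_s
        if too_many or too_old:
            self.flush()

    def flush(self):
        if not self._buffer:
            return
        # Newline-delimited JSON, then zlib compression (level 6).
        raw = "\n".join(
            json.dumps(r, separators=(",", ":")) for r in self._buffer
        ).encode("utf-8")
        self._send(zlib.compress(raw, 6))
        self._buffer = []
        self._first_at = None


def decode_batch(blob):
    """Inverse of the encoding used in BatchingSender.flush."""
    text = zlib.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]


sent = []
sender = BatchingSender(sent.append, max_records=3, max_delay_s=60.0)
records = [{"event": "click", "item": i} for i in range(4)]
for r in records:
    sender.write(r)
sender.flush()  # ship the partial final batch
```

Note that this sketch checks the age limit only when a new record arrives. A real agent would also flush on a timer so that a quiet stream does not hold records indefinitely.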
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
