Inside Alibaba’s Stream Computing: 472 M Events/sec & 256 K Payments/sec on Double 11
Alibaba’s Double 11 showcase reveals how its upgraded stream computing platform handled a payment peak that more than doubled year over year, reaching 256,000 successful payments per second and processing 472 million events per second in real time on a highly optimized Flink‑based architecture.
Alibaba Double 11 Stream Computing Highlights
During the 2017 Double 11, Alibaba reached a payment‑success peak of 256,000 transactions per second and a real‑time data‑processing peak of 472 million events per second.
Why Stream Computing Matters
Data loses business value quickly, so processing must be near‑real‑time. The massive traffic surge required a robust stream‑processing solution.
Year‑over‑Year Comparison
2016: payment peak 120,000/s; total data processing 93 million events/s.
2017: payment peak 256,000/s; real‑time data processing 472 million events/s; public data layer 180 million events/s.
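The growth implied by these figures can be checked with a few lines of arithmetic. This is only a sketch over the numbers quoted above; the variable and function names are illustrative, and the 2016 "total" and 2017 "real‑time" processing figures may not measure exactly the same thing.

```python
# Per-second peaks as quoted in the article.
payments_2016 = 120_000
payments_2017 = 256_000
events_2016 = 93_000_000    # "total data processing" in 2016
events_2017 = 472_000_000   # "real-time data processing" in 2017


def growth(old: int, new: int) -> tuple[float, float]:
    """Return (multiple, percent increase) going from old to new."""
    multiple = new / old
    return multiple, (multiple - 1) * 100


pay_mult, pay_pct = growth(payments_2016, payments_2017)
evt_mult, evt_pct = growth(events_2016, events_2017)

print(f"payments: {pay_mult:.2f}x (+{pay_pct:.0f}%)")
print(f"events:   {evt_mult:.2f}x (+{evt_pct:.0f}%)")
```

On these numbers the payment peak grew a little more than 2x, while the data‑processing peak grew roughly 5x.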
Real‑Time Data Flow Architecture
Incremental data is collected via DRC and LogAgent and sent to DataHub (a Pub/Sub service). Flink jobs subscribe to these streams, perform ETL, and write results back to DataHub, HBase, MySQL, and other stores, which are exposed through the One Service data layer.
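The shape of this flow (publish, subscribe, transform, write back) can be illustrated with a toy in‑memory model. Nothing here uses the real DataHub or Flink APIs: the `Hub` class, the topic names, and the record fields are all assumptions made up for the sketch.

```python
from collections import defaultdict
from typing import Callable


class Hub:
    """Toy in-memory stand-in for a Pub/Sub service such as DataHub."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, record: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(record)


hub = Hub()
serving_store: dict[str, float] = {}  # stands in for HBase/MySQL behind the data layer


def etl_job(record: dict) -> None:
    """Clean one raw record and republish it (the ETL step)."""
    if record.get("status") != "paid":
        return  # drop records that downstream jobs do not need
    clean = {"seller": record["seller"], "amount": float(record["amount"])}
    hub.publish("orders_clean", clean)


def aggregate_job(record: dict) -> None:
    """Maintain a running total per seller in the serving store."""
    seller = record["seller"]
    serving_store[seller] = serving_store.get(seller, 0.0) + record["amount"]


hub.subscribe("orders_raw", etl_job)
hub.subscribe("orders_clean", aggregate_job)

for raw in [
    {"seller": "s1", "amount": "10.5", "status": "paid"},
    {"seller": "s1", "amount": "4.5", "status": "paid"},
    {"seller": "s2", "amount": "99", "status": "cancelled"},
]:
    hub.publish("orders_raw", raw)

print(serving_store)
```

The point of the sketch is the layering: one job writes its cleaned output back to the hub so that other jobs can subscribe to it, and only the final aggregate lands in the store that the serving layer reads.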
Engine Upgrades and Optimizations (2017)
Engine migration: from Storm to Blink (Alibaba's fork of Apache Flink), doubling peak capacity and increasing stability five‑fold.
State management: moved from HBase to RocksDB, reducing network overhead and supporting billions of keys.
Checkpoint and compaction: incremental checkpoints and RocksDB tuning lower I/O pressure.
Asynchronous sink: improves CPU utilization and TPS.
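Of the upgrades listed above, incremental checkpointing is the easiest to show in miniature: persist only what changed since the last checkpoint instead of the full state. The sketch below tracks changes per key purely for illustration; real engines typically track changed on‑disk files instead, and the class and method names here are hypothetical.

```python
class IncrementalCheckpointer:
    """Tracks keys changed since the last checkpoint and persists only those."""

    def __init__(self) -> None:
        self.state: dict[str, int] = {}
        self._dirty: set[str] = set()
        self.snapshots: list[dict[str, int]] = []  # stands in for durable storage

    def update(self, key: str, delta: int) -> None:
        self.state[key] = self.state.get(key, 0) + delta
        self._dirty.add(key)

    def checkpoint(self) -> int:
        """Persist only the changed entries; return how many were written."""
        changed = {k: self.state[k] for k in self._dirty}
        self.snapshots.append(changed)
        self._dirty.clear()
        return len(changed)

    def restore(self) -> dict[str, int]:
        """Rebuild the full state by replaying the saved deltas in order."""
        rebuilt: dict[str, int] = {}
        for changed in self.snapshots:
            rebuilt.update(changed)
        return rebuilt


cp = IncrementalCheckpointer()
for key in ["a", "b", "c"]:
    cp.update(key, 1)
first_written = cp.checkpoint()   # all three keys are new

cp.update("a", 5)
second_written = cp.checkpoint()  # only "a" changed since the last checkpoint

print(first_written, second_written, cp.restore())
```

The second checkpoint writes one entry instead of three, which is the source of the reduced I/O pressure; the trade‑off is that restoring requires replaying (or periodically merging) the deltas.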
Common Components and the “ChiTu” Platform
Alibaba built a generic aggregation component on Blink, allowing jobs to be defined via JSON, reducing development effort from 10 person‑days to 0.5 person‑days.
The “ChiTu” platform generates real‑time tasks without writing code, offering built‑in statistics components, data management, and reporting integration.
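To make the idea of a JSON‑defined aggregation job concrete, here is a minimal sketch of a declarative spec being turned into a running aggregator. The article does not show the real format, so the schema (`group_by`, `metrics`, `op`, `field`) and the `build_aggregator` helper are invented for illustration.

```python
import json
from collections import defaultdict

# Hypothetical job definition; the field names are assumptions, not Alibaba's format.
JOB_JSON = """
{
  "group_by": ["category"],
  "metrics": [
    {"name": "order_count", "op": "count"},
    {"name": "gmv", "op": "sum", "field": "amount"}
  ]
}
"""


def build_aggregator(job: dict):
    """Turn a declarative job spec into a function that folds events into state."""
    state: dict[tuple, dict[str, float]] = defaultdict(lambda: defaultdict(float))

    def on_event(event: dict) -> None:
        key = tuple(event[field] for field in job["group_by"])
        for metric in job["metrics"]:
            if metric["op"] == "count":
                state[key][metric["name"]] += 1
            elif metric["op"] == "sum":
                state[key][metric["name"]] += event[metric["field"]]
            else:
                raise ValueError(f"unsupported op: {metric['op']}")

    return on_event, state


on_event, state = build_aggregator(json.loads(JOB_JSON))
for event in [
    {"category": "phones", "amount": 100.0},
    {"category": "phones", "amount": 50.0},
    {"category": "books", "amount": 20.0},
]:
    on_event(event)

result = {key: dict(metrics) for key, metrics in state.items()}
print(result)
```

A new statistic then means editing the JSON rather than writing a new job, which is the kind of saving the quoted drop in development effort suggests.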
Key Optimizations
Dimension merging to cut network traffic by >50%.
RocksDB key‑value encoding halves storage size.
High‑performance sorting using in‑memory priority queues and MapState, boosting speed ~10×.
Mini‑batch sink for bulk writes to HBase/DataHub, improving throughput.
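As one example from the list above, ranking with an in‑memory priority queue can be sketched as a bounded min‑heap that keeps only the current top N. This is a simplification under a stated assumption: each item is offered once with its final score. Handling scores that keep changing for the same key (where keyed state such as MapState would come in) is not modeled, and the `TopN` class is hypothetical.

```python
import heapq


class TopN:
    """Keep the n largest (score, item) pairs seen so far using a min-heap."""

    def __init__(self, n: int) -> None:
        self.n = n
        self._heap: list[tuple[float, str]] = []

    def offer(self, item: str, score: float) -> None:
        if len(self._heap) < self.n:
            heapq.heappush(self._heap, (score, item))
        elif score > self._heap[0][0]:
            # The new score beats the smallest retained one: swap it in.
            heapq.heapreplace(self._heap, (score, item))

    def ranked(self) -> list[tuple[str, float]]:
        """Return retained items from highest to lowest score."""
        return [(item, score) for score, item in sorted(self._heap, reverse=True)]


top = TopN(2)
for item, score in [("a", 5), ("b", 9), ("c", 1), ("d", 7)]:
    top.offer(item, score)

print(top.ranked())
```

Each offer costs O(log N) regardless of how many items have been seen, so the ranking never requires a full sort of the stream.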
Data Assurance
High‑priority services run 24/7 with cross‑region disaster recovery; a dynamic configuration in One Service enables failover within seconds.
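One way to picture configuration‑driven failover is a serving layer that consults a mutable setting on every read, so that switching regions is a single config change. This is only a sketch of the pattern; the `ServingLayer` class, the region names, and the in‑process dict standing in for a dynamic config service are all assumptions.

```python
class ServingLayer:
    """Routes reads to whichever replica the (mutable) config marks active."""

    def __init__(self, replicas: dict[str, dict[str, int]], active: str) -> None:
        self.replicas = replicas
        self.config = {"active": active}  # stands in for a dynamic config service

    def switch(self, name: str) -> None:
        """Point all subsequent reads at another replica."""
        if name not in self.replicas:
            raise KeyError(name)
        self.config["active"] = name

    def query(self, metric: str) -> int:
        # The active replica is looked up on every call, so a switch takes
        # effect on the next read without restarting anything.
        return self.replicas[self.config["active"]][metric]


# Two pipelines in different regions computing the same metric independently.
layer = ServingLayer(
    {"region_a": {"gmv": 100}, "region_b": {"gmv": 100}},
    active="region_a",
)
before = layer.query("gmv")
layer.switch("region_b")  # simulate failing over after region_a degrades
after = layer.query("gmv")

print(before, after, layer.config["active"])
```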
Future Directions
Platform‑as‑a‑service for stream processing.
Unified semantic layer with Apache Beam, Flink Table API, and Stream SQL.
Integration of real‑time intelligence and deep learning.
Convergence of real‑time and batch processing.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
