How Dianping Scales Real‑Time Analytics with Apache Storm
This article explains how Dianping built a millisecond‑level real‑time computation platform using Apache Storm, covering use cases, system architecture, core Storm concepts, performance tuning, best practices, and a detailed Q&A on their production deployment.
Real‑Time Computing Use Cases at Dianping
Dianping applies real‑time analytics to dashboards, daily active users, new activations, transaction volume, search, recommendation, and security, among other scenarios.
Dashboard: Beidou reporting, WeChat public account, CloudMap traffic analysis
Real‑time DAU: Main app (Android/iPhone/iPad), group app, quick lookup, PC, mobile site
New activations: Main app
Real‑time transaction amount: Flash discount and group‑buy transactions
In search, user actions such as clicks feed back into ranking immediately, which improves conversion rates; a similar real-time feedback loop raises group-buy revenue by more than 2%.
Industry Use Cases
Alibaba JStorm: Double‑11 real‑time transaction data
360 Storm: Ticket‑booking captcha recognition, cloud‑disk thumbnail generation, intrusion detection, hot‑word recommendation
Tencent TDProcess: Distributed KV store TDEngine with stream processing, providing Sum, Count, PV/UV, TopK statistics
JD.com Samza: Real‑time order status aggregation across multiple stages
How Dianping Built Its Real‑Time Platform
The platform is an end-to-end solution covering data sources, transport channels, computation, storage, and external services. Data is captured at millisecond latency through custom spouts for Blackhole, Puma, and Swallow, which ingest logs, database changes, and MQ messages respectively.
Blackhole is a Kafka‑like system for high‑throughput log ingestion; Puma reads MySQL binlogs; Swallow is Dianping’s MQ. Developers write business logic without handling data acquisition.
Computation runs on Apache Storm. Results are exposed through a data‑service built on Dianping’s RPC framework, abstracting underlying Redis/HBase storage and allowing seamless integration with online services.
Storm Basics
Apache Storm is an open-source distributed real-time computation system, originally developed at BackType and open-sourced by Twitter. Unlike Hadoop's batch-oriented MapReduce jobs, which run to completion, a Storm topology runs continuously until it is killed.
A topology consists of spouts (data sources) and bolts (processing units). Nimbus and Supervisors coordinate via ZooKeeper; both are stateless and fail‑fast.
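Spouts emit tuples from nextTuple; bolts process them in execute. When acking is enabled, acker tasks track each tuple tree and the spout replays any tuple that fails or times out, which gives at-least-once processing; exactly-once semantics require the transactional layer on top. Storm also provides fault tolerance by restarting failed workers automatically.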
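The tracking done by the acker can be illustrated with a small stand-alone model. Storm's documentation describes the acker as keeping, per spout tuple, the XOR of every tuple ID created or acked in that tuple's tree; the tree is complete when the value returns to zero. The sketch below models only that idea using the Java standard library. The class and method names (AckTrackerSketch, init, ack, isPending) are hypothetical and are not Storm's API, and timeouts and replay are left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of XOR-based tuple-tree tracking; not Storm's actual acker code. */
public class AckTrackerSketch {
    // rootId -> XOR of all tuple ids that have been created but not yet acked
    private final Map<Long, Long> pending = new HashMap<>();

    /** Registers a new tuple tree when the spout emits its root tuple. */
    public void init(long rootId, long tupleId) {
        pending.put(rootId, tupleId);
    }

    /**
     * Records that a bolt acked ackedId after emitting childIds anchored to it.
     * Returns true once every tuple in the tree has been acked.
     */
    public boolean ack(long rootId, long ackedId, long... childIds) {
        Long current = pending.get(rootId);
        if (current == null) {
            return false; // unknown tree, or already completed
        }
        long value = current ^ ackedId;   // the acked id cancels itself out
        for (long child : childIds) {
            value ^= child;               // new children stay pending until acked
        }
        if (value == 0L) {
            pending.remove(rootId);       // whole tree processed
            return true;
        }
        pending.put(rootId, value);
        return false;
    }

    /** True while the tree still has unacked tuples. */
    public boolean isPending(long rootId) {
        return pending.containsKey(rootId);
    }

    public static void main(String[] args) {
        AckTrackerSketch acker = new AckTrackerSketch();
        acker.init(1L, 0xA1L);                                  // spout emits root tuple
        System.out.println(acker.ack(1L, 0xA1L, 0xB2L, 0xC3L)); // bolt emits two children
        System.out.println(acker.ack(1L, 0xB2L));               // first child acked
        System.out.println(acker.ack(1L, 0xC3L));               // second child acked, tree done
    }
}
```

In a real topology a tree that never reaches zero within the message timeout is reported to the spout as failed, and the spout may emit it again. That replay is why the guarantee is at-least-once rather than exactly-once.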
Storm Advantages
Ease of Use: Follow the simple Spout-Bolt programming model.
Scalability: Raising parallelism scales throughput close to linearly when the topology is well balanced.
Fault Tolerance: Workers are automatically replaced on failure.
Accuracy: Acker and transactional mechanisms prevent data loss and ensure correctness.
Performance Tuning and Best Practices
Use component parallelism instead of internal thread pools.
Avoid DRPC for large‑scale batch processing; prefer Spark Streaming for heavy workloads.
Keep spout logic lightweight; long‑running operations should be moved downstream.
Balance fieldsGrouping to avoid data skew.
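Prefer localOrShuffleGrouping over shuffleGrouping when no key-based routing is needed; it sends tuples to tasks in the same worker process where possible and avoids network transfer.
Set an appropriate maxSpoutPending (topology.max.spout.pending) to cap the number of in-flight tuples per spout task and control memory usage.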
Choose a sensible number of workers; too many can degrade throughput due to extra serialization and thread‑switch overhead.
Adjust Netty parameters (storm.messaging.netty.*) to balance latency and throughput.
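The advice on fieldsGrouping skew can be made concrete with a small simulation. The sketch below is a stand-alone model, not Storm code: taskFor is a stand-in for hash-based routing (Storm hashes the grouping field values, and its exact function may differ), the workload in sampleKeys is invented for illustration, and key salting is one common mitigation rather than a technique the source attributes to Dianping. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Models how hash-based key routing spreads tuples across bolt tasks. */
public class FieldsGroupingSkewSketch {

    /** Stand-in for a hash-based grouping: the same key always maps to the same task. */
    public static int taskFor(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    /** Counts tuples per task; saltBuckets > 1 appends a rotating suffix to every key. */
    public static int[] distribute(List<String> keys, int numTasks, int saltBuckets) {
        int[] load = new int[numTasks];
        int sequence = 0;
        for (String key : keys) {
            String routed = saltBuckets > 1
                    ? key + "#" + (sequence++ % saltBuckets)
                    : key;
            load[taskFor(routed, numTasks)]++;
        }
        return load;
    }

    /** Hypothetical workload: one hot key carries 90% of the traffic. */
    public static List<String> sampleKeys() {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 900; i++) {
            keys.add("shop-42");
        }
        for (int i = 0; i < 100; i++) {
            keys.add("user-" + i);
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = sampleKeys();
        // Without salting, every "shop-42" tuple lands on a single task.
        System.out.println("plain : " + Arrays.toString(distribute(keys, 4, 1)));
        // With four salt buckets, the hot key is spread over several tasks.
        System.out.println("salted: " + Arrays.toString(distribute(keys, 4, 4)));
    }
}
```

Salting comes at a cost: each task now holds only a partial result for the hot key, so a second, much lighter aggregation step has to merge the partials by the original key.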
Future Directions
Dianping plans to unify real‑time, near‑real‑time, and offline batch processing on a single developer platform, leveraging Storm for sub‑second latency, Spark Streaming for minute‑level processing, and Hadoop/Hive for daily batch jobs.
Q&A Highlights
What differentiates Blackhole and Swallow? Blackhole focuses on high‑throughput log streams; Swallow handles MQ messages and persists them in MongoDB.
How is data extracted without impacting production? Extraction runs in a side‑channel (e.g., log files, MQ) so it does not affect the main business flow.
What storage backs Storm results? Primarily Redis, with HBase and MySQL as secondary stores, and some results pushed to MQ.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
