How Dianping Scales Real‑Time Analytics with Apache Storm

This article explains how Dianping built a millisecond‑level real‑time computation platform using Apache Storm, covering use cases, system architecture, core Storm concepts, performance tuning, best practices, and a detailed Q&A on their production deployment.

21CTO

Real‑Time Computing Use Cases at Dianping

Dianping applies real‑time analytics to dashboards, daily active users, new activations, transaction volume, search, recommendation, and security, among other scenarios.

Dashboard: Beidou reporting, WeChat public account, CloudMap traffic analysis

Real‑time DAU: Main app (Android/iPhone/iPad), group app, quick lookup, PC, mobile site

New activations: Main app

Real‑time transaction amount: Flash discount and group‑buy transactions

In search, user actions such as clicks instantly affect ranking, improving conversion rates; similar feedback loops boost group‑buy revenue by over 2%.

Industry Use Cases

Alibaba JStorm: Double‑11 real‑time transaction data

360 Storm: Ticket‑booking captcha recognition, cloud‑disk thumbnail generation, intrusion detection, hot‑word recommendation

Tencent TDProcess: Distributed KV store TDEngine with stream processing, providing Sum, Count, PV/UV, TopK statistics

JD.com Samza: Real‑time order status aggregation across multiple stages

How Dianping Built Its Real‑Time Platform

The platform is an end‑to‑end solution covering data sources, transport channels, computation, storage, and external services. Data is captured at millisecond latency by custom spouts that read from Blackhole, Puma, and Swallow, which carry logs, database changes, and MQ messages respectively.

Blackhole is a Kafka‑like system for high‑throughput log ingestion; Puma reads MySQL binlogs; Swallow is Dianping’s MQ. Developers write business logic without handling data acquisition.

Computation runs on Apache Storm. Results are exposed through a data service built on Dianping’s RPC framework. The service hides the underlying Redis/HBase storage, so online services can query results without knowing which store holds them.
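The storage abstraction described above can be pictured with a small sketch. Everything here is hypothetical: the names ResultStore, InMemoryStore, and DataService are invented for illustration, the in-memory dictionaries merely stand in for Redis and HBase, and the "try the fast store first" lookup order is an assumption rather than Dianping's documented behavior.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class ResultStore(ABC):
    """Minimal key-value interface the data service depends on (hypothetical)."""

    @abstractmethod
    def put(self, key: str, value: int) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[int]: ...


class InMemoryStore(ResultStore):
    """Stand-in for a real backend such as Redis or HBase."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[str, int] = {}

    def put(self, key: str, value: int) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[int]:
        return self._data.get(key)


class DataService:
    """Answers queries from the first store that has the key.

    Callers never see which backend served the value.
    """

    def __init__(self, stores: List[ResultStore]) -> None:
        self._stores = stores

    def query(self, key: str) -> Optional[int]:
        for store in self._stores:
            value = store.get(key)
            if value is not None:
                return value
        return None
```

A caller would build DataService([fast_store, slow_store]) once and only ever call query, which is what lets the storage layout change without touching online services.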

Storm Basics

Apache Storm is an open‑source distributed real‑time computation system, originally developed at BackType and open‑sourced by Twitter. Unlike Hadoop’s batch‑oriented MapReduce jobs, which run to completion, a Storm topology runs continuously until it is killed.

A topology consists of spouts (data sources) and bolts (processing units). Nimbus and Supervisors coordinate via ZooKeeper; both are stateless and fail‑fast.
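As a rough mental model of the spout and bolt roles, the following single‑process sketch wires one source into two processing steps. It is not Storm's API: the class names are invented, the method names next_tuple and execute only mirror Storm's nextTuple and execute callbacks, and the loop stops when the source drains, whereas a real topology keeps running.

```python
from collections import Counter, deque
from typing import Iterable, List, Optional


class SentenceSpout:
    """Data source: hands out one sentence per next_tuple() call."""

    def __init__(self, sentences: Iterable[str]) -> None:
        self._queue = deque(sentences)

    def next_tuple(self) -> Optional[str]:
        return self._queue.popleft() if self._queue else None


class SplitBolt:
    """Processing unit: turns one sentence into a list of words."""

    def execute(self, sentence: str) -> List[str]:
        return sentence.lower().split()


class CountBolt:
    """Processing unit: keeps a running count per word."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def execute(self, word: str) -> None:
        self.counts[word] += 1


def run_topology(spout: SentenceSpout, splitter: SplitBolt, counter: CountBolt) -> Counter:
    """Drive tuples from the spout through both bolts until the spout is empty."""
    while True:
        sentence = spout.next_tuple()
        if sentence is None:
            break
        for word in splitter.execute(sentence):
            counter.execute(word)
    return counter.counts
```

In a real cluster each of these components would run as many parallel tasks spread across workers, with Storm moving tuples between them.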

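Spouts emit tuples via nextTuple; bolts process them in execute. The acker tracks every tuple tree and triggers a replay from the spout when a tuple fails or times out, which gives at‑least‑once processing; exactly‑once semantics require Storm’s transactional (Trident) layer on top. Storm also provides fault tolerance by restarting failed workers automatically.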
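Storm's acker tracks a whole tuple tree with a single running XOR of tuple IDs: every ID is folded in once when the tuple is created and once when it is acked, so the value returns to zero exactly when the tree is fully processed. The sketch below models that bookkeeping in a simplified form; the Acker class and its method names are illustrative and are not Storm's internal classes.

```python
from typing import Dict, Iterable, List


class Acker:
    """Tracks tuple trees with one XOR value per spout message (simplified)."""

    def __init__(self) -> None:
        self._ack_vals: Dict[str, int] = {}
        self.completed: List[str] = []

    def init(self, root_id: str, tuple_id: int) -> None:
        """The spout emitted the first tuple of a new tree."""
        self._ack_vals[root_id] = tuple_id

    def ack(self, root_id: str, input_id: int, child_ids: Iterable[int] = ()) -> None:
        """A bolt finished input_id and emitted child_ids anchored to it.

        The input ID and all child IDs are folded in as one update, so the
        value cannot reach zero while children are still outstanding.
        """
        if root_id not in self._ack_vals:
            return  # tree already completed (or unknown); nothing to do
        update = input_id
        for child in child_ids:
            update ^= child
        value = self._ack_vals[root_id] ^ update
        if value == 0:
            del self._ack_vals[root_id]
            self.completed.append(root_id)
        else:
            self._ack_vals[root_id] = value

    def is_pending(self, root_id: str) -> bool:
        return root_id in self._ack_vals
```

Because only one integer is stored per spout message, the memory cost stays constant no matter how large the tuple tree grows. Real tuple IDs are random 64‑bit values, which makes an accidental zero extremely unlikely.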

Storm Advantages

Ease of Use: Follow the simple Spout‑Bolt programming model.

Scalability: Increase parallelism to scale throughput roughly linearly.

Fault Tolerance: Workers are automatically replaced on failure.

Accuracy: Acker and transactional mechanisms prevent data loss and ensure correctness.

Performance Tuning and Best Practices

Use component parallelism instead of internal thread pools.

Avoid DRPC for large‑scale batch processing; prefer Spark Streaming for heavy workloads.

Keep spout logic lightweight: nextTuple, ack, and fail all run on the same thread, so long‑running operations should be moved to downstream bolts.

Choose fieldsGrouping keys that distribute evenly; a few hot keys will overload individual tasks (data skew).

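Prefer localOrShuffleGrouping where ordering by key is not needed: it routes tuples to tasks in the same worker process when one is available, avoiding network transfer and serialization.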

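Set an appropriate maxSpoutPending (topology.max.spout.pending) to control memory usage. It caps the number of un‑acked tuples in flight per spout task and takes effect only when acking is enabled.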

Choose a sensible number of workers; too many can degrade throughput due to extra serialization and thread‑switch overhead.

Adjust Netty parameters (storm.messaging.netty.*) to balance latency and throughput.
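To make the data‑skew advice concrete, the sketch below models how a fields grouping assigns tuples to tasks by hashing the grouping key, and shows one common mitigation, salting a hot key into several sub‑keys. This is an illustration under assumptions: CRC32 stands in for whatever hash Storm applies internally, the function names are invented, and salting is a general technique rather than something the source attributes to Dianping. Salted results also need a second aggregation step to merge the partial values per original key.

```python
import zlib
from typing import List, Sequence


def task_for(key: str, num_tasks: int) -> int:
    """Deterministically map a grouping key to a task index."""
    return zlib.crc32(key.encode("utf-8")) % num_tasks


def load_per_task(keys: Sequence[str], num_tasks: int) -> List[int]:
    """Count how many tuples each task would receive."""
    load = [0] * num_tasks
    for key in keys:
        load[task_for(key, num_tasks)] += 1
    return load


def salt_keys(keys: Sequence[str], buckets: int) -> List[str]:
    """Append a rotating suffix so one hot key becomes `buckets` sub-keys."""
    return [f"{key}#{i % buckets}" for i, key in enumerate(keys)]


if __name__ == "__main__":
    # One hot shop receives 90% of the traffic.
    keys = ["shop-1"] * 900 + [f"shop-{i}" for i in range(2, 102)]
    print("plain :", load_per_task(keys, 4))
    print("salted:", load_per_task(salt_keys(keys, 8), 4))
```

Without salting, every tuple for the hot key lands on the same task regardless of how much parallelism is configured, which is why adding tasks alone does not fix skew.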

Future Directions

Dianping plans to unify real‑time, near‑real‑time, and offline batch processing on a single developer platform, leveraging Storm for sub‑second latency, Spark Streaming for minute‑level processing, and Hadoop/Hive for daily batch jobs.

Q&A Highlights

What differentiates Blackhole and Swallow? Blackhole focuses on high‑throughput log streams; Swallow handles MQ messages and persists them in MongoDB.

How is data extracted without impacting production? Extraction runs in a side‑channel (e.g., log files, MQ) so it does not affect the main business flow.

What storage backs Storm results? Primarily Redis, with HBase and MySQL as secondary stores, and some results pushed to MQ.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
