Understanding Storm: A Distributed Real-Time Computation System
The article explains the need for low‑latency, high‑performance, distributed real‑time processing, outlines the challenges such systems must address, and introduces Storm as a Hadoop‑like framework for stream processing, detailing its architecture, fault‑tolerance mechanisms, transactional topology, and large‑scale deployment at Taobao.
With the rapid growth of information, users now expect instant access to up‑to‑date data; batch‑oriented systems like Hadoop cannot meet this demand, leading to the need for a real‑time computation platform.
The goal is to build a real‑time computing system that satisfies five key requirements: low latency, high performance, distributed execution, scalability, and fault tolerance.
A simple approach is to combine a message queue with worker processes distributed across machines, while also ensuring easy application development, no message loss, and strict message ordering.
Storm is introduced as a solution: a distributed real‑time computation system analogous to Hadoop for batch processing, providing simple primitives for stream processing.
Typical Storm use cases include continuous stream data processing and distributed RPC services, where low latency is critical.
Basic concepts
Hadoop
Storm
System role
JobTracker
Nimbus
TaskTracker
Supervisor
Child
Worker
Application name
Job
Topology
Component interface
Mapper/Reducer
Spout/Bolt
Key Storm components:
Nimbus : responsible for resource allocation and task scheduling.
Supervisor : launches and stops Worker processes assigned by Nimbus.
Worker : runs the actual processing logic of Spouts and Bolts.
Task : a thread executing a Spout or Bolt; after Storm 0.8, multiple tasks may share a physical thread called an executor.
Topology : a real‑time application composed of interconnected Spouts and Bolts.
Spout : the source of data streams, actively emitting tuples via the nextTuple() method.
Bolt : processes incoming tuples, performing filtering, aggregation, database writes, etc., via the execute(Tuple input) method.
Tuple : the basic unit of message passing, essentially a list of values.
Stream : a continuous flow of tuples.
Storm supports various stream groupings (shuffle, fields, hash, all, global, none, direct, localOrShuffle) to control how tuples are partitioned among bolts.
Record‑level fault tolerance
Storm assigns a 64‑bit message ID (root ID) to each source tuple and tracks all downstream tuple IDs using an acker component. By XOR‑ing the IDs, the acker can determine when a message has been fully processed; if the XOR result returns to zero, the message is considered complete.
Although a rare collision of tuple IDs could cause a false‑positive completion, the probability is negligible in practice.
Transactional topology (Trident)
Storm 0.7 introduced transactional topologies to guarantee exactly‑once processing for strict use cases; this feature has been refined in Storm 0.8 as Trident, offering a more convenient API.
Storm at Taobao
Taobao uses Storm extensively for real‑time log processing, risk control, and recommendation, handling millions to billions of messages daily (TB‑scale data). The typical pipeline reads logs from Kafka‑like MetaQ or HBase‑based timetunnel, processes them with Storm, and writes results to distributed storage for downstream services.
Future outlook
Storm continues to evolve with features such as stateful processing, Trident, custom schedulers, and integration with projects like Mesos, Kryo, Disruptor, and Curator, aiming toward a robust 1.0 release.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Art of Distributed System Architecture Design
Introductions to large-scale distributed system architectures; insights and knowledge sharing on large-scale internet system architecture; front-end web architecture overviews; practical tips and experiences with PHP, JavaScript, Erlang, C/C++ and other languages in large-scale internet system development.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
