
Performance Comparison of Apache Flink and Apache Storm for Real-Time Stream Processing

This study benchmarks Apache Flink against Apache Storm on a shared cluster. Across simple, sleep-induced, and windowed workloads, Flink delivered three to five times the throughput and roughly half the latency of Storm, with only a modest throughput loss for exactly-once semantics. The study therefore recommends Flink for high-performance, stateful real-time stream processing.

Meituan Technology Team

Apache Flink and Apache Storm are two widely used distributed real-time computation frameworks. Storm is already deployed at scale in Meituan-Dianping's real-time computing services, while Flink has recently attracted attention for its high throughput, low latency, strong reliability, and exactly-once processing guarantees.

The goal of this study is to become familiar with the Flink framework, verify its stability and reliability, evaluate its real‑time processing performance, identify shortcomings, locate performance bottlenecks, and provide data‑driven recommendations for framework selection, resource planning, and performance tuning.

Test Objectives

Evaluate the performance of Flink and Storm under various scenarios and data pressures, obtain detailed performance data, explore the impact of different configurations on Flink, and derive tuning suggestions.

Test Scenarios

Simple "input‑output" processing to isolate framework overhead.

Long‑running user jobs (simulated by a 1 ms sleep) to assess the effect of complex logic.

Windowed statistics (e.g., count per time window) to compare window support.

Exactly‑once vs. at‑least‑once delivery semantics.
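The windowed-statistics scenario can be illustrated framework-agnostically. The sketch below is a minimal Python illustration of a tumbling-window count; the actual benchmark used Flink's and Storm's native window operators, and the window size and keys here are hypothetical.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_ms):
    """Count records per key within fixed (tumbling) time windows.

    events: iterable of (timestamp_ms, key) pairs.
    Returns {window_start_ms: {key: count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        window_start = ts - (ts % window_ms)  # align timestamp to window boundary
        counts[window_start][key] += 1
    return {w: dict(kc) for w, kc in counts.items()}

events = [(0, "a"), (400, "b"), (900, "a"), (1100, "a")]
print(tumbling_window_counts(events, 1000))
# {0: {'a': 2, 'b': 1}, 1000: {'a': 1}}
```

With a 1-second window, the first three events land in window 0 and the last in window 1000, mirroring the per-window counting the benchmark measures.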

Performance Metrics

Throughput (records/second).

Latency (milliseconds), measured either as outTime − eventTime (end-to-end latency, including time queued in Kafka before consumption) or as outTime − inTime (processing latency inside the framework).

Test Environment

A standalone cluster with one master and two workers was built for both Storm and Flink. Some tests also ran on YARN.

Test Methodology

Data were generated at a controlled rate and written to a Kafka topic. Storm and Flink tasks consumed the same offsets, processed the data, and wrote results with timestamps to separate Kafka topics. A Metrics Collector read the output topics, computed per‑five‑minute averages for throughput and latency percentiles, and stored the results in MySQL for analysis.
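The collector's aggregation step can be sketched as follows. This is an assumed reconstruction in Python — the study's actual collector and its MySQL schema are not shown in the source — using a nearest-rank 99th percentile.

```python
import statistics

WINDOW_MS = 5 * 60 * 1000  # the collector aggregates per five-minute window

def aggregate(records):
    """Aggregate (out_time_ms, latency_ms) pairs read from an output topic.

    Returns {window_start_ms: (throughput_rec_per_s, median_ms, p99_ms)};
    the study stored rows like these in MySQL for later analysis.
    """
    buckets = {}
    for out_ms, latency_ms in records:
        window = out_ms - (out_ms % WINDOW_MS)
        buckets.setdefault(window, []).append(latency_ms)
    result = {}
    for window, latencies in buckets.items():
        latencies.sort()
        throughput = len(latencies) / (WINDOW_MS / 1000)
        p99_index = min(len(latencies) - 1, int(len(latencies) * 0.99))
        result[window] = (throughput,
                          statistics.median(latencies),
                          latencies[p99_index])
    return result
```

For example, three records output within the first five minutes with latencies 10, 30, and 20 ms yield one window with throughput 0.01 rec/s, median 20 ms, and p99 30 ms.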

Key Results

Identity (single‑thread) Throughput

Storm: ~87 k records/s.

Flink: ~350 k records/s (3‑5× Storm).

Identity (single‑thread) Latency

At high load, Storm median latency ≈ 100 ms, 99th percentile ≈ 700 ms.

Flink median latency ≈ 50 ms, 99th percentile ≈ 300 ms.

Sleep Scenario Throughput

Both frameworks achieve ~900 records/s per thread; throughput scales linearly with concurrency.
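This number is consistent with simple arithmetic: a 1 ms sleep per record caps a single thread at about 1,000 records/s, so ~900 records/s implies roughly 10% overhead beyond the sleep itself. A quick sanity check:

```python
SLEEP_MS = 1.0                      # simulated user logic per record
ceiling = 1000 / SLEEP_MS           # theoretical max records/s per thread
observed = 900                      # per-thread throughput reported in the study
overhead = 1 - observed / ceiling   # fraction lost to framework + scheduling
print(f"ceiling ~= {ceiling:.0f} rec/s, overhead ~= {overhead:.0%}")
# ceiling ~= 1000 rec/s, overhead ~= 10%
```

When user logic dominates per-record cost like this, both frameworks hit the same ceiling, which is why the throughput gap between them narrows.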

Sleep Scenario Latency (median)

Flink latency remains lower than Storm.

Windowed Word Count Throughput

Storm: ~12 k records/s.

Flink (standalone): ~43 k records/s (≈ 3× Storm).

Exactly‑once vs. At‑least‑once (Flink) Throughput

Exactly‑once throughput drops ~6.3 % compared with at‑least‑once under the same concurrency.

Storm At‑most‑once vs. At‑least‑once Throughput

At‑most‑once improves throughput by ~16.8 % over at‑least‑once.

StateBackend Impact (Flink)

Memory and FileSystem backends deliver similar throughput; RocksDB achieves roughly one‑tenth of that.

Latency is comparable for Memory and FileSystem; RocksDB shows slightly higher latency, especially on YARN.
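For reference, the state backend under comparison is selected in Flink's flink-conf.yaml. A minimal sketch follows; the checkpoint path is hypothetical, and the legacy backend names (jobmanager, filesystem, rocksdb) match the Flink versions of this study's era:

```yaml
# flink-conf.yaml — choose the state backend being benchmarked
state.backend: rocksdb            # alternatives: jobmanager (memory), filesystem
state.checkpoints.dir: hdfs:///flink/checkpoints   # hypothetical path
```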

Conclusions and Recommendations

Flink outperforms Storm in both throughput (3‑5×) and latency (about half at full load).

When user logic is complex (e.g., 1 ms sleep), the throughput advantage diminishes, though Flink still maintains lower latency.

Exactly‑once semantics incur modest throughput loss for Flink but provide strong delivery guarantees.

Storm’s at‑most‑once improves throughput but still lags behind Flink.

For scenarios requiring exactly‑once delivery, high throughput, low latency, or extensive stateful/windowed processing, Flink is the preferred choice.

Future Work

Investigate the scalability of exactly‑once under higher concurrency.

Determine the range of user‑logic latency where Flink’s advantage remains significant.

Extend evaluation to reliability, scalability, and advanced APIs (Table API, SQL, CEP).



Tags: Apache Flink, Apache Storm, Real-time Stream Processing, Throughput, Latency, Exactly-Once, Performance Evaluation
Written by

Meituan Technology Team

Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
