Design and Implementation of a Real-Time Data Processing System at Meituan
Meituan built a Storm‑based real‑time data processing platform that provides at‑least‑once delivery and high availability. It uses a custom spout, regression‑driven traffic smoothing, and a low‑latency KV store with atomic operations, and it persists results in Kafka, MySQL and Cellar to power merchant dashboards and heat‑tag analytics. Broader real‑time analytics coverage is planned.
Meituan has accumulated massive volumes of online transaction and user behavior data. Putting that data to work for merchants requires stronger real‑time processing than the traditional offline "T+1" pipeline can provide.
The article introduces the design of a real‑time data processing system, comparing three popular streaming engines—Storm, Spark Streaming, and Flink—and explains why Storm was chosen for its at‑least‑once semantics and high availability.
Key challenges include unstable data volume, upstream data quality uncertainty, efficient data placement for computation, correctness in multi‑threaded processing, and delivery of computed results to applications.
Implementation details:
1. Data ingestion integrity: Storm’s at‑least‑once guarantee ensures that no data is lost. A custom Spout is used, and because at‑least‑once delivery can replay messages, duplicates are filtered explicitly in application code.
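The duplicate filtering in step 1 can be illustrated with a minimal sketch. It assumes each message carries a unique ID and that remembering a bounded window of recent IDs is enough. The class name `RecentIdDeduplicator` and its methods are hypothetical, and the article does not say how Meituan's filter is actually implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not from the article): remembers the most recently
 * seen message IDs so that messages replayed under at-least-once delivery
 * are processed only once.
 */
final class RecentIdDeduplicator {
    private final Map<String, Boolean> seen;

    RecentIdDeduplicator(final int capacity) {
        // Access-ordered LinkedHashMap that evicts the least recently used ID
        // once the configured capacity is exceeded.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true the first time an ID is seen, false for a replay. */
    synchronized boolean firstSeen(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

A processing step would call `firstSeen(id)` and skip the message when it returns false. This in-memory version only covers a single worker; deduplicating across workers would need a shared store.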
2. Real‑time data smoothing: A multivariate linear regression model predicts the next minute’s data volume, reducing cluster pressure by about 33%.
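As a rough illustration of the regression in step 2, the sketch below fits an ordinary least-squares model with an intercept and uses it to predict the next value. The choice of features (for example, volumes observed in earlier minutes) is an assumption, as is the class name `VolumeRegression`. The article does not describe the model's inputs or how the prediction drives the smoothing.

```java
/**
 * Hypothetical sketch (not from the article): multivariate linear regression
 * fitted by solving the normal equations (X^T X) w = X^T y.
 */
final class VolumeRegression {

    /** Returns weights w, where w[0] is the intercept. */
    static double[] fit(double[][] x, double[] y) {
        int n = x.length;
        int d = x[0].length + 1;               // +1 for the intercept column
        double[][] a = new double[d][d + 1];   // augmented matrix [X^T X | X^T y]
        for (int r = 0; r < n; r++) {
            double[] row = new double[d];
            row[0] = 1.0;
            System.arraycopy(x[r], 0, row, 1, d - 1);
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    a[i][j] += row[i] * row[j];
                }
                a[i][d] += row[i] * y[r];
            }
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int col = 0; col < d; col++) {
            int pivot = col;
            for (int r = col + 1; r < d; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < 1e-12) {
                throw new IllegalArgumentException("features are linearly dependent");
            }
            for (int r = 0; r < d; r++) {
                if (r == col) {
                    continue;
                }
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= d; c++) {
                    a[r][c] -= factor * a[col][c];
                }
            }
        }
        double[] w = new double[d];
        for (int i = 0; i < d; i++) {
            w[i] = a[i][d] / a[i][i];
        }
        return w;
    }

    /** Predicts the target for one feature vector. */
    static double predict(double[] w, double[] features) {
        double sum = w[0];
        for (int i = 0; i < features.length; i++) {
            sum += w[i + 1] * features[i];
        }
        return sum;
    }
}
```

In use, each training row would hold the chosen features for one past minute and `y` the volume that followed; `predict` then estimates the coming minute so that capacity or batching can be adjusted ahead of the load.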
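3. Computation strategy: The system uses a distributed KV store (Cellar) that provides near‑zero‑latency I/O and supports atomic operations. Distributed locks (setNx) and a version mechanism keep results correct under concurrent updates and across system restarts. The pattern is sketched below: a marker is registered before the work starts and removed once it completes, so a marker that survives a restart identifies work that must be repaired.

    public void doSomeWork(String input) {
        // Register a marker so unfinished work can be found after a restart.
        cellar.mapPut("uniq_ID");
        cellar.add("uniq_ID_1", "some data");
        cellar.add("uniq_ID_2", "some data again");
        ...
        // All writes are done: clear the marker.
        cellar.mapRemove("uniq_ID");
    }

    public void remedySomething() {
        // Markers that are still present belong to work that did not finish.
        map = cellar.mapGetAll();
        version = cellar.mapGet("uniq_ID").getVersion();
        for (String str : map) {
            // A version mismatch is treated as an incomplete write and redone.
            if (cellar.get(str + "_1").getVersion() != version) {
                cellar.add(str + "_1", "some data");
                cellar.mapRemove(str);
            }
            ...
        }
    }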
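To make the lock and version ideas in step 3 concrete, here is a minimal in-memory sketch. It is a stand-in built on `ConcurrentHashMap`, not Cellar's real API: the class `InMemoryKv` and every method on it are hypothetical. The lock mimics setNx ("set if not exists"), and writes succeed only when the stored version matches the one the writer read.

```java
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for a versioned KV store; NOT Cellar's real API. */
final class InMemoryKv {

    /** Immutable value paired with a version that grows on every write. */
    static final class Versioned {
        final long value;
        final int version;

        Versioned(long value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final ConcurrentHashMap<String, Versioned> data = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, String> locks = new ConcurrentHashMap<>();

    /** setNx-style lock: succeeds only if nobody currently holds the key. */
    boolean tryLock(String key, String owner) {
        return locks.putIfAbsent(key, owner) == null;
    }

    /** Releases the lock only if it is still held by this owner. */
    boolean unlock(String key, String owner) {
        return locks.remove(key, owner);
    }

    Versioned get(String key) {
        return data.get(key);
    }

    /** Writes only if the stored version equals expectedVersion (0 means absent). */
    boolean putIfVersion(String key, long value, int expectedVersion) {
        Versioned current = data.get(key);
        if (current == null) {
            return expectedVersion == 0
                    && data.putIfAbsent(key, new Versioned(value, 1)) == null;
        }
        // replace() swaps only if the entry is still the object read above.
        return current.version == expectedVersion
                && data.replace(key, current, new Versioned(value, expectedVersion + 1));
    }

    /** Optimistic increment: retry until the versioned write succeeds. */
    long addAndGet(String key, long delta) {
        while (true) {
            Versioned current = data.get(key);
            long next = (current == null ? 0 : current.value) + delta;
            int version = current == null ? 0 : current.version;
            if (putIfVersion(key, next, version)) {
                return next;
            }
        }
    }
}
```

The lock suits longer critical sections, while the versioned write lets concurrent counters proceed without blocking: a writer that loses the race simply re-reads and retries.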
4. Storage layers: Kafka stores lightly processed detail data, MySQL holds intermediate results for visualization, and Cellar stores final results that applications query directly.
Application cases:
• Real‑time business data cards in Meituan’s “Open Store” product, helping merchants make timely decisions.
• Real‑time heat‑tag labels for Meituan‑Dianping financial partner stores, enhancing marketing visibility.
Conclusion and outlook: The framework has been partially deployed, but more work is needed to achieve comprehensive real‑time analytics. Future efforts will extend it to data dashboards, user behavior analysis, and marketing‑effect tracking, leveraging the 4V+1O characteristics of big data.
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
