
Design and Implementation of a Real-Time Data Processing System at Meituan

Meituan built a Storm-based real-time data processing platform that provides at-least-once delivery and high availability. It uses a custom spout, regression-driven traffic smoothing, and a low-latency KV store with atomic operations, and it persists results in Kafka, MySQL, and Cellar to power merchant dashboards and heat-tag analytics. The team plans to extend the platform to broader real-time analytics.

Meituan Technology Team

Meituan has accumulated massive online transaction and user behavior data. To empower merchants, a stronger real‑time data processing capability is required, moving beyond the traditional offline "T+1" pipeline.

The article introduces the design of a real‑time data processing system, comparing three popular streaming engines—Storm, Spark Streaming, and Flink—and explains why Storm was chosen for its at‑least‑once semantics and high availability.

Key challenges include unstable data volume, upstream data quality uncertainty, efficient data placement for computation, correctness in multi‑threaded processing, and delivery of computed results to applications.

Implementation details:

1. Data ingestion integrity: Storm's at-least-once guarantee ensures that no data is lost. A custom Spout is used, and because at-least-once delivery can replay data, duplicates are filtered explicitly in application code.
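
The source does not show how duplicates are filtered, so the following is only a minimal sketch of one common approach: remember recently seen message IDs in a bounded, least-recently-used set and drop replays. The class name `RecentIdFilter`, the `capacity` parameter, and the assumption that every record carries a unique string ID are all hypothetical and are not part of Storm or of Meituan's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Bounded "seen recently" filter for an at-least-once stream.
 * Hypothetical helper for illustration; not a Storm class.
 */
class RecentIdFilter {
    private final Map<String, Boolean> seen;

    RecentIdFilter(final int capacity) {
        // Access-ordered LinkedHashMap: the least recently seen ID is evicted
        // once more than `capacity` IDs are held.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true the first time an ID is offered, false when it is a replay. */
    synchronized boolean firstTime(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

A processing step would call `firstTime(id)` and skip the record when it returns false. The trade-off is memory versus safety: an ID replayed after it has been evicted is treated as new, so the capacity has to cover the longest replay window that is expected.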

2. Real-time data smoothing: A multivariate linear regression model predicts the next minute's data volume, reducing cluster pressure by about 33%.
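
The source names the model but not its features or how the prediction is used to pace the cluster, so the sketch below covers only the prediction step. It fits ordinary least squares through the normal equations and predicts a volume from a feature vector. The class name `VolumeRegression` and any choice of features (for example, the previous minute's volume and the volume at the same minute a day earlier) are assumptions made for illustration.

```java
/**
 * Ordinary least squares for y = b0 + b1*x1 + ... + bk*xk, solved through the
 * normal equations (X^T X) b = X^T y. Hypothetical class for illustration.
 */
class VolumeRegression {
    private final double[] coef; // coef[0] is the intercept

    VolumeRegression(double[][] features, double[] targets) {
        int n = features.length;
        int k = features[0].length + 1;          // +1 for the intercept column
        double[][] a = new double[k][k + 1];     // augmented matrix [X^T X | X^T y]
        for (int r = 0; r < n; r++) {
            double[] row = new double[k];
            row[0] = 1.0;
            System.arraycopy(features[r], 0, row, 1, k - 1);
            for (int i = 0; i < k; i++) {
                for (int j = 0; j < k; j++) {
                    a[i][j] += row[i] * row[j];
                }
                a[i][k] += row[i] * targets[r];
            }
        }
        this.coef = solve(a, k);
    }

    /** Gauss-Jordan elimination with partial pivoting on a k x (k+1) matrix. */
    private static double[] solve(double[][] a, int k) {
        for (int col = 0; col < k; col++) {
            int pivot = col;
            for (int r = col + 1; r < k; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < 1e-12) {
                throw new IllegalArgumentException("features are collinear; cannot fit");
            }
            for (int r = 0; r < k; r++) {
                if (r == col) {
                    continue;
                }
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= k; c++) {
                    a[r][c] -= factor * a[col][c];
                }
            }
        }
        double[] x = new double[k];
        for (int i = 0; i < k; i++) {
            x[i] = a[i][k] / a[i][i];
        }
        return x;
    }

    /** Predicted volume for one feature vector. */
    double predict(double[] features) {
        double y = coef[0];
        for (int i = 0; i < features.length; i++) {
            y += coef[i + 1] * features[i];
        }
        return y;
    }
}
```

In use, the model would be refit periodically on recent per-minute counts, and the value returned by `predict` would inform how much work to admit in the coming minute. That pacing policy is not described in the source and is left out here.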

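// Pseudocode for the Cellar-based computation strategy (item 3 below).
// `cellar` is the Cellar KV client; the source does not give its exact API,
// so variable types are left undeclared as in the original.
public void doSomeWork(String input) {
    // Mark the ID as in progress before writing any data.
    cellar.mapPut("uniq_ID");
    cellar.add("uniq_ID_1", "some data");
    cellar.add("uniq_ID_2", "some data again");
    ...
    // Clear the in-progress marker once the writes are done.
    cellar.mapRemove("uniq_ID");
}

// Repair step: revisits IDs whose marker was never cleared.
public void remedySomething() {
    map = cellar.mapGetAll();
    version = cellar.mapGet("uniq_ID").getVersion();
    for (String str : map) {
        // A version mismatch is treated as an incomplete write and redone.
        if (cellar.get(str + "_1").getVersion() != version) {
            cellar.add(str + "_1", "some data");
            cellar.mapRemove(str);
        }
        ...
    }
}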

3. Computation strategy: Computation relies on a distributed KV store (Cellar) that provides near-zero-latency I/O and supports atomic operations. Distributed locks (setNx) and a version mechanism guarantee correctness under concurrent updates and across system restarts.
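
To make the two mechanisms concrete, here is a minimal in-memory sketch of a set-if-absent lock and a versioned write. It is a stand-in built on `ConcurrentHashMap`, not the Cellar API: the class `VersionedStore` and the methods `tryLock`, `unlock`, `putIfVersion`, and `increment` are hypothetical names chosen for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory stand-in for a KV store with set-if-absent and versioned writes. */
class VersionedStore {
    /** Immutable value together with the version it was written at. */
    static final class Entry {
        final long value;
        final int version;

        Entry(long value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final ConcurrentMap<String, Entry> data = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, String> locks = new ConcurrentHashMap<>();

    /** setNx-style lock: succeeds only if nobody currently holds the key. */
    boolean tryLock(String key, String owner) {
        return locks.putIfAbsent(key, owner) == null;
    }

    /** Releases the lock only if `owner` is the current holder. */
    boolean unlock(String key, String owner) {
        return locks.remove(key, owner);
    }

    Entry get(String key) {
        return data.get(key);
    }

    /** Writes only if the stored version equals expectedVersion (0 means absent). */
    boolean putIfVersion(String key, long value, int expectedVersion) {
        Entry next = new Entry(value, expectedVersion + 1);
        if (expectedVersion == 0) {
            return data.putIfAbsent(key, next) == null;
        }
        Entry current = data.get(key);
        return current != null
                && current.version == expectedVersion
                && data.replace(key, current, next);
    }

    /** Optimistic increment: re-read and retry until the versioned write wins. */
    long increment(String key, long delta) {
        while (true) {
            Entry current = data.get(key);
            long base = current == null ? 0L : current.value;
            int version = current == null ? 0 : current.version;
            if (putIfVersion(key, base + delta, version)) {
                return base + delta;
            }
        }
    }
}
```

The lock serializes a critical section across workers, while the version check lets a writer detect that someone else updated the key since it was read. A real distributed lock would also need an expiry so that a crashed holder does not block others indefinitely; that is omitted here for brevity.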

4. Storage layers: Kafka stores lightly processed detail data, MySQL holds intermediate results for visualization, and Cellar stores final results that applications query directly.

Application cases:

• Real‑time business data cards in Meituan’s “Open Store” product, helping merchants make timely decisions.

• Real‑time heat‑tag labels for Meituan‑Dianping financial partner stores, enhancing marketing visibility.

Conclusion and outlook: The system's framework has been partially deployed, but further work is needed to achieve comprehensive real-time analytics. Future efforts will focus on expanding to data dashboards, user behavior analysis, and marketing effect tracking, leveraging the "4V+1O" characteristics of big data (volume, variety, velocity, and value, plus online).
