
Applying Apache Doris in Meituan Food Delivery Data Warehouse: Dual Engine Architecture and Performance Optimizations

The article details Meituan's food‑delivery data warehouse transformation from a MOLAP‑centric design to a dual‑engine (MOLAP + ROLAP) architecture powered by Apache Doris, describing the challenges of massive, mutable data, the technical trade‑offs, and the performance gains achieved through MPP, predicate push‑down, multi‑instance concurrency, colocate joins, and bitmap aggregation.

Qunar Tech Salon

Meituan's food‑delivery data warehouse originally relied on a MOLAP engine (Apache Kylin) for pre‑computed cubes and a traditional RDBMS (MySQL) for detail queries. This design ran into scalability problems: business dimensions changed rapidly, the volume of historical data was massive, and fine‑grained detail queries could not be supported.

To address these challenges, the team introduced a dual‑engine architecture that combines MOLAP for stable, pre‑aggregated workloads and ROLAP powered by Apache Doris for on‑demand, high‑throughput analytics. Doris, an MPP‑based OLAP engine, offers strong parallel processing, low‑latency query execution, and native support for both summary and detail queries.

The article compares the disadvantages of MOLAP (complex model preparation, costly pre‑computation, lack of detail query support) with the advantages of ROLAP (simplified model design, flexible view‑based business logic, support for both aggregated and detailed data, reduced production cost).
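The trade‑off can be seen in a toy example. The sketch below is only an illustration in plain Python, not Kylin or Doris code; the ORDERS rows and the helper names build_cube and rolap_sum are made up for this purpose. A cube pre‑aggregated by one dimension answers only the questions it was built for, while aggregating detail rows at query time can group by any column and still return the detail rows themselves.

```python
from collections import defaultdict

# Hypothetical detail rows: (order_id, city, merchant, amount).
ORDERS = [
    (1, "Beijing", "m1", 30.0),
    (2, "Beijing", "m2", 20.0),
    (3, "Shanghai", "m1", 50.0),
]


def build_cube(rows):
    """MOLAP-style: pre-aggregate once by a fixed dimension (city)."""
    cube = defaultdict(float)
    for _order_id, city, _merchant, amount in rows:
        cube[city] += amount
    return dict(cube)


def rolap_sum(rows, key_index, predicate=lambda row: True):
    """ROLAP-style: aggregate detail rows at query time by any column."""
    totals = defaultdict(float)
    for row in rows:
        if predicate(row):
            totals[row[key_index]] += row[3]
    return dict(totals)


# The cube answers "amount by city" instantly, but nothing else.
cube = build_cube(ORDERS)

# On-demand aggregation gives the same answer and also handles a
# dimension the cube was never built for, plus the raw detail rows.
by_city = rolap_sum(ORDERS, key_index=1)
by_merchant = rolap_sum(ORDERS, key_index=2)
beijing_detail = [row for row in ORDERS if row[1] == "Beijing"]
```

Adding a new dimension to the cube means rebuilding it over all history, whereas the on‑demand path only needs a different key_index, which is the flexibility the ROLAP side of the comparison refers to.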

After evaluating several open‑source OLAP solutions (Greenplum, Impala, Presto, ClickHouse, Druid, TiDB, etc.), Doris was selected for its MPP architecture, compatibility with the MySQL protocol, and ease of integration into Meituan's technology stack.

Doris’s architecture consists of a Frontend (FE) for query parsing, optimization, and metadata management, and multiple Backends (BE) for query execution and storage. Its key features include high‑concurrency point queries, batch and real‑time data ingestion, support for both aggregation and detail queries, schema evolution, and advanced join strategies.
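As a rough mental model of this split, the sketch below shows a coordinator fanning a query out to storage nodes and merging their partial results. It is a toy in plain Python under simplifying assumptions: the Frontend and Backend classes and their methods are hypothetical names, and the real engine's planning, fragment scheduling, and result merging are considerably more involved.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class Backend:
    """Toy BE: stores one shard of rows and runs a partial aggregation."""

    def __init__(self, rows):
        self.rows = rows  # list of (city, amount)

    def partial_sum(self, city_filter=None):
        partial = Counter()
        for city, amount in self.rows:
            if city_filter is None or city == city_filter:
                partial[city] += amount
        return partial


class Frontend:
    """Toy FE: fans a query out to every backend and merges the partials."""

    def __init__(self, backends):
        self.backends = backends

    def sum_by_city(self, city_filter=None):
        # Each backend scans only its own shard, in parallel.
        with ThreadPoolExecutor(max_workers=len(self.backends)) as pool:
            partials = list(
                pool.map(lambda be: be.partial_sum(city_filter), self.backends)
            )
        total = Counter()
        for partial in partials:
            total.update(partial)  # Counter.update adds the values
        return dict(total)


backends = [
    Backend([("Beijing", 30), ("Shanghai", 50)]),
    Backend([("Beijing", 20)]),
    Backend([("Shanghai", 5), ("Chengdu", 7)]),
]
frontend = Frontend(backends)
totals = frontend.sum_by_city()
```

Because each node returns one small partial result per group rather than its raw rows, adding backends spreads the scan work without growing the merge step much.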

Performance benchmarks show that a Doris cluster of 20 BE and 3 FE nodes can serve dozens of analytical products with millisecond‑level response times, handle multi‑billion‑row joins in seconds, and support daily‑level real‑time calculations while keeping query latency within seconds for most workloads.

For near‑real‑time analytics, the team built a Lambda‑style architecture on Doris, enabling 10‑15 minute data freshness and seamless integration of batch and streaming data, reducing development and operational costs compared to traditional Flink or Storm windowed computations.
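The article does not spell out how the batch and streaming sides are combined, so the following is only a generic sketch of the Lambda idea in plain Python. The serve function, the cutoff convention (batch covers everything before the cutoff), and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def serve(batch_view, realtime_increments, batch_cutoff):
    """Merge a batch view with real-time increments at or after the cutoff.

    batch_view: {key: total} computed by the batch job for data before
        batch_cutoff.
    realtime_increments: iterable of (event_time, key, amount) loaded in
        small micro-batches after the last batch run.
    """
    merged = defaultdict(float, batch_view)
    for event_time, key, amount in realtime_increments:
        # Rows before the cutoff are already counted in the batch view.
        if event_time >= batch_cutoff:
            merged[key] += amount
    return dict(merged)


batch_cutoff = datetime(2024, 1, 2, 0, 0)
batch_view = {"Beijing": 100.0}
realtime_increments = [
    (datetime(2024, 1, 1, 23, 50), "Beijing", 5.0),   # already in batch
    (datetime(2024, 1, 2, 0, 10), "Beijing", 10.0),
    (datetime(2024, 1, 2, 0, 25), "Shanghai", 3.0),
]

current = serve(batch_view, realtime_increments, batch_cutoff)
```

Freshness in this model is bounded by how often increments are loaded, which is where a 10‑15 minute figure would come from if loads run at that interval.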

Several engine‑level optimizations were implemented:

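Join predicate push‑down (constant propagation) to filter tables early. For example, in select * from t1 join t2 on t1.id = t2.id where t1.id = 1, the join condition implies t2.id = 1, so the filter can be applied to both tables before the join, which reduces scan volume dramatically.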

Multi‑instance concurrency on each node, increasing parallelism and achieving 3‑5× query speed‑up.

Colocate Join, which shards data by join key to eliminate network shuffle during joins.

Bitmap aggregation for precise distinct counting, replacing costly group‑by scans with compact bitmaps that are merged by set union.
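Of the optimizations listed above, the colocate join is perhaps the easiest to see in miniature. The sketch below is a toy in plain Python rather than Doris internals: NUM_NODES, node_for, shard, and colocated_join are hypothetical names, and CRC32 merely stands in for whatever bucketing function the engine uses. Both tables are sharded by the join key with the same function, so matching rows always land on the same node and each node can join its own shards without receiving rows from any other node.

```python
from collections import defaultdict
from zlib import crc32

NUM_NODES = 3  # hypothetical cluster size


def node_for(key):
    """Deterministic shard assignment shared by both tables."""
    return crc32(str(key).encode()) % NUM_NODES


def shard(rows, key_index):
    """Distribute rows across nodes by hashing the join key."""
    shards = defaultdict(list)
    for row in rows:
        shards[node_for(row[key_index])].append(row)
    return shards


def colocated_join(left_shards, right_shards):
    """Join node by node; no row leaves the node that stores it."""
    out = []
    for node in range(NUM_NODES):
        # Build a local hash table from this node's right-side shard.
        right_by_key = defaultdict(list)
        for row in right_shards.get(node, []):
            right_by_key[row[0]].append(row)
        # Probe it with this node's left-side shard only.
        for lrow in left_shards.get(node, []):
            for rrow in right_by_key.get(lrow[0], []):
                out.append(lrow + rrow[1:])
    return out


orders = [(1, "o-a"), (2, "o-b"), (2, "o-c"), (4, "o-d")]   # (merchant_id, order)
merchants = [(1, "Noodles"), (2, "Dumplings"), (3, "Tea")]  # (merchant_id, name)

joined = colocated_join(shard(orders, 0), shard(merchants, 0))
```

If the two tables were sharded by different keys, or with different bucket counts, the same join would first have to move one side across the network so that matching keys meet, and that is the shuffle this layout avoids.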

These enhancements dramatically lowered I/O, CPU, and memory consumption for high‑cardinality distinct‑count queries, enabling sub‑second responses even on billions of rows.
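The reason bitmaps help can be sketched in a few lines. This is an illustration in plain Python that uses an integer as an uncompressed bitmap; production engines typically use compressed bitmaps such as Roaring bitmaps, and the helper names and sample ids here are made up. Each shard encodes the ids it has seen as set bits, partial bitmaps are merged with a bitwise OR, and the exact distinct count is the number of set bits.

```python
def to_bitmap(user_ids):
    """Set bit i for every non-negative integer user id i."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap


def union(bitmaps):
    """Merge partial bitmaps; duplicates across shards collapse."""
    merged = 0
    for bitmap in bitmaps:
        merged |= bitmap
    return merged


def cardinality(bitmap):
    """Exact distinct count = number of set bits."""
    return bin(bitmap).count("1")


# Hypothetical user ids seen on three shards, with overlaps.
shard_user_ids = [[1, 2, 3, 3], [3, 4], [1, 5, 5]]

partials = [to_bitmap(ids) for ids in shard_user_ids]
distinct_users = cardinality(union(partials))
```

Each shard ships one bitmap instead of its full list of ids, and because union is associative the partial results can be merged in any order or pre‑aggregated at load time.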

In conclusion, the Doris‑driven ROLAP mode effectively handles summary‑detail workloads, mutable dimensions, and near‑real‑time requirements, offering a viable alternative to Kylin, Druid, and Elasticsearch for Meituan’s data‑intensive operations.
