Evolution and Construction of Huolala's Doris‑Based OLAP System
This article details Huolala's journey from a MySQL‑centric analytics pipeline to a multi‑engine OLAP platform built on Doris. It covers system architecture, data flow, stage‑wise evolution, engine selection, POC validation, performance tuning, stability measures, and the roadmap toward self‑service analytics.
Huolala, founded in 2013, operates in over 352 Chinese cities with more than 580,000 active drivers and 7.6 million monthly users; its data platform now runs on thousands of machines, storing over 20 PB and processing 20k+ daily tasks.
The data architecture is organized into five layers: a foundation and ingestion layer that provides storage and compute, a platform layer (data development and governance), a data‑warehouse layer, and a service layer and an application layer that expose business‑oriented APIs.
The end‑to‑end data flow includes real‑time and batch ingestion, storage, computation, and data services, supporting real‑time, offline, and online scenarios via Flink, MySQL, HBase, and OLAP engines.
Since 2021, Huolala has iterated its OLAP system through three phases. Phase 1.0 (the “incubation” stage) relied on MySQL for aggregated results, exposing limitations such as storage bottlenecks, high development cost, and insufficient dimensional analysis.
Phase 2.0 (the “refinement” stage) evaluated Druid, ClickHouse, Kylin, Presto, and Doris, and selected Druid for its Java‑centric implementation and scalability. A thorough POC covered functional, performance, and data‑quality verification, and was followed by stability enhancements (segment tuning, materialized views, flame‑graph analysis).
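The data‑quality part of such a POC can be pictured as comparing the same aggregate computed by a trusted baseline and by the candidate engine. The sketch below is only an illustration under that assumption: the function name `diff_aggregates`, the key layout, and the sample figures are hypothetical and do not come from Huolala's actual test suite.

```python
import math

def diff_aggregates(baseline, candidate, rel_tol=1e-9):
    """Return the keys whose metric differs between two result sets.

    Both arguments map a dimension key (for example a (date, city) tuple)
    to a numeric metric. A key missing on either side counts as a mismatch.
    """
    mismatches = {}
    for key in baseline.keys() | candidate.keys():
        expected = baseline.get(key)
        actual = candidate.get(key)
        if expected is None or actual is None:
            mismatches[key] = (expected, actual)
        elif not math.isclose(expected, actual, rel_tol=rel_tol):
            mismatches[key] = (expected, actual)
    return mismatches

# Hypothetical sample: order counts per (date, city) from the baseline
# pipeline and from the OLAP engine under evaluation.
baseline_rows = {("2021-06-01", "city_a"): 1200.0, ("2021-06-01", "city_b"): 830.0}
candidate_rows = {("2021-06-01", "city_a"): 1200.0, ("2021-06-01", "city_b"): 829.0}

mismatched = diff_aggregates(baseline_rows, candidate_rows)
print(mismatched)
```

An empty result means the two engines agree on every key within the tolerance; any entry shows the baseline value next to the candidate value for follow‑up.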
Phase 3.0 addressed growing multi‑source join requirements. After comparing Druid, ClickHouse, and Doris, the team adopted Doris for its few external dependencies, native FE/BE architecture, and efficient horizontal scaling. Additional components such as ClickHouse were retained for complex data types.
Stability assurance follows a three‑stage approach (before, during, and after an incident) with capacity planning, compaction monitoring, and incident post‑mortems. Parameter optimizations (e.g., max_running_txn_num_per_db, exec_mem_limit) and patches (e.g., StringLast functions) were applied.
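As a rough illustration of how the two named parameters are typically changed in Doris, the sketch below renders the corresponding statements: `max_running_txn_num_per_db` is a frontend (FE) configuration item and `exec_mem_limit` is a session/global variable. The values are placeholders chosen for the example, not the settings Huolala used, and the helper names are hypothetical.

```python
# Placeholder values for illustration only; the article does not state
# which values were chosen in production.
FE_CONFIG = {"max_running_txn_num_per_db": 1000}
GLOBAL_VARIABLES = {"exec_mem_limit": 8 * 1024**3}  # bytes (8 GiB)

def fe_config_statements(config):
    """Render ADMIN SET FRONTEND CONFIG statements for FE config items."""
    return [f'ADMIN SET FRONTEND CONFIG ("{key}" = "{value}");'
            for key, value in config.items()]

def global_variable_statements(variables):
    """Render SET GLOBAL statements for query-level variables."""
    return [f"SET GLOBAL {key} = {value};" for key, value in variables.items()]

statements = fe_config_statements(FE_CONFIG) + global_variable_statements(GLOBAL_VARIABLES)
for statement in statements:
    print(statement)
```

The printed statements would be run through a MySQL‑protocol client connected to the FE. Keeping the values in one reviewed mapping makes parameter changes easier to audit during post‑mortems.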
The roadmap envisions an OLAP platform that offers self‑service modeling and multi‑engine routing, with continued migration toward Doris as the primary engine while ClickHouse is maintained for specialized workloads.
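Multi‑engine routing can be sketched as a small rule function that inspects a query's profile and picks an engine. The rule below simply mirrors the division of labor described in this article (Doris by default, ClickHouse for complex data types); the `QueryProfile` fields and the `route` function are hypothetical and not part of Huolala's platform.

```python
from dataclasses import dataclass
from enum import Enum

class Engine(Enum):
    DORIS = "doris"
    CLICKHOUSE = "clickhouse"

@dataclass(frozen=True)
class QueryProfile:
    """Hypothetical summary of a query, produced by the platform's parser."""
    tables: tuple
    uses_complex_types: bool = False

def route(profile: QueryProfile) -> Engine:
    """Pick an engine: ClickHouse for complex data types, Doris otherwise."""
    if profile.uses_complex_types:
        return Engine.CLICKHOUSE
    # Default path, including multi-table joins, goes to the primary engine.
    return Engine.DORIS

join_query = QueryProfile(tables=("orders", "drivers"))
nested_query = QueryProfile(tables=("events",), uses_complex_types=True)
print(route(join_query).value, route(nested_query).value)
```

A real router would likely also weigh data freshness, table placement, and engine load; the point here is only that the routing decision lives behind one function so engines can be migrated without changing callers.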
DataFunSummit