How Qunar Built a Scalable BI Platform for Real‑Time Analytics and Self‑Service Reporting
This article details Qunar's multi‑year journey of designing and evolving a full‑stack BI platform that covers data ingestion, storage, query engines, self‑service analytics, and real‑time OLAP. The platform went through three development phases, adopting technologies such as Impala, Kudu, ClickHouse, and Apache Druid, and addressing performance, usability, and governance challenges so that business users get fast, reliable data insights.
Background
Rapid growth of Qunar's business required a BI platform that supports drag‑and‑drop reporting, ad‑hoc analysis, sub‑second query response, and trustworthy metrics.
Evolution Stages
Original stage (pre‑2016) – a monolithic end‑to‑end reporting system built by data developers.
Development stage (2016‑2018) – configurable reporting (V2), self‑service analysis, and an OLAP layer.
Systematic stage (2019‑present) – on‑the‑fly queries, self‑service email reports, third‑generation data‑report module (V3), and comprehensive governance.
Stage 1: Original
Data was extracted from logs using Hive, transformed via ETL, and loaded into MySQL. Backend services queried MySQL directly and custom front‑end pages rendered charts. This architecture suffered from low efficiency, inconsistent code quality, duplicated effort, and poor scalability.
Stage 2: Development (2016‑2018)
Key improvements:
Data developers exported ADS‑layer tables to PostgreSQL to leverage its rich analytical functions.
Self‑service analysis allowed product users to configure dimensions, metrics, and filters without writing SQL.
Real‑time pipelines used Kafka + Flink to write hot data to Kudu and cold data to HDFS as Parquet. This hybrid storage architecture let Impala serve as a unified query layer for both real‑time and offline queries: Impala + Kudu for hot data and Impala + Parquet for offline data.
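One way to picture the hybrid store is as a time boundary: rows newer than the boundary sit in Kudu, older rows sit in Parquet on HDFS, and a query that spans both is split at that point. The sketch below is illustrative only. The function name `split_time_range`, the seven-day hot window, and the tier labels are assumptions, not details of Qunar's implementation.

```python
from datetime import datetime, timedelta

# Hypothetical retention for the hot (Kudu) tier; the article does not state the real value.
HOT_WINDOW = timedelta(days=7)


def split_time_range(start, end, now, hot_window=HOT_WINDOW):
    """Split the half-open range [start, end) into cold and hot parts.

    Returns a dict mapping a tier label to a (start, end) tuple. A tier is
    omitted when the requested range does not touch it.
    """
    if start >= end:
        raise ValueError("start must be earlier than end")
    boundary = now - hot_window
    parts = {}
    if start < boundary:
        # Older than the boundary: served from Parquet files on HDFS.
        parts["cold_parquet"] = (start, min(end, boundary))
    if end > boundary:
        # Newer than the boundary: served from Kudu.
        parts["hot_kudu"] = (max(start, boundary), end)
    return parts
```

A query layer could run one sub-query per returned tier and concatenate the results, which is roughly what a unified view over both stores achieves.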
Stage 3: Systematic (2019‑present)
Major components introduced:
On‑the‑fly query & email report module: Users submit SQL, which is syntax‑checked, permission‑validated, and executed via JDBC. Results can be previewed, downloaded, or emailed.
Data‑report module (V3): Componentized chart library, low‑code drag‑and‑drop configuration stored as JSON, and a unified permission model per business unit (BU).
Real‑time OLAP: Supports hundreds of dimensions and metrics on billions of rows with sub‑second latency. After evaluating Druid, Kylin, Presto, Elasticsearch, and Impala, ClickHouse was selected for its high‑throughput query performance.
Data ingestion uses Waterdrop to load offline Hive data into ClickHouse and a ClickHouse Kafka Engine for real‑time streams. Query flow: user request → SQL parsing → ClickHouse execution → front‑end visualization.
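To make the "JSON configuration → SQL → execution" flow concrete, here is a minimal sketch of turning a stored chart configuration into a parameterized aggregate query. The config schema (`table`, `dimensions`, `metrics`, `filters`), the function `build_query`, the `%s` placeholder style, and the table and column names in the usage note are all hypothetical. The article does not describe the actual JSON format or SQL generator.

```python
import json
import re

# Only plain identifiers are accepted, so config values cannot inject SQL.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_AGGS = {"sum", "count", "avg", "min", "max"}


def _ident(name):
    if not isinstance(name, str) or not _IDENT.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def build_query(config_json):
    """Return (sql, params) for a chart config given as a JSON string.

    Supports group-by dimensions, simple aggregates, and equality filters.
    """
    cfg = json.loads(config_json)
    dims = [_ident(d) for d in cfg.get("dimensions", [])]
    selects = list(dims)
    for metric in cfg["metrics"]:
        agg = metric["agg"].lower()
        if agg not in _AGGS:
            raise ValueError(f"unsupported aggregate: {agg!r}")
        selects.append(
            f"{agg}({_ident(metric['column'])}) AS {_ident(metric['alias'])}"
        )
    sql = f"SELECT {', '.join(selects)} FROM {_ident(cfg['table'])}"
    params = []
    filters = cfg.get("filters", [])
    if filters:
        clauses = []
        for flt in filters:
            clauses.append(f"{_ident(flt['column'])} = %s")
            params.append(flt["value"])  # values travel separately from the SQL text
        sql += " WHERE " + " AND ".join(clauses)
    if dims:
        sql += " GROUP BY " + ", ".join(dims)
    return sql, params
```

For a config with table `orders`, dimension `city`, metric `sum(amount) AS total`, and a filter on `dt`, this yields `SELECT city, sum(amount) AS total FROM orders WHERE dt = %s GROUP BY city` plus the filter value as a parameter. Keeping values out of the SQL string is a design choice that makes the permission and syntax checks easier to reason about.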
Architecture Overview
Data source layer: MySQL, offline warehouses, metric system, real‑time Kafka streams.
Data ingestion layer: Waterdrop and custom pipelines import data into PostgreSQL, ClickHouse, or Druid.
Storage/engine layer: PostgreSQL/Greenplum (GP) for moderate data volumes, ClickHouse for high‑volume real‑time analytics, Druid for pre‑aggregated queries.
Data model layer: Defines dimensions and metrics based on business requirements.
Presentation layer: Visual charts, dashboards, and self‑service drag‑and‑drop configuration.
System management: Unified permission system, task scheduling, performance monitoring, and usage tracking.
Key Features
Multi‑metric calculations (e.g., deriving per‑user page views).
Integrated monitoring & alerting with QTalk and WeChat.
Data lineage visibility for each chart and metric.
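As an illustration of a multi‑metric calculation, per‑user page views can be derived by dividing page views (PV) by unique visitors (UV) for each result row. The sketch below assumes rows arrive as dictionaries with `pv` and `uv` keys; those key names and the `pv_per_user` output field are hypothetical, not taken from the platform.

```python
def add_per_user_page_views(rows):
    """Return new rows with 'pv_per_user' = pv / uv added.

    The derived value is None when uv is zero, so charts can show a gap
    instead of failing on division by zero.
    """
    result = []
    for row in rows:
        uv = row["uv"]
        ratio = row["pv"] / uv if uv else None
        result.append({**row, "pv_per_user": ratio})
    return result
```

Computing the ratio after aggregation, rather than averaging per-row ratios, keeps the derived metric consistent when the same base metrics are grouped by different dimensions.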
Performance Benchmark
Benchmark tests on identical datasets showed ClickHouse delivering the lowest query latency, comfortably meeting the 3‑second response requirement for OLAP queries involving hundreds of dimensions and billions of rows.
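A comparison of this kind can be reproduced with a small timing harness that runs the same query several times per engine and checks the slowest run against the latency budget. The sketch below is not the benchmark used in the article. `run_query` stands in for any callable that executes a query against one engine, and the run count and default budget are assumptions chosen for illustration.

```python
import time


def measure_latency(run_query, runs=5):
    """Call run_query() `runs` times and return the latencies in seconds."""
    latencies = []
    for _ in range(runs):
        started = time.perf_counter()
        run_query()
        latencies.append(time.perf_counter() - started)
    return latencies


def meets_budget(latencies, budget_seconds=3.0):
    """Return True when the slowest observed run is within the budget."""
    return max(latencies) <= budget_seconds
```

Checking the slowest run rather than the average is a deliberately strict choice: a response-time requirement is usually felt on the worst queries, not the typical ones.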
Future Plans
Mobile BI clients integrated with the company’s IM tool for subscription, interactive analysis, and alerting.
Further abstraction of platform layers to reduce maintenance overhead.
Expansion of analytical scenarios such as retention, attribution, distribution, and user‑path analysis.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
