Building an Integrated Metric Data Service Platform with Apache Doris: Architecture Evolution and Millisecond‑Level Query Performance
This article describes how Financial One Account, a technology service arm of Ping An, migrated from a Hadoop‑Presto‑Kylin stack to an Apache Doris‑based data platform, detailing the architectural evolution, OLAP engine selection, metric system design, performance optimizations, and future roadmap for real‑time analytics.
Financial One Account, a technology‑as‑a‑service provider of the Ping An Group, faced inconsistent metric definitions, duplicated calculations, and low delivery efficiency in traditional reporting. To address these pain points, it built an integrated metric data service platform based on Apache Doris, aiming to centralize metric construction, reduce ETL effort, and achieve millisecond‑level query responses.
The first‑generation architecture (Hadoop + Presto + Apache Kylin) relied on pre‑computed cubes, which proved insufficient for flexible analysis, high‑performance queries, and low operational cost. After evaluating several OLAP engines, Apache Doris was selected for its MySQL compatibility, distributed join capabilities, materialized view support, and simple two‑role (FE/BE) deployment.
In the second‑generation architecture, Doris replaced Kylin and Presto, using a Duplicate‑Key model with range‑partitioned tables to store detailed data. The MPP engine enabled high concurrency and low‑latency processing, supporting complex multi‑table joins and achieving query times as low as 63 ms in large‑scale scenarios.
Metric construction follows a three‑tier model: atomic metrics (raw fields), derived metrics (e.g., YoY, MoM), and derived metrics (custom calculations). The platform provides metadata management, dimension selection, and publishing workflows, complemented by AI‑driven anomaly detection, root‑cause analysis, and value‑scoring for metric governance.
Performance tuning involved data preparation (stream load of CSV files), cluster monitoring with Prometheus and Grafana, and two optimization schemes: colocation join and wide‑table redesign. The wide‑table approach, combined with SQL caching, reduced most query latencies to single‑digit milliseconds, achieved 300 TPS under concurrency, and saved over 30 % of ETL development effort.
Future work includes real‑time analysis with Flink CDC and Apache Iceberg, materialized view enhancements for multi‑table joins, and migration of other middle‑platform products (e.g., label platform) to Doris.
Architecture 1.0: Hadoop + Presto + Apache Kylin – limited flexibility, performance, and high maintenance cost.
Architecture 2.0: Apache Doris – high‑performance OLAP, simple deployment, and low operational overhead.
Key benefits: millisecond‑level query response, reduced ETL workload, unified metric governance, and scalable real‑time analytics.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.