
Building a Unified High‑Performance OLAP Platform with DorisDB at Beike Real Estate

The article describes how Beike Real Estate consolidated multiple OLAP engines into a single DorisDB‑based platform, detailing the business challenges, DorisDB’s technical advantages, extensive performance and concurrency benchmarks, and the resulting improvements in stability, query speed, and operational simplicity across various business scenarios.

DataFunTalk

Beike Real Estate, a technology‑driven housing service provider, needed a digital and intelligent data platform to support a wide range of services for hundreds of millions of families. The company's Big Data Platform Department operated six to seven different OLAP engines (Impala, Presto, Kylin, Druid, ClickHouse, Hive) to meet diverse analytical workloads, leading to high operational complexity and steep learning curves for users.

In 2021, the department introduced DorisDB as its primary analysis engine, aiming to unify the OLAP platform for indicator analysis, ad‑hoc queries, and visual reporting. DorisDB’s MPP architecture, columnar storage, high availability, ANSI‑SQL support, multi‑table join capability, automatically maintained materialized views, and real‑time data ingestion allowed the team to replace multiple legacy engines with a single, high‑performance solution.

The article outlines the main business pain points: lack of update support in Druid, costly mutations in ClickHouse, poor multi‑table join performance, inability to serve both detail and aggregate queries from the same engine, and the operational burden of maintaining many engines. DorisDB addresses these issues with efficient update mechanisms, robust join support, and compatibility with the MySQL protocol.

Comprehensive benchmark tests were conducted using the Star Schema Benchmark (SSB) on tables with up to 6 billion rows. In single‑node, limited‑thread scenarios, DorisDB outperformed ClickHouse in 9 out of 13 queries; without thread limits, it won 7 out of 13. In multi‑table join tests, DorisDB showed a 5‑10× speed advantage over Apache Doris. High‑concurrency stress tests demonstrated DorisDB achieving 1500‑2000 QPS with average latency around 50 ms, compared to Druid’s 600‑700 QPS and 100 ms latency.
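To make the QPS and average-latency figures above concrete, here is a minimal sketch of how such numbers are typically derived in a concurrency stress test. It is not the harness Beike used: `run_query_stub`, `stress_test`, and the chosen concurrency are hypothetical, and the stub only sleeps instead of talking to a database.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def run_query_stub():
    """Hypothetical stand-in for one metric query; swap in a real client call."""
    time.sleep(0.005)  # pretend the engine needs about 5 ms to answer


def stress_test(query_fn, concurrency, total_requests):
    """Issue total_requests calls from `concurrency` worker threads.

    QPS is completed requests divided by wall-clock time; average latency
    is the mean of the per-request durations.
    """
    def timed_call(_):
        start = time.perf_counter()
        query_fn()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, range(total_requests)))
    wall_seconds = time.perf_counter() - wall_start

    return {
        "requests": len(latencies),
        "qps": len(latencies) / wall_seconds,
        "avg_latency_ms": mean(latencies) * 1000,
    }


if __name__ == "__main__":
    report = stress_test(run_query_stub, concurrency=8, total_requests=80)
    print(f"QPS: {report['qps']:.0f}, avg latency: {report['avg_latency_ms']:.1f} ms")
```

Note that QPS and latency are measured separately: throughput depends on how many requests are in flight at once, so an engine can sustain a higher QPS at a lower average latency, as reported for DorisDB versus Druid.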

Deployment details reveal a 35‑node DorisDB cluster (80 CPU cores, 192 GB RAM, 3 TB SSD per node) with 35 BE and 3 FE instances, supporting indicator platforms, visual reporting, and various business applications. Specific benefits include high‑QPS metric queries for real‑time performance assessments, automatically refreshed materialized views, flexible data models supporting both wide‑table and join modes, and smooth migration of MySQL‑based reports to DorisDB.
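As a quick sanity check on the scale of this deployment, the per-node figures can be multiplied out. The sketch below assumes all 35 nodes are identical, as the description suggests; `NodeSpec` and `cluster_totals` are illustrative names, not part of any DorisDB tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    cpu_cores: int
    ram_gb: int
    ssd_tb: int


def cluster_totals(node_count, spec):
    """Aggregate raw resources, assuming every node matches `spec`."""
    return {
        "cpu_cores": node_count * spec.cpu_cores,
        "ram_gb": node_count * spec.ram_gb,
        "ssd_tb": node_count * spec.ssd_tb,
    }


# Per-node figures quoted in the article: 80 cores, 192 GB RAM, 3 TB SSD.
totals = cluster_totals(35, NodeSpec(cpu_cores=80, ram_gb=192, ssd_tb=3))
print(totals)  # {'cpu_cores': 2800, 'ram_gb': 6720, 'ssd_tb': 105}
```

These are raw totals only; usable storage will be lower once data replication is taken into account.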

Use cases such as A/B testing, transaction processing, risk control, live streaming platforms, and user behavior analysis have been migrated from ClickHouse or Apache Doris to DorisDB, cutting query latency by up to a factor of seven. The team reports that DorisDB surpasses Apache Doris in stability and performance, reduces operational overhead, and shows promise for integration with Hive external tables, Spark SQL, Presto, and, in the future, Elasticsearch queries.

In conclusion, the DorisDB‑driven OLAP platform provides a unified, high‑performance, and low‑maintenance solution that meets the diverse analytical needs of Beike’s business while significantly improving query efficiency and operational simplicity.

Written by

DataFunTalk
