Migrating Self-Built CDH to Alibaba Cloud: Leveraging Hologres for Unified OLAP Analysis
This case study describes how Noah Holdings migrated its self‑built CDH data platform to Alibaba Cloud, replacing Impala with Hologres to achieve faster, lower‑cost, and unified OLAP analytics that support real‑time business insights and improve overall data operations.
Noah Holdings, a Chinese wealth management firm, faced rapid data growth and increasingly diverse data service needs, which it served with a mix of databases and data‑warehouse technologies including MySQL, Impala, Greenplum, and Elasticsearch. To reduce operational costs and improve performance, the company began migrating its self‑built CDH environment to Alibaba Cloud in early 2021, introducing Hologres to replace Impala, its core OLAP component.
The original architecture relied on CDH with Impala for ad‑hoc queries, Hive for modeling, and DataX/Sqoop for data ingestion, while real‑time metrics were built with Debezium, Flink, and Kafka feeding MySQL. This setup suffered from slow query response, high maintenance overhead, fragmented query engines, and scalability challenges.
Four evaluation dimensions—standard SQL support, high‑concurrency querying, operational management, and performance—were used to compare Hologres, StarRocks, and ClickHouse. Hologres offered full SQL compatibility, strong high‑concurrency handling, comprehensive dashboards, and flexible storage modes, making it the preferred choice.
The solution moved the data platform onto Alibaba Cloud’s unified big‑data services. Offline data is processed in MaxCompute and orchestrated through DataWorks, while real‑time streams from Kafka are ingested into Hologres using Flink. Hologres external tables also bring in MaxCompute data, so both offline and real‑time queries are served by a single OLAP engine.
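As a rough illustration of the external-table step, the sketch below generates the kind of statement Hologres uses to map MaxCompute tables as foreign tables. It assumes the `IMPORT FOREIGN SCHEMA ... FROM SERVER odps_server` form described in the Hologres documentation. The project name `ods_project`, the table name `dwd_orders`, and the helper `foreign_table_ddl` are hypothetical and do not come from Noah Holdings' setup. The script only builds the SQL text and does not connect to any service.

```python
import re

# Accept only plain identifiers so names can be interpolated into SQL safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name: str) -> str:
    """Return the name unchanged if it is a plain SQL identifier, else raise."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def foreign_table_ddl(mc_project: str, tables: list[str],
                      target_schema: str = "public") -> str:
    """Build a statement that maps MaxCompute tables as Hologres foreign tables.

    Assumption: 'odps_server' is the built-in foreign server for MaxCompute,
    and the if_table_exist option refreshes an existing mapping.
    """
    table_list = ", ".join(_check(t) for t in tables)
    return (
        f"IMPORT FOREIGN SCHEMA {_check(mc_project)} "
        f"LIMIT TO ({table_list}) "
        f"FROM SERVER odps_server INTO {_check(target_schema)} "
        "OPTIONS (if_table_exist 'update');"
    )


if __name__ == "__main__":
    # Hypothetical project and table names, for illustration only.
    print(foreign_table_ddl("ods_project", ["dwd_orders"]))
```

Because Hologres speaks the PostgreSQL wire protocol, a statement like this could be run through any PostgreSQL client once connection details are available.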
Post‑migration benefits include reduced infrastructure costs, simplified architecture, halved compute resources, clearer DAG‑based task dependencies, and dramatically faster query performance (from >5 seconds to ~300 ms). Real‑time advertising analytics, user‑profile analysis with Roaring Bitmap, and API response times improved significantly, delivering a more agile and cost‑effective data platform for the financial business.
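To show why bitmaps suit user-profile analysis, here is a minimal sketch of the idea, not the Hologres implementation. Each tag maps to a bitmap whose set bits are user IDs, so "users who have all of these tags" becomes a bitwise AND. Plain Python integers stand in for compressed Roaring Bitmaps here, and the tags and user IDs are invented for illustration.

```python
from functools import reduce


def build_tag_bitmaps(user_tags: dict[int, set[str]]) -> dict[str, int]:
    """Map each tag to an integer bitmap in which bit i is set for user id i."""
    bitmaps: dict[str, int] = {}
    for user_id, tags in user_tags.items():
        for tag in tags:
            bitmaps[tag] = bitmaps.get(tag, 0) | (1 << user_id)
    return bitmaps


def audience(bitmaps: dict[str, int], required_tags: list[str]) -> int:
    """Return the bitmap of users carrying every required tag (bitwise AND)."""
    if not required_tags:
        return 0
    return reduce(lambda a, b: a & b,
                  (bitmaps.get(tag, 0) for tag in required_tags))


def count(bitmap: int) -> int:
    """Number of users in the bitmap (population count)."""
    return bin(bitmap).count("1")


def members(bitmap: int) -> list[int]:
    """User ids whose bit is set, in ascending order."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


if __name__ == "__main__":
    # Hypothetical user ids and tags, for illustration only.
    user_tags = {
        1: {"vip", "fund_buyer"},
        2: {"vip"},
        3: {"vip", "fund_buyer"},
        5: {"fund_buyer"},
    }
    bitmaps = build_tag_bitmaps(user_tags)
    segment = audience(bitmaps, ["vip", "fund_buyer"])
    print(count(segment), members(segment))
```

A Roaring Bitmap applies the same set algebra but stores the bits in compressed containers, which keeps large and sparse user-ID spaces compact.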
Big Data Technology Architecture
Exploring Open Source Big Data and AI Technologies