
Alibaba's One‑Stop Real‑Time Data Warehouse: Hologres Architecture and CCO Implementation Experience

The article reviews the shift of big‑data computing from batch to real‑time, outlines the evolution of one‑stop real‑time data warehouses, introduces Alibaba's Hologres solution and its technical advantages, and shares the CCO department’s three‑generation architecture upgrades and practical use cases.

DataFunSummit

Introduction – Big‑data computing is shifting its emphasis from sheer scale to real‑time, low‑latency processing, and that shift exposes many pain points. This write‑up summarizes a DataFunTalk presentation by senior Alibaba expert Jiang Weihua on building a one‑stop real‑time data warehouse with Hologres.

1. Evolution of Real‑time Data Warehouses

Traditional real‑time warehouses fall into two scenarios: OLAP for internal analytics and KV‑based serving for online services. Maintaining separate stacks leads to complexity, data silos, high development cost, and slow business response.

2. Need for Agility in Real‑time Big Data

Business‑self‑service development: low‑code, SQL‑first, visual configuration.

No learning curve: treat the platform like a database, support standard SQL and tools such as Tableau.

Development agility: write‑and‑analyze, store raw data, enable rapid iteration.

All these trends require a powerful real‑time warehouse engine.

3. One‑Stop Real‑time Warehouse Concept

Instead of separating OLAP and serving, Alibaba proposes a unified “Hybrid Serving/Analytics Processing (HSAP)” model, realized by Hologres.
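To make the HSAP idea concrete, here is a minimal sketch in which one table answers both a serving‑style point lookup and an OLAP‑style aggregate, with no copy into a separate KV system. It uses Python's built‑in sqlite3 purely as a stand‑in; the `orders` table, its columns, and both helper functions are hypothetical and say nothing about how Hologres is implemented.

```python
import sqlite3

# Stand-in for a unified store: a single table, no second copy for serving.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, buyer TEXT, category TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "alice", "books", 30.0),
        (2, "bob", "books", 20.0),
        (3, "alice", "toys", 50.0),
    ],
)


def serve_order(order_id):
    """Serving-style access: primary-key point lookup for an online service."""
    return conn.execute(
        "SELECT buyer, category, amount FROM orders WHERE order_id = ?",
        (order_id,),
    ).fetchone()


def revenue_by_category():
    """OLAP-style access: aggregate over the same table for analytics."""
    rows = conn.execute(
        "SELECT category, SUM(amount) FROM orders "
        "GROUP BY category ORDER BY category"
    ).fetchall()
    return dict(rows)


print(serve_order(3))
print(revenue_by_category())
```

Because both functions read the same table, a newly written row is visible to the lookup and to the aggregate at once, which is the property that separate OLAP and serving stacks have to recreate with synchronization jobs.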

4. Hologres Overview

Hologres is a self‑developed Alibaba product validated in multiple core scenarios (e‑commerce, advertising, logistics, AI customer service, etc.). It supports >11 billion writes/sec during Double 11, >2000 QPS OLAP queries, and PB‑level storage.

Combined with Alibaba’s data‑product matrix (DataWorks, MaxCompute, Flink, DLF), Hologres enables real‑time‑offline integration, analysis‑service integration, lake‑warehouse integration, and stream‑batch integration.

5. Architectural Evolution

2020: Separate row‑store for serving and column‑store for OLAP, requiring duplicate data.

2021: Row‑column co‑existence (One‑Data, Multi‑Workload) with strong consistency, read‑write separation, high availability.

Beyond 2021: Real‑time materialized views and SQL‑expressed data pipelines to achieve full “one‑stop” across horizontal (multiple workloads) and vertical (data processing) dimensions.
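The 2021 and later stages can be illustrated with a toy, single‑threaded sketch: one write updates a row layout (for point lookups), a column layout (for scans), and an incrementally maintained aggregate that stands in for a real‑time materialized view. The class `RowColumnTable` and all of its parameters are hypothetical names chosen for illustration; the talk does not describe Hologres internals at this level.

```python
from collections import defaultdict


class RowColumnTable:
    """Toy table keeping a row layout and a column layout of the same data."""

    def __init__(self, columns, key, group_by, measure):
        self.columns = list(columns)
        self.key = key
        self.rows = {}                              # row layout: pk -> row dict
        self.cols = {c: [] for c in self.columns}   # column layout: name -> values
        self.position = {}                          # pk -> index in column arrays
        # Incrementally maintained SUM(measure) GROUP BY group_by.
        self.group_by = group_by
        self.measure = measure
        self.view = defaultdict(float)

    def upsert(self, row):
        pk = row[self.key]
        old = self.rows.get(pk)
        if old is not None:
            # Retract the old row's contribution, then overwrite in place.
            self.view[old[self.group_by]] -= old[self.measure]
            idx = self.position[pk]
            for c in self.columns:
                self.cols[c][idx] = row[c]
        else:
            self.position[pk] = len(self.cols[self.key])
            for c in self.columns:
                self.cols[c].append(row[c])
        self.rows[pk] = dict(row)
        self.view[row[self.group_by]] += row[self.measure]

    def get(self, pk):
        """Serving path: point lookup on the row layout."""
        return self.rows.get(pk)

    def column_sum(self, name):
        """OLAP path: scan a single column."""
        return sum(self.cols[name])


table = RowColumnTable(["id", "city", "amount"], key="id",
                       group_by="city", measure="amount")
table.upsert({"id": 1, "city": "Hangzhou", "amount": 10.0})
table.upsert({"id": 2, "city": "Beijing", "amount": 5.0})
table.upsert({"id": 1, "city": "Hangzhou", "amount": 12.0})  # update of id 1

print(table.get(1))
print(table.column_sum("amount"))
print(dict(table.view))
```

The point of the sketch is that the application writes once and every access path stays in step, instead of maintaining a second copy of the data as in the 2020 design. A real engine also has to provide this under concurrency and failures, which the sketch does not attempt.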

6. CCO Department Real‑time Warehouse Journey

The CCO (Chief Customer Office) handles customer experience across B2B and B2C scenarios such as on‑site customer‑service scheduling, shopping‑chain early warning, and AI‑powered intelligent services. Their real‑time warehouse runs thousands of Flink jobs, writes >40 million rows/sec, and stores both row and column tables in Hologres.

Three generations of architecture:

1.0 Traditional (2016‑2017): Lambda‑style batch‑pre‑compute written to HBase/MySQL, high complexity.

2.0 Stream‑Batch Integrated (2018‑2020): Layered DWD/DWS/ADS with DataHub, still many data copies and synchronization tasks.

3.0 High‑Availability One‑Stop (2020‑present): Real‑time writes via Flink into Hologres, offline data from MaxCompute also lands in Hologres, providing unified storage for both serving and OLAP. Binlog enables downstream processing.

Key benefits of 3.0 include stream‑batch unification, tight Flink integration, high availability with read‑write isolation, and seamless metadata management.
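The 3.0 data flow can be sketched as follows: offline and real‑time writers upsert into the same table, every change is appended to a binlog, and a downstream job derives a secondary result from that log. This is a minimal in‑memory model. `BinlogTable`, the event‑type strings, and the ticket fields are assumptions made for illustration, not the actual Hologres binlog format or the Flink connector API.

```python
class BinlogTable:
    """Toy primary-key table that records every change as a binlog event."""

    def __init__(self):
        self.rows = {}     # pk -> current row
        self.binlog = []   # ordered list of (event_type, row_snapshot)

    def upsert(self, pk, row):
        old = self.rows.get(pk)
        if old is None:
            self.binlog.append(("INSERT", dict(row)))
        else:
            # An update is logged as the old image followed by the new image.
            self.binlog.append(("BEFORE_UPDATE", dict(old)))
            self.binlog.append(("AFTER_UPDATE", dict(row)))
        self.rows[pk] = dict(row)


def open_tickets_from_binlog(binlog):
    """Downstream job: count open tickets per queue from change events."""
    counts = {}
    for event_type, row in binlog:
        if row["status"] != "open":
            continue
        # The old image of an open ticket retracts it; other images add it.
        delta = -1 if event_type == "BEFORE_UPDATE" else 1
        counts[row["queue"]] = counts.get(row["queue"], 0) + delta
    return counts


tickets = BinlogTable()

# Offline load (stand-in for data arriving from MaxCompute).
for pk, row in [(1, {"queue": "refund", "status": "open"}),
                (2, {"queue": "refund", "status": "open"})]:
    tickets.upsert(pk, row)

# Real-time writes (stand-in for a Flink job writing to the same table).
tickets.upsert(3, {"queue": "delivery", "status": "open"})
tickets.upsert(1, {"queue": "refund", "status": "closed"})

print(open_tickets_from_binlog(tickets.binlog))
```

Logging both the old and the new image of an update lets the consumer retract stale contributions without reading the table again, which is what makes the downstream aggregate cheap to maintain.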

7. Practical Use Cases

Customer Resource Management: Real‑time dashboards built via BI tools on Hologres, reducing monitoring latency to minutes.

User Voice Insight: Real‑time ingestion of shopping‑chain data with Binlog‑driven secondary processing, serving analytical queries from more than 20 business units (BUs).

Intelligent Service (AI customer service): Online query service powered by Hologres with Proxima vector search for knowledge‑base retrieval, dramatically improving response accuracy.
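To show what vector search for knowledge‑base retrieval means in the intelligent‑service case, the sketch below ranks a few entries by cosine similarity to a query embedding. It is an exact brute‑force search over made‑up three‑dimensional vectors; Proxima is an approximate vector‑search engine and its API is not shown here. The entry names and the `top_k` helper are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, knowledge_base, k=2):
    """Return the ids of the k entries most similar to the query vector."""
    scored = [(cosine(query, vec), doc_id)
              for doc_id, vec in knowledge_base.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Made-up embeddings; a real system would obtain them from an embedding model.
knowledge_base = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_delay": [0.1, 0.9, 0.1],
    "account_login": [0.0, 0.1, 0.9],
}

question_embedding = [0.8, 0.2, 0.0]
print(top_k(question_embedding, knowledge_base, k=2))
```

Brute force scans every entry on each query, so it only works for small collections. An index‑based engine trades a little recall for much lower latency at knowledge‑base scale.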

Conclusion – By adopting Alibaba’s one‑stop real‑time data warehouse (Hologres), organizations can eliminate common big‑data pain points, achieve unified storage, and accelerate business growth.

Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
