
Alibaba HiStore: Massive Columnar Database for Historical Data Storage and Query

Alibaba's HiStore columnar database, deployed for e‑commerce historical data, processes over 6 trillion records and 5 PB of data daily, offering high compression, low cost, linear scalability, MySQL compatibility, and superior OLAP performance for massive multi‑dimensional queries.

Alibaba Cloud Infrastructure

Alibaba's e‑commerce business stores and queries massive historical data using the HiStore columnar database; on Double 11, HiStore processed more than 6 trillion records and over 5 PB of raw data, making it the world’s largest columnar database by daily volume.

HiStore targets query‑intensive workloads such as data warehousing and data mining, where data is rarely inserted or updated but must serve high‑concurrency, multi‑dimensional queries.

Traditional row‑store databases struggle with large data volumes and with multi‑dimensional query performance. HiStore, Alibaba's self‑developed, distributed, low‑cost analytical database, combines the advantages of column storage with high compression, massive data processing capacity, and strong cost‑performance.

Relying on Alibaba middleware (Aliware) to meet world‑class challenges

HiStore's architecture uses column‑based storage, column compression, parallel processing, snapshot concurrency control, and intelligent indexing. Together these deliver lower cost and strong performance for queries, statistics, analytics, and bulk loading. HiStore is built by the Aliware team to handle the massive traffic and stability demands of Alibaba's global e‑commerce platforms.
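To illustrate why column‑based storage and column compression work well together, here is a minimal Python sketch. It is not HiStore's actual storage format or compression algorithm, which the article does not describe; run‑length encoding is used only as a simple stand‑in, and the sample rows and function names are hypothetical.

```python
from itertools import groupby

# Hypothetical sample rows: (order_id, region, status).
rows = [
    (1, "east", "paid"),
    (2, "east", "paid"),
    (3, "east", "shipped"),
    (4, "west", "shipped"),
    (5, "west", "shipped"),
    (6, "west", "shipped"),
]


def to_columns(row_tuples):
    """Transpose row tuples into one list per column (a columnar layout)."""
    return [list(col) for col in zip(*row_tuples)]


def rle_encode(values):
    """Run-length encode a sequence as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


order_ids, regions, statuses = to_columns(rows)

# Values of the same column sit next to each other, so repeats form long runs.
encoded_regions = rle_encode(regions)
encoded_statuses = rle_encode(statuses)

print(encoded_regions)
print(encoded_statuses)
```

Because each column holds values of a single type and often a small set of distinct values, adjacent repeats collapse into a few pairs. In a row layout the same values are interleaved with other fields and rarely form runs.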

Outstanding HiStore performance in OLAP scenarios

Compared with other columnar products such as SAP HANA, HP Vertica, Teradata, InfiniDB, MonetDB, and ClickHouse, HiStore offers high‑performance multi‑dimensional queries, multi‑core concurrency, DML support, ALTER TABLE, temporary tables, high availability, heterogeneous data import, fast data loading, advanced compression algorithms, and MVCC.

Key advantages include:

1. Significantly reduced hardware cost: transparent compression achieves average ratios >10:1, up to 40:1 in some cases.

2. Large storage capacity: high‑speed loading (2 TB/hour) and compression >10:1 enable TB‑scale data and billions of records.

3. Support for high concurrency and real‑time multi‑dimensional ad‑hoc queries, delivering second‑level retrieval on massive datasets.

4. Full MySQL compatibility, supporting the MySQL ecosystem’s tools and applications.

5. Linear scalability when combined with TDDL/DRDS, allowing storage and processing capacity to grow proportionally.

6. In OLAP workloads, HiStore's query performance matches that of competitors, while it costs only one‑third as much as InfiniDB and loads data twice as fast.
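The compression ratios and loading speed quoted in the list above translate into capacity planning with simple arithmetic. The following sketch works through it for a hypothetical 100 TB raw dataset. It assumes that the 2 TB/hour rate applies to raw (uncompressed) input, which the article does not state, and the function names are illustrative only.

```python
def compressed_size_tb(raw_tb, ratio):
    """Estimated on-disk size in TB for raw_tb of data at a ratio:1 compression."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return raw_tb / ratio


def load_hours(raw_tb, tb_per_hour=2.0):
    """Estimated hours to load raw_tb, assuming the rate applies to raw input."""
    if tb_per_hour <= 0:
        raise ValueError("load rate must be positive")
    return raw_tb / tb_per_hour


raw_tb = 100.0  # hypothetical dataset size, not a figure from the article

typical_tb = compressed_size_tb(raw_tb, 10)  # average ratio quoted: >10:1
best_tb = compressed_size_tb(raw_tb, 40)     # best case quoted: 40:1
hours = load_hours(raw_tb)                   # loading speed quoted: 2 TB/hour

print(f"10:1 -> {typical_tb} TB, 40:1 -> {best_tb} TB, load time {hours} h")
```

Under these assumptions, 100 TB of raw data would occupy about 10 TB at 10:1, 2.5 TB at 40:1, and take roughly 50 hours to load on a single loading path.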

High compression plus column storage: EagleEye cluster size reduced by 90%

Alibaba’s internal EagleEye historical data system, processing trillions of records daily and hundreds of TB of data, reduced cluster size by 90% and achieved a 20:1 compression ratio after adopting HiStore, dramatically cutting costs. The security department’s risk control data also benefits from an average 10:1 compression and millisecond‑level multi‑dimensional aggregation.

Real‑time multi‑dimensional queries: excellent performance on the social security cloud

Since February 2016, the Ministry of Human Resources and Social Security’s information center has used HiStore in the LEAF6 cloud platform, handling 50 million citizens, about 800 billion records, and single tables with 33 billion rows; queries involving online grouping and multi‑table joins have shown excellent performance.

Looking forward, HiStore will continue to improve its performance, cost‑effectiveness, and high availability, drawing on Alibaba's extensive internal and external business scenarios, while enhancing its service‑oriented ecosystem and enterprise‑grade control platform.
