How Hologres Achieves Record‑Breaking Real‑Time Data Warehouse Performance

This article introduces Alibaba Cloud's Hologres real‑time data warehouse, explains its architecture, table and index types, and provides a step‑by‑step guide for setting up instances, importing data, and running OLAP and KV performance tests using TPC‑H benchmarks.

Alibaba Cloud Big Data AI Platform

Hologres Overview

Hologres is Alibaba Cloud's self‑developed, one‑stop real‑time data warehouse engine, offering unified, elastic, and easy‑to‑use analytics. It has set a TPC‑H world record and supports OLAP, ad‑hoc queries, point lookups, and vector computation, so a single service can take the place of a traditional OLAP engine plus a separate KV database.

Test Process

The test runs on a 96 CU instance. The OLAP part runs the standard TPC‑H queries; the serving part covers insert and update workloads.

Instance Creation

Hologres separates storage and compute. Storage uses high‑performance Pangu DFS and supports row, column, and hybrid formats. Compute runs in containerized nodes (e.g., 16CU per container).

Control Panel

After creation, the instance appears in the console where you can view specifications, VPC domain, and monitoring metrics.

Monitoring Metrics

Each instance provides over 15 metrics, including QPS, RPS, latency, binlog, serverless status, lock health, and analyze statistics.

Connecting to the Database

Hologres is PostgreSQL‑compatible; you can connect with psql or other PostgreSQL tools. DataWorks DataStudio is recommended for development and scheduling, while HoloWeb serves as an operations and diagnostics platform.
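
Because Hologres is PostgreSQL‑compatible, a standard libpq‑style connection URI should be enough for psql and most PostgreSQL drivers. The sketch below only assembles such a URI. The endpoint, port, database name, and credentials are placeholders chosen for illustration, not values from this test.

```python
from urllib.parse import quote


def build_pg_uri(user: str, password: str, host: str, port: int, dbname: str) -> str:
    """Return a libpq-style URI; user, password, and dbname are percent-encoded."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}"
    )


# Placeholder values (assumptions): substitute the VPC endpoint and port
# shown in the instance console, and your own account credentials.
uri = build_pg_uri(
    user="my_access_id",
    password="my/secret+key",
    host="hologres-instance.example.internal",
    port=80,
    dbname="tpch_test",
)
print(uri)
```

The resulting string can be passed to psql as its connection argument. Percent‑encoding matters here because secrets often contain characters such as / or + that would otherwise break the URI.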

Table Creation

Hologres supports internal and external tables. Internal tables include column‑store (default for OLAP), row‑store (for KV or Flink dimension tables), and hybrid row‑column tables. External tables can be MaxCompute or OSS tables.

Column‑store table: optimized for OLAP queries.

Row‑store table: for key/value lookups and Flink dimension tables.

Hybrid table: supports both point lookups and OLAP.

External table: MaxCompute or OSS source.
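
The storage layout of an internal table is chosen at creation time. The sketch below renders the DDL for the three internal table types, assuming the CALL set_table_property(...) form and the orientation values 'column', 'row', and 'row,column'; verify both against the documentation for your Hologres version. The table and column names are hypothetical.

```python
from typing import Dict

# Orientation values assumed from the set_table_property convention.
ORIENTATION = {
    "column": "column",      # column-store, suited to OLAP scans
    "row": "row",            # row-store, suited to key/value lookups
    "hybrid": "row,column",  # both layouts kept for the same table
}


def render_ddl(table: str, columns: Dict[str, str], primary_key: str, kind: str) -> str:
    """Render CREATE TABLE plus the orientation property as one transaction."""
    cols = ",\n".join(f"    {name} {sqltype}" for name, sqltype in columns.items())
    return (
        "BEGIN;\n"
        f"CREATE TABLE {table} (\n{cols},\n    PRIMARY KEY ({primary_key})\n);\n"
        f"CALL set_table_property('{table}', 'orientation', '{ORIENTATION[kind]}');\n"
        "COMMIT;"
    )


# Hypothetical row-store table for point lookups by order key.
ddl = render_ddl(
    "orders_kv",
    {"o_orderkey": "bigint", "o_status": "text"},
    "o_orderkey",
    "row",
)
print(ddl)
```

Passing "column" or "hybrid" as the last argument renders the other two layouts. The printed statements can then be run from psql or HoloWeb.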

Indexes

Hologres supports various indexes such as Distribution Key, Clustering Key, Bitmap, and Event Time Column. Choosing the right index reduces I/O and improves query speed.

| Index | Applicable Scenario | Example Query |
| --- | --- | --- |
| Distribution Key | Frequent GROUP BY or join columns | select * from tbl1 join tbl2 on tbl1.a=tbl2.c; |
| Clustering Key | Range or filter queries (max two columns) | select sum(a) from tb1 where a > 100 and a < 200; |
| Bitmap | Equality queries | select * from tb1 where a = 100; |
| Event Time Column | Time‑series data | select sum(a) from tb1 where ts > '2020-01-01' and ts < '2020-03-02'; |
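
To see why a distribution key helps the join on tbl1.a = tbl2.c, consider a toy model in which each row is placed on a shard by hashing its distribution‑key value. This is only an illustration: the CRC32 hash and the shard count of 4 are assumptions, not Hologres's actual placement function.

```python
import zlib
from collections import defaultdict


def shard_of(key, shard_count: int) -> int:
    """Toy placement rule: stable hash of the key modulo the shard count."""
    return zlib.crc32(str(key).encode("utf-8")) % shard_count


def place(rows, key_column: str, shard_count: int):
    """Group rows by the shard their distribution-key value maps to."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_of(row[key_column], shard_count)].append(row)
    return shards


tbl1 = [{"a": i, "x": f"left-{i}"} for i in range(8)]
tbl2 = [{"c": i, "y": f"right-{i}"} for i in range(8)]

left = place(tbl1, "a", shard_count=4)
right = place(tbl2, "c", shard_count=4)

# Equal key values hash to the same shard, so each shard can join
# its own rows without moving data between shards.
joined = [
    (lrow["x"], rrow["y"])
    for shard, lrows in left.items()
    for lrow in lrows
    for rrow in right.get(shard, [])
    if lrow["a"] == rrow["c"]
]
print(len(joined))
```

All eight matching pairs are found by shard‑local joins alone. If the two tables were distributed on unrelated columns, matching rows could sit on different shards and would have to be shuffled first.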

Data Import Modes

Data can be imported by copying a local file, through MaxCompute external tables, or by other supported methods. Upload time depends on the network path: a VPC connection offers higher bandwidth than the public internet, so the same dataset loads faster over VPC.
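
A rough upload‑time estimate is simply the payload size divided by the effective bandwidth. The sketch below makes that arithmetic explicit; the two link speeds and the 100 GB dataset size are hypothetical examples, not measurements from this test.

```python
def estimate_upload_seconds(size_gb: float, bandwidth_mbit_s: float,
                            efficiency: float = 1.0) -> float:
    """Transfer time = payload bits / effective bits per second.

    size_gb uses decimal gigabytes (1 GB = 10**9 bytes).
    efficiency (0 < e <= 1) discounts protocol overhead and is an assumption.
    """
    if bandwidth_mbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    bits = size_gb * 1e9 * 8
    return bits / (bandwidth_mbit_s * 1e6 * efficiency)


# Hypothetical links, not measured values.
for label, mbit in [("public internet, 100 Mbit/s", 100), ("VPC, 1000 Mbit/s", 1000)]:
    minutes = estimate_upload_seconds(100, mbit) / 60
    print(f"{label}: about {minutes:.1f} min for 100 GB")
```

Real transfers also depend on disk throughput and on how fast the target can ingest, so treat the result as a lower bound.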

Test Environment Preparation

Create an ECS instance (e.g., ecs.g6.4xlarge, Alibaba Cloud Linux 3.2104 LTS, ESSD cloud disk) in the same region, VPC, and zone as the Hologres instance. Then create the Hologres instance and a test database.

Testing Tools

Install JDK and the psql client on the ECS instance. Download holo-e2e-performance-tool for KV write/update tests and hologres_benchmark_for_tpch for TPC‑H benchmarks, then transfer them to the ECS.

OLAP Test

Unzip the benchmark package and edit group_vars with the VPC endpoint, AccessKey (AK), port, and data directories. Run the TPC‑H script to generate data and create the database, then execute run_tpch.sh with the query argument to run random queries. The results, including query times, are saved locally.
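
Once the per‑query times are available, a few summary numbers make runs easier to compare. The layout of the tool's result file is not described here, so this sketch starts from a plain dictionary of query name to elapsed seconds; the sample timings are hypothetical.

```python
import math


def summarize(timings_s: dict) -> dict:
    """Summarize per-query elapsed times (seconds) from one benchmark run."""
    values = list(timings_s.values())
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one positive timing")
    return {
        "queries": len(values),
        "total_s": sum(values),
        # Geometric mean: less dominated by a single slow query than the total.
        "geomean_s": math.exp(sum(math.log(v) for v in values) / len(values)),
        "slowest": max(timings_s, key=timings_s.get),
    }


# Hypothetical timings for three of the 22 TPC-H queries.
sample = {"Q1": 2.0, "Q6": 0.5, "Q18": 4.0}
print(summarize(sample))
```

Reporting both the total and the geometric mean is a common choice: the total reflects end‑to‑end runtime, while the geometric mean shows whether improvements are spread across queries.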

KV Query Test

Upload the official tool package to the ECS instance, install the JDK, configure the connection details, and use fixed copy to load data. The results (start and end time, row count, QPS, latency, and P99) are stored under the result directory.
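
The reported figures relate to each other in a simple way, which the sketch below reproduces from raw latency samples. It is an independent illustration, not the tool's own code: the nearest‑rank percentile method and the sample data are assumptions.

```python
def kv_stats(latencies_ms, start_s: float, end_s: float) -> dict:
    """Derive row count, QPS, mean latency, and nearest-rank P99."""
    if not latencies_ms or end_s <= start_s:
        raise ValueError("need samples and a positive test duration")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    rank = (99 * n + 99) // 100  # ceil(0.99 * n) in integer arithmetic, 1-based
    return {
        "rows": n,
        "qps": n / (end_s - start_s),
        "avg_ms": sum(ordered) / n,
        "p99_ms": ordered[rank - 1],
    }


# Hypothetical run: 100 lookups in 2 seconds, three of them slow.
samples = [1.0] * 97 + [5.0, 20.0, 50.0]
print(kv_stats(samples, start_s=0.0, end_s=2.0))
```

In this sample the mean latency stays under 2 ms while the P99 is 20 ms, which is why tail latency is reported separately for serving workloads.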

Backend Monitoring

During stress testing, watch metrics such as latency and QPS on the instance's monitoring page in the console.

Conclusion

The demo covered two scenarios: OLAP performance testing and point‑lookup (KV) testing. For more details, visit the Hologres official website.
