
How AnalyticDB MySQL 3.0 Shattered TPC‑DS Records and Redefined Cloud‑Native Data Warehousing

This article provides a comprehensive analysis of Alibaba Cloud's AnalyticDB MySQL 3.0, detailing its cloud‑native architecture, storage and query innovations, the record‑breaking TPC‑DS benchmark results, and future directions for large‑scale, cost‑effective data warehousing.

Alibaba Cloud Developer

1 AnalyticDB Overview

AnalyticDB (ADB) is Alibaba's self‑developed, petabyte‑scale real‑time data warehouse. First released in 2012, it has gone through nearly a hundred iterations and has been offered as a cloud service since 2014. It serves online analytical workloads in e‑commerce, advertising, logistics, entertainment, tourism, risk control, and many other domains.

2 TPC‑DS Benchmark Introduction

The Transaction Processing Performance Council (TPC) defines the TPC‑DS benchmark to evaluate data‑warehouse performance. It covers data loading, single‑user and concurrent multi‑user query performance, complex SQL (star and snowflake schemas, window functions), and availability aspects such as data consistency and fault tolerance. It is widely regarded as one of the most rigorous measures of data‑warehouse maturity.

AnalyticDB MySQL 3.0 took part in the TPC‑DS test and delivered 29% higher performance than the previous world record at only one‑third of its cost, making it the top‑performing data warehouse on this benchmark.

3 AnalyticDB MySQL 3.0 Technical Architecture

The system follows a cloud‑native design with compute‑storage separation and hot‑cold data segregation, supporting high‑throughput real‑time writes and strong consistency, as well as mixed workloads of high‑concurrency queries and large‑scale batch processing.

It consists of three layers:

Access Layer: Multi‑master coordinator nodes handle protocol access, SQL parsing and optimization, sharding for real‑time writes, and query scheduling.

Compute Engine: A distributed MPP + DAG execution engine with an intelligent optimizer provides high‑concurrency and complex SQL support, leveraging elastic cloud resources for minute‑level scaling.

Storage Engine: A Raft‑based, strongly consistent distributed storage engine uses data sharding with Multi‑Raft for parallelism, tiered storage for hot‑cold separation, and row‑column hybrid storage with smart indexing.
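
To make the access layer's "sharding for real‑time writes" concrete, here is a minimal sketch of how a coordinator might route incoming rows to storage shards. The source does not say which sharding function AnalyticDB uses; hash partitioning on a distribution key is an assumption, and `shard_for`, `route_writes`, and the `order_id` column are hypothetical names chosen for illustration.

```python
import zlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a distribution-key value to a shard with a stable hash."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    return zlib.crc32(key.encode("utf-8")) % num_shards


def route_writes(rows, key_column, num_shards):
    """Group rows by target shard so each batch goes to a single shard."""
    batches = {}
    for row in rows:
        shard = shard_for(str(row[key_column]), num_shards)
        batches.setdefault(shard, []).append(row)
    return batches


# Hypothetical rows; in a real system each batch would be sent to the
# replication group that owns the shard.
rows = [{"order_id": i, "amount": 10 * i} for i in range(8)]
batches = route_writes(rows, "order_id", num_shards=4)
for shard, batch in sorted(batches.items()):
    print(shard, [r["order_id"] for r in batch])
```

Because the hash is deterministic, every coordinator in a multi‑master setup would send the same key to the same shard without coordinating with its peers.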

AnalyticDB architecture diagram

4 AnalyticDB Storage Technology

4.1 Strongly Consistent Distributed Storage

AnalyticDB MySQL 3.0 implements a lightweight Raft‑based storage layer that sustains high‑throughput real‑time writes. It is positioned ahead of open‑source systems such as HBase, Kudu, Elasticsearch, and ClickHouse in both analytical performance and ACID guarantees.
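
The core rule that gives a Raft‑based store its strong consistency is that a write counts as committed only once a majority of replicas have stored it. The sketch below shows that rule in isolation. It is a simplification of standard Raft, not AnalyticDB's implementation: it omits the requirement that the entry belong to the leader's current term, and `commit_index` is a hypothetical helper name.

```python
def commit_index(match_index):
    """Return the highest log index stored on a majority of replicas.

    match_index[i] is the last log index known to be stored on replica i;
    the leader includes its own last index in the list.
    """
    ordered = sorted(match_index, reverse=True)
    majority = len(match_index) // 2 + 1
    # The majority-th largest value is present on at least `majority` replicas.
    return ordered[majority - 1]


# Three replicas: index 5 is on two of them, index 7 only on the leader.
print(commit_index([7, 5, 3]))        # 5
# Five replicas: only indexes up to 3 are on at least three replicas.
print(commit_index([7, 7, 3, 2, 1]))  # 3
```

With Multi‑Raft, each data shard would run its own replication group, so this calculation happens independently and in parallel per shard.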

Storage architecture diagram

4.2 High‑Performance Bulk Import

AnalyticDB adopts a lightweight "build" process that converts real‑time data into full‑partition data using an in‑memory single‑copy local build, drastically reducing DFS read/write overhead. Additional optimizations include DirectIO, binary streaming, asynchronous pipelines, zero‑copy transfers, and LZ4 compression, achieving over 50 million rows/second on 18 nodes.
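
For a sense of scale, the quoted figure can be broken down per node. This is plain arithmetic on the numbers above; since the source says "over 50 million", the results are lower bounds, and the one‑billion‑row load is a hypothetical example rather than a measured result.

```python
total_rows_per_second = 50_000_000  # "over 50 million rows/second" (lower bound)
nodes = 18

per_node = total_rows_per_second / nodes
print(f"{per_node:,.0f} rows/s per node")  # 2,777,778 rows/s per node

# Hypothetical load size, used only to illustrate the rate.
one_billion = 1_000_000_000
seconds_for_billion = one_billion / total_rows_per_second
print(f"{seconds_for_billion:.0f} s to import one billion rows")  # 20 s
```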

Bulk import performance chart

4.3 High‑Throughput Real‑Time DML

Built on Raft, AnalyticDB supports real‑time updates on the order of a million TPS with linearizable consistency, using asynchronous pipelines, zero‑copy transfers, and efficient encoding. The storage engine combines Delta (real‑time) data and Main (partitioned) data under MVCC with snapshot isolation, preserving ACID properties even when nodes fail.
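
The interplay of Delta data, Main data, and snapshot isolation can be illustrated with a toy in‑memory table. This is a minimal sketch under simplifying assumptions (a single node, one write per version, no garbage collection), not the engine's actual data structures; `VersionedTable` and its methods are hypothetical names.

```python
class VersionedTable:
    """Toy table: a Main image plus an append-only Delta of versioned writes."""

    def __init__(self, main):
        self.main = dict(main)  # key -> value, the state as of version 0
        self.delta = []         # (version, key, value), ordered by version
        self.version = 0        # last committed version

    def write(self, key, value):
        """Commit one upsert and return the version it was assigned."""
        self.version += 1
        self.delta.append((self.version, key, value))
        return self.version

    def snapshot(self):
        """A reader pins the latest committed version."""
        return self.version

    def read(self, key, snapshot):
        """Return the value visible at `snapshot`, ignoring later writes."""
        value = self.main.get(key)
        for version, k, v in self.delta:
            if version > snapshot:
                break  # everything after this point is newer than the snapshot
            if k == key:
                value = v
        return value


t = VersionedTable({"a": 1})
snap = t.snapshot()                           # reader starts here
t.write("a", 2)                               # concurrent writes land in Delta
t.write("b", 3)
print(t.read("a", snap), t.read("b", snap))   # 1 None
now = t.snapshot()
print(t.read("a", now), t.read("b", now))     # 2 3
```

The reader that pinned `snap` keeps seeing a consistent view even though newer writes have been committed, which is the behavior snapshot isolation is meant to provide.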

4.4 Row‑Column Hybrid Storage and Smart Indexes

The proprietary row‑column hybrid format stores each table in a single file divided into RowGroups and column Blocks, enabling both efficient random reads and vectorized scans. Smart indexes (inverted, bitmap, KD‑Tree, JSON, vector) are created automatically and pushed down dynamically during query execution.
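
A small sketch shows why a RowGroup and column Block layout serves both access patterns. The in‑memory representation here (a list of dicts mapping column name to a block of values) and the function names are assumptions for illustration; the real on‑disk format is not described in the source.

```python
def to_row_groups(rows, columns, group_size):
    """Split rows into RowGroups; each group holds one block per column."""
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        groups.append({col: [r[col] for r in chunk] for col in columns})
    return groups


def read_row(groups, row_id, group_size):
    """Random read: locate the RowGroup, then take one offset from each block."""
    group, offset = divmod(row_id, group_size)
    return {col: block[offset] for col, block in groups[group].items()}


def scan_column(groups, column):
    """Column scan: touch only the blocks of the requested column."""
    for group in groups:
        yield from group[column]


rows = [{"id": i, "city": f"c{i % 3}", "amount": i * 1.5} for i in range(10)]
groups = to_row_groups(rows, ["id", "city", "amount"], group_size=4)
print(read_row(groups, 6, group_size=4))    # {'id': 6, 'city': 'c0', 'amount': 9.0}
print(sum(scan_column(groups, "amount")))   # 67.5
```

A point lookup touches one RowGroup, while an aggregate over `amount` reads contiguous blocks of a single column, which is the shape vectorized execution prefers.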

Hybrid storage and index diagram

5 AnalyticDB Query Technology

The query engine consists of a cost‑based optimizer (CBO) and an execution engine. The optimizer uses a Cascades‑based search framework, distributed parallel planning, accurate cost estimation, and comprehensive statistics collection to generate optimal plans for complex analytical workloads.
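
To illustrate what cost‑based planning means in practice, the sketch below picks a join order from table statistics. It is deliberately not a Cascades optimizer: it enumerates every left‑deep order by brute force and uses a crude cost model (the sum of intermediate result sizes). The table names come from the TPC‑DS schema, but the row counts and selectivities are made‑up values for illustration.

```python
from itertools import permutations


def plan_cost(order, rows, selectivity):
    """Cost of a left-deep join order: the sum of intermediate result sizes."""
    joined = [order[0]]
    size = rows[order[0]]
    cost = 0.0
    for table in order[1:]:
        # Combine all join predicates between the new table and those already
        # joined; with no predicate the join is a cross product (selectivity 1).
        sel = 1.0
        for other in joined:
            sel *= selectivity.get(frozenset((table, other)), 1.0)
        size = size * rows[table] * sel
        cost += size
        joined.append(table)
    return cost


def best_order(rows, selectivity):
    """Exhaustively pick the cheapest left-deep join order."""
    return min(permutations(rows), key=lambda o: plan_cost(o, rows, selectivity))


# Hypothetical statistics (not measured values).
rows = {"store_sales": 1_000_000, "date_dim": 1_000, "item": 10_000}
selectivity = {
    frozenset(("store_sales", "date_dim")): 1e-5,
    frozenset(("store_sales", "item")): 1e-4,
}
best = best_order(rows, selectivity)
print(best, plan_cost(best, rows, selectivity))
```

With these numbers the cheapest plans join `store_sales` with `date_dim` first, because that join shrinks the intermediate result the most. A production optimizer explores a far larger plan space with memoization and pruning instead of enumerating every permutation.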

The execution engine combines Just‑In‑Time (JIT) compilation and vectorization, offering a hybrid model that adapts to CPU‑cache‑friendly or memory‑intensive tasks. Unified memory management, binary‑type storage, layered memory pools, and leak detection ensure efficient resource usage.

Advanced techniques such as Dynamic Filter Push‑Down (DFP) and Common Table Expression (CTE) optimization further reduce data scanning and eliminate redundant computations.
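
The idea behind dynamic filter push‑down can be sketched with a hash join: once the small build side is known, its join keys become a runtime filter applied inside the scan of the large probe side. This is a minimal single‑process illustration; using an exact key set as the filter is an assumption (real engines may use other filter representations), and all names and data are hypothetical.

```python
def scan(table, key=None, allowed=None):
    """Scan rows; when a dynamic filter is given, drop non-matching rows early."""
    for row in table:
        if allowed is not None and row[key] not in allowed:
            continue
        yield row


def hash_join_with_dynamic_filter(build_rows, probe_table, key):
    # 1. Build the hash table from the small, already filtered build side.
    hash_table = {}
    for row in build_rows:
        hash_table.setdefault(row[key], []).append(row)
    # 2. Derive a runtime filter from the build keys and push it into the scan.
    dynamic_filter = set(hash_table)
    results, rows_reaching_join = [], 0
    for probe in scan(probe_table, key, dynamic_filter):
        rows_reaching_join += 1
        for build in hash_table[probe[key]]:
            results.append({**build, **probe})
    return results, rows_reaching_join


dates = [{"date_sk": d, "year": 2000 + d // 4} for d in range(12)]
sales = [{"date_sk": i % 12, "amount": i} for i in range(120)]
build = [d for d in dates if d["year"] == 2001]  # keeps date_sk 4..7
joined, reached = hash_join_with_dynamic_filter(build, sales, "date_sk")
print(len(sales), reached, len(joined))  # 120 40 40
```

Only 40 of the 120 fact rows ever reach the join operator; in a distributed engine, rows filtered this early also avoid being read from storage or shuffled across the network.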

CBO optimizer diagram

6 Summary and Outlook

AnalyticDB has been validated by peer‑reviewed research (a VLDB paper), world‑leading TPC‑DS results, and extensive production use across Alibaba and external enterprises. Its cloud‑native design bridges the gap between databases and big data. The record‑breaking TPC‑DS performance is only the beginning of its journey toward becoming foundational infrastructure for digital transformation and for extracting value from data online.
