What Is HTAP? Exploring Architecture Types, Key Technologies, and China’s Leading Databases
This article provides a comprehensive overview of Hybrid Transactional/Analytical Processing (HTAP) databases, detailing the four architectural patterns, five core technologies, and a comparative analysis of major Chinese HTAP products such as TiDB, OceanBase, PolarDB‑X, TDSQL, GaussDB and GreatSQL, while also discussing their characteristics and remaining challenges.
HTAP Definition
Hybrid Transactional/Analytical Processing (HTAP) is an emerging database architecture that removes the wall between OLTP (online transaction processing) and OLAP (online analytical processing), enabling real‑time decision making across transactional and analytical workloads.
Gartner introduced the term in 2014, describing HTAP as a unified system that supports both OLTP and OLAP workloads simultaneously.
Traditional OLTP vs OLAP
OLTP systems handle high‑concurrency, low‑latency transactions that each touch a small amount of data (e.g., financial transactions), while OLAP systems run large analytical queries over massive datasets at lower concurrency. Historically, these workloads ran on separate databases connected by ETL pipelines, which left analytical data stale and added operational complexity.
HTAP Architecture Types
Four primary architectural patterns have been identified in the literature:
Primary Row Store + In‑Memory Column Store
Distributed Row Store + Column Store Replica
Disk Row Store + Distributed Column Store
Primary Column Store + Delta Row Store
Examples:
Primary Row + In‑Memory Column: Oracle's in‑memory dual‑format database and SQL Server, which pairs its Hekaton in‑memory row engine with columnstore indexes, keep a row format for transactions and a compressed in‑memory column format for analytics.
Distributed Row + Column Replica: TiDB uses a Raft‑based row store (TiKV) and a columnar replica (TiFlash) that receives the log asynchronously.
Disk Row + Distributed Column: MySQL HeatWave couples a traditional row‑store MySQL instance with a distributed columnar cluster (HeatWave) for real‑time analytics.
Primary Column + Delta Row: SAP HANA stores the master dataset in a column store, appends updates to a delta row store, and periodically merges the delta into the column store.
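The Distributed Row + Column Replica pattern can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class names RowStore and ColumnReplica are hypothetical, the schema is fixed, and a plain Python list stands in for a replicated log. It is not how TiKV or TiFlash are implemented; it only shows why an asynchronously fed column replica can lag behind the row store.

```python
from dataclasses import dataclass, field


@dataclass
class RowStore:
    """Primary row store: serves OLTP writes and logs every change."""
    rows: dict = field(default_factory=dict)  # key -> row (dict of column values)
    log: list = field(default_factory=list)   # ordered (key, row) change records

    def upsert(self, key, row):
        self.rows[key] = row
        self.log.append((key, dict(row)))


@dataclass
class ColumnReplica:
    """Columnar replica that applies the row store's log asynchronously.

    Assumes every row carries the same set of columns (fixed schema).
    """
    columns: dict = field(default_factory=dict)   # column name -> list of values
    position: dict = field(default_factory=dict)  # key -> index into the lists
    applied: int = 0                              # log records applied so far

    def catch_up(self, log, max_records=None):
        """Apply pending log records, optionally only the next max_records."""
        end = len(log) if max_records is None else min(len(log), self.applied + max_records)
        for key, row in log[self.applied:end]:
            if key in self.position:              # update in place
                idx = self.position[key]
                for name, value in row.items():
                    self.columns[name][idx] = value
            else:                                 # append a new entry to each column
                self.position[key] = len(self.position)
                for name, value in row.items():
                    self.columns.setdefault(name, []).append(value)
        self.applied = end

    def lag(self, log):
        """Number of log records the replica has not applied yet."""
        return len(log) - self.applied


store, replica = RowStore(), ColumnReplica()
store.upsert(1, {"amount": 10})
store.upsert(2, {"amount": 20})
replica.catch_up(store.log, max_records=1)      # replica is one record behind
stale_total = sum(replica.columns["amount"])    # 10: second insert not visible yet
store.upsert(1, {"amount": 15})
replica.catch_up(store.log)                     # replica is now fully caught up
fresh_total = sum(replica.columns["amount"])    # 35
```

The gap between stale_total and fresh_total is the data-freshness cost of this pattern: analytical reads are isolated from the write path, but only see what the replica has applied.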
Key HTAP Technologies
Five technology categories underpin HTAP implementations:
Transaction Processing
Analytical Processing
Data Synchronization
Query Optimization
Resource Scheduling
Transaction Processing includes MVCC‑based logging, two‑phase commit (2PC), and Raft‑based distributed transaction protocols. Products such as Oracle, SQL Server, SAP HANA, TiDB and F1 Lightning illustrate these approaches.
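To make the MVCC idea concrete, here is a minimal sketch of snapshot reads over a multi‑version store. The MVCCStore class and its logical clock are illustrative assumptions, not the design of any product named above; real systems add write conflict detection, garbage collection of old versions, and a distributed timestamp source.

```python
class MVCCStore:
    """Toy multi-version store: each committed write adds a new version."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value) in commit order
        self.clock = 0      # logical timestamp (assumption: single node)

    def commit(self, writes):
        """Atomically commit a dict of writes; return the commit timestamp."""
        self.clock += 1
        for key, value in writes.items():
            self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def snapshot(self):
        """Return a read timestamp; later commits stay invisible to it."""
        return self.clock

    def read(self, key, read_ts):
        """Return the newest value committed at or before read_ts, else None."""
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= read_ts:
                return value
        return None


db = MVCCStore()
db.commit({"alice": 100, "bob": 50})
report_ts = db.snapshot()              # a long analytical query starts here
db.commit({"alice": 70, "bob": 80})    # a concurrent transfer commits
old_view = (db.read("alice", report_ts), db.read("bob", report_ts))  # (100, 50)
new_view = (db.read("alice", db.snapshot()), db.read("bob", db.snapshot()))  # (70, 80)
```

The analytical reader keeps a consistent view at report_ts without blocking the writer, which is the property that lets OLTP and OLAP share one dataset.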
Analytical Processing relies on columnar storage, SIMD‑accelerated aggregation, and three scan strategies: (1) memory‑incremental + independent column scan, (2) log‑based incremental + distributed column scan, and (3) pure column scan.
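The benefit of columnar storage for aggregation can be sketched in a few lines. The table contents and function names below are made up for illustration, and Python's array module stands in for a contiguous typed column; SIMD acceleration itself is not shown, only the layout that makes it possible.

```python
from array import array

# Row layout: one tuple per record (order_id, region, amount).
rows = [
    (1, "east", 120.0),
    (2, "west", 80.0),
    (3, "east", 200.0),
]

# Column layout: one sequence per attribute; numeric columns are typed arrays.
columns = {
    "order_id": array("q", (r[0] for r in rows)),
    "region": [r[1] for r in rows],
    "amount": array("d", (r[2] for r in rows)),
}


def sum_amount_row_store(rows):
    """Walks every full row even though only one field is needed."""
    return sum(row[2] for row in rows)


def sum_amount_column_store(columns):
    """Touches only the 'amount' column, stored contiguously."""
    return sum(columns["amount"])


def sum_amount_by_region(columns):
    """Grouped aggregate that reads two columns and ignores the rest."""
    totals = {}
    for region, amount in zip(columns["region"], columns["amount"]):
        totals[region] = totals.get(region, 0.0) + amount
    return totals
```

Both layouts return the same total; the difference is how much data the scan has to read, which grows with the number of columns the query does not need.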
Data Synchronization techniques ensure that incremental updates become visible to the analytical engine: in‑memory delta merge, disk‑based delta merge, and rebuild from the primary row store.
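An in‑memory delta merge can be sketched as follows. DeltaMergeTable is a hypothetical class, and the sketch assumes stored values are never None so that None can mark a delete; production engines merge in the background with far more careful concurrency control.

```python
class DeltaMergeTable:
    """Read-optimized main store plus a write-optimized delta."""

    def __init__(self):
        self.main_keys = []    # sorted keys of the main (column) store
        self.main_values = []  # values aligned with main_keys
        self.delta = {}        # key -> value; None marks a delete

    def write(self, key, value):
        self.delta[key] = value

    def delete(self, key):
        self.delta[key] = None

    def scan(self):
        """Return sorted (key, value) pairs: main overlaid with the delta."""
        merged = dict(zip(self.main_keys, self.main_values))
        merged.update(self.delta)  # newer delta entries win
        return sorted((k, v) for k, v in merged.items() if v is not None)

    def merge(self):
        """Fold the delta into the main store and empty the delta."""
        live = self.scan()
        self.main_keys = [k for k, _ in live]
        self.main_values = [v for _, v in live]
        self.delta.clear()


table = DeltaMergeTable()
table.write(2, 20)
table.write(1, 10)
table.merge()              # main now holds keys 1 and 2
table.write(2, 25)         # update lands in the delta
table.delete(1)            # delete is recorded as a tombstone
table.write(3, 30)
before_merge = table.scan()
table.merge()
after_merge = table.scan()
```

Scans see fresh data before the merge because they overlay the delta; the merge only moves that data into the read‑optimized layout.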
Query Optimization covers memory column selection, hybrid row/column scanning, and CPU/GPU acceleration for mixed workloads.
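A rough sketch of how an optimizer might choose between a row path and a column path is shown below. The cost model is a deliberate oversimplification assumed for illustration (cost is counted as cells read), and choose_access_path is a hypothetical function, not any product's optimizer API.

```python
def choose_access_path(total_rows, matching_rows, columns_needed,
                       total_columns, has_index):
    """Pick the cheaper of a row-store path and a column-store scan.

    Cost is approximated as the number of cells read (toy assumption).
    """
    if has_index:
        # Index lookup fetches only matching rows, but every column of each.
        row_cost = matching_rows * total_columns
    else:
        # Without an index the row store must scan the whole table.
        row_cost = total_rows * total_columns
    # A column scan reads every row, but only the columns the query needs.
    column_cost = total_rows * columns_needed
    if row_cost <= column_cost:
        return "row", row_cost
    return "column", column_cost


# Point lookup on an indexed key that returns a full row.
point = choose_access_path(1_000_000, 1, 10, 10, has_index=True)
# Aggregate over two columns touching most of the table.
report = choose_access_path(1_000_000, 900_000, 2, 10, has_index=True)
```

Under these assumptions the point lookup goes to the row store and the wide aggregate goes to the column store, which is the basic decision behind hybrid row/column scanning.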
Resource Scheduling balances OLTP and OLAP resource usage via workload‑aware thread allocation or freshness‑aware mode switching.
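Both scheduling styles can be sketched with two small functions. Everything here is an assumption made for illustration: the function names, the proportional split by queue length, the reserved minimum per workload class, and the mode labels "isolated" and "sync_first".

```python
def allocate_threads(total_threads, oltp_pending, olap_pending, min_per_class=1):
    """Workload-aware split: share threads in proportion to queue length,
    while reserving at least min_per_class threads for each workload."""
    if total_threads < 2 * min_per_class:
        raise ValueError("not enough threads to reserve the minimum for both classes")
    pending = oltp_pending + olap_pending
    share = 0.5 if pending == 0 else oltp_pending / pending
    oltp = round(total_threads * share)
    # Clamp so neither class is starved.
    oltp = max(min_per_class, min(total_threads - min_per_class, oltp))
    return oltp, total_threads - oltp


def choose_mode(replica_lag_records, max_lag_records):
    """Freshness-aware switch: run workloads in isolation while the
    analytical copy is fresh enough, otherwise prioritize synchronization."""
    return "sync_first" if replica_lag_records > max_lag_records else "isolated"


busy_oltp = allocate_threads(16, oltp_pending=30, olap_pending=10)  # (12, 4)
olap_only = allocate_threads(16, oltp_pending=0, olap_pending=100)  # (1, 15)
mode = choose_mode(replica_lag_records=500, max_lag_records=100)    # "sync_first"
```

The reserved minimum keeps a burst of analytical queries from starving transactions, and the lag threshold makes the freshness target an explicit, tunable parameter.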
Chinese HTAP Products Overview
TiDB – A distributed HTAP database from PingCAP. The row store (TiKV) replicates data to the column store (TiFlash) through the Raft log. HTAP capability was introduced in V1.0 (TiSpark) and fully realized in V3.1 (TiFlash).
OceanBase – An HTAP system from Ant Group (originally developed within Alibaba), positioned as HTAP since version 3.2 (2021), with a TPC‑H record of 15.26 million QphH@30,000GB. V4.3 (2024) adds a native column store, materialized views, and data‑import features.
PolarDB‑X – Alibaba Cloud’s cloud‑native HTAP product. Introduced an HTAP‑aware CBO optimizer in 2021 and a Columnar engine with clustered columnar index (CCI) in version 2.4.0 (2024), enabling simultaneous row‑store and column‑store access.
TDSQL – Tencent Cloud’s HTAP offering. Uses an MPP‑based columnar engine built on an LSM‑Tree structure, providing high compression and append‑only updates for low‑latency writes.
GaussDB for MySQL – Huawei’s cloud‑native HTAP database built on a ClickHouse‑derived columnar engine with SIMD acceleration. Transaction logs from the primary node are synchronized to HTAP nodes.
GreatSQL – A MySQL‑compatible open‑source database that adds a secondary “Rapid” engine for analytical processing, leveraging InnoDB for TP and Rapid for AP, with optional master‑slave or MGR replication for read/write separation.
Other notable Chinese HTAP products, not covered in detail here, include StoneDB, AntDB, KunDB, GoldenDB, and DM8.
Characteristics and Challenges
Most HTAP products evolve from OLTP‑first designs; the common architectural pattern combines row and column stores.
Resource isolation varies: OceanBase keeps row and column data inside the same storage engine, while most others keep row and column stores physically separate.
Data freshness trade‑offs differ: TiDB relies on Raft‑based log replication, TDSQL/PolarDB‑X/GreatSQL use log‑based async replication, whereas Oracle, SQL Server, and SAP HANA merge in‑memory deltas into the column store.
Key challenges include transparent routing of TP vs. AP workloads, achieving strong data freshness without sacrificing latency, and minimizing interference between OLTP and OLAP workloads.
References
https://en.wikipedia.org/wiki/Hybrid_transactional/analytical_processing
https://dl.acm.org/doi/abs/10.1145/3514221.3522565
https://docs.pingcap.com/zh/tidb/stable/release-1.0-ga
https://docs.pingcap.com/zh/tidb/stable/release-3.1.0-beta.2
https://www.vldb.org/pvldb/vol13/p3072-huang.pdf
