How ADB PG’s Multi‑Master Architecture Boosts HTAP Performance
This article explains the HTAP concept and describes AnalyticDB PostgreSQL's shift from a single-master to a Multi-Master design. It covers the architecture, distributed transaction handling, global deadlock detection, DDL support, and fault tolerance, and presents benchmark results that demonstrate linear scalability for real-time data-warehouse workloads.
Introduction
Hybrid Transactional/Analytical Processing (HTAP) was defined by Gartner in 2014 as a class of database systems that combine OLTP (transactional) and OLAP (analytical) capabilities. Traditional deployments separate OLTP (e.g., MySQL, PostgreSQL, PolarDB) and OLAP (e.g., ClickHouse, AnalyticDB) workloads, requiring data replication and ETL pipelines.
HTAP Benefits and ADB PG
HTAP reduces system cost by handling both workloads in a single platform, provides full ACID guarantees, and serves as an effective ETL engine. Cloud‑native, storage‑compute separation, and HTAP are recognized as key evolution directions for modern databases.
AnalyticDB PostgreSQL (ADB PG) is Alibaba Cloud's real-time data warehouse that supports standard SQL, is compatible with PostgreSQL/Greenplum and Oracle syntax, and is certified for both distributed analytical and transactional workloads. Since version 6.0 it has offered a Multi-Master architecture to overcome the performance bottleneck of a single master node.
Single‑Master vs Multi‑Master
In a traditional single‑master design, one main master handles client requests, query planning, and transaction coordination, while segment nodes perform computation and storage. This creates CPU/memory contention on the master for high‑concurrency OLTP and HTAP workloads.
The Multi‑Master design adds secondary master nodes that can process DDL/DML like the main master, enabling scale‑out of master resources and preserving high availability through standby masters.
Multi‑Master Architecture
The architecture introduces Main Master, Standby Master, Secondary Masters, GTM (Global Transaction Manager), FTS (Fault‑Tolerance Service), and Catalog components. Load balancing is performed via SLB weight‑based routing.
Main Master: request handling, query optimization, task dispatch, global metadata, and transaction management.
GTM: maintains global transaction IDs and snapshots for distributed transactions.
FTS: monitors node health and performs failover.
Catalog: stores global metadata.
Standby Master: high-availability replica of the main master.
Secondary Master: a "weakened" master that serves client requests directly but relies on the GTM Proxy to obtain global transaction state from the main master.
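To make the SLB weight-based routing concrete, here is a minimal Python sketch of a weighted choice over master endpoints. The hosts, ports, and weights are illustrative assumptions, not ADB PG's actual configuration.

```python
import random

# Hypothetical endpoints: an SLB-style weighted choice spreads client
# connections across the main and secondary masters.
MASTERS = [
    {"host": "main-master", "port": 5432, "weight": 1},  # kept light; it also hosts GTM/FTS
    {"host": "secondary-1", "port": 5432, "weight": 2},
    {"host": "secondary-2", "port": 5432, "weight": 2},
]

def pick_master():
    """Weighted random pick, analogous to SLB weight-based routing."""
    total = sum(m["weight"] for m in MASTERS)
    r = random.uniform(0, total)
    for m in MASTERS:
        r -= m["weight"]
        if r <= 0:
            return m
    return MASTERS[-1]

print(pick_master()["host"])
```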
Key Technologies
Distributed Transaction Management
ADB PG uses a two-phase commit (2PC) protocol combined with distributed snapshots. The coordinating master begins the distributed transaction, obtains a global transaction ID (GXID), and drives the prepare and commit phases across the participating segments. Two optimizations reduce overhead: a one-phase commit (1PC) fast path for transactions that touch only a single segment, and a cache that maps local transaction IDs to global ones.
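The sketch below illustrates this flow with a toy coordinator, including the 1PC fast path for single-segment transactions. The Segment class and its method names are hypothetical stand-ins for the real segment protocol.

```python
# Hypothetical segment stand-in; a real segment runs PREPARE TRANSACTION /
# COMMIT PREPARED under the coordinating master's control.
class Segment:
    def __init__(self, name):
        self.name = name

    def prepare(self, gxid):
        print(f"{self.name}: PREPARE TRANSACTION '{gxid}'")
        return True  # a real segment could refuse and vote to abort

    def commit_prepared(self, gxid):
        print(f"{self.name}: COMMIT PREPARED '{gxid}'")

    def abort_prepared(self, gxid):
        print(f"{self.name}: ROLLBACK PREPARED '{gxid}'")

    def commit_local(self, gxid):
        print(f"{self.name}: COMMIT  -- 1PC fast path, gxid {gxid}")

def commit_distributed(gxid, segments):
    # 1PC optimization: a transaction touching only one segment skips the prepare phase.
    if len(segments) == 1:
        segments[0].commit_local(gxid)
        return True
    # Phase 1: prepare every participant; roll back the prepared ones if any vote fails.
    prepared = []
    for seg in segments:
        if seg.prepare(gxid):
            prepared.append(seg)
        else:
            for p in prepared:
                p.abort_prepared(gxid)
            return False
    # Phase 2: commit the prepared transaction on every participant.
    for seg in segments:
        seg.commit_prepared(gxid)
    return True

commit_distributed(1001, [Segment("seg0"), Segment("seg1")])  # full 2PC
commit_distributed(1002, [Segment("seg2")])                    # 1PC fast path
```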
GTM Interaction Protocol
A set of messages (GET_GXID, SNAPSHOT_GET, TXN_BEGIN, TXN_PREPARE, TXN_COMMIT, etc.) enable secondary masters to obtain global transaction IDs and snapshots. A session‑consistency mode reduces message overhead by using GET_GXID_MULTI for batch GXID allocation.
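The message set can be pictured as a small request/response protocol. The toy GTM below mirrors the message names above and shows why GET_GXID_MULTI cuts round trips; the handler, wire format, and GTM internals are assumptions made for the sketch.

```python
from enum import Enum, auto

class GtmMsg(Enum):
    GET_GXID = auto()
    GET_GXID_MULTI = auto()  # batched allocation used in session-consistency mode
    SNAPSHOT_GET = auto()
    TXN_BEGIN = auto()
    TXN_PREPARE = auto()
    TXN_COMMIT = auto()

class ToyGtm:
    """Toy GTM that hands out monotonically increasing global transaction IDs."""
    def __init__(self):
        self.next_gxid = 1

    def handle(self, msg, count=1):
        if msg is GtmMsg.GET_GXID:
            gxid, self.next_gxid = self.next_gxid, self.next_gxid + 1
            return gxid
        if msg is GtmMsg.GET_GXID_MULTI:
            start = self.next_gxid
            self.next_gxid += count
            return list(range(start, start + count))
        raise NotImplementedError(f"{msg} is not modeled in this sketch")

gtm = ToyGtm()
print(gtm.handle(GtmMsg.GET_GXID))            # one GXID per round trip
print(gtm.handle(GtmMsg.GET_GXID_MULTI, 16))  # 16 GXIDs in a single round trip
```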
GTM Proxy
GTM Proxy runs as a Postmaster sub‑process, aggregates GTM requests from multiple backends, shares snapshots, batches GXID requests, and reduces network overhead.
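The request-folding idea can be sketched as a queue that is flushed with one batched GTM call. The StubGtm class, method names, and queueing details below are simplified assumptions; the real proxy also shares snapshots among backends.

```python
class StubGtm:
    """Stand-in for the GTM, only able to allocate a batch of GXIDs."""
    def __init__(self):
        self.next_gxid = 100

    def get_gxid_multi(self, count):
        start = self.next_gxid
        self.next_gxid += count
        return list(range(start, start + count))

class GtmProxy:
    """Coalesces GXID requests from many backends into one GTM round trip."""
    def __init__(self, gtm):
        self.gtm = gtm
        self.waiting_backends = []

    def request_gxid(self, backend_id):
        self.waiting_backends.append(backend_id)

    def flush(self):
        if not self.waiting_backends:
            return {}
        gxids = self.gtm.get_gxid_multi(len(self.waiting_backends))
        result = dict(zip(self.waiting_backends, gxids))
        self.waiting_backends.clear()
        return result

proxy = GtmProxy(StubGtm())
for backend in ("backend-1", "backend-2", "backend-3"):
    proxy.request_gxid(backend)
print(proxy.flush())  # {'backend-1': 100, 'backend-2': 101, 'backend-3': 102}
```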
Transaction Recovery
Recovery involves replaying xlog on the main master to find prepared transactions, instructing segments to commit or abort, and handling secondary‑master‑initiated transactions by distinguishing GXID‑MasterID pairs.
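A simplified view of the recovery decision, assuming a hypothetical log-record layout: prepared transactions with a matching distributed-commit record are committed, the rest are rolled back, and keys carry the originating master's ID so secondary-master transactions are resolved correctly.

```python
def recover(xlog_records):
    """Decide the fate of each prepared transaction found while replaying the log."""
    prepared, committed = set(), set()
    for rec in xlog_records:
        key = (rec["gxid"], rec["master_id"])  # GXID alone is ambiguous with multiple masters
        if rec["type"] == "PREPARE":
            prepared.add(key)
        elif rec["type"] == "DISTRIBUTED_COMMIT":
            committed.add(key)
    return {key: ("COMMIT PREPARED" if key in committed else "ROLLBACK PREPARED")
            for key in prepared}

log = [
    {"type": "PREPARE",            "gxid": 7, "master_id": 0},
    {"type": "DISTRIBUTED_COMMIT", "gxid": 7, "master_id": 0},
    {"type": "PREPARE",            "gxid": 9, "master_id": 2},  # started on a secondary master
]
print(recover(log))  # {(7, 0): 'COMMIT PREPARED', (9, 2): 'ROLLBACK PREPARED'}
```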
Global Deadlock Detection
A GDD process on the main master periodically collects lock‑wait information from all segments, builds a global wait‑graph, and aborts a victim transaction to break cycles. The same mechanism is extended to secondary masters.
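At its core this is cycle detection over the merged wait-graph. In the sketch below, the edge list, the victim policy (abort the transaction with the highest GXID), and the data shapes are assumptions for illustration only.

```python
def find_cycle(edges):
    """Return one cycle in a wait-graph given (waiter_gxid, holder_gxid) edges."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)

    def dfs(node, path, visited):
        visited.add(node)
        path.append(node)
        for nxt in graph.get(node, ()):
            if nxt in path:
                return path[path.index(nxt):]  # cycle found
            if nxt not in visited:
                cycle = dfs(nxt, path, visited)
                if cycle:
                    return cycle
        path.pop()
        return None

    visited = set()
    for node in list(graph):
        if node not in visited:
            cycle = dfs(node, [], visited)
            if cycle:
                return cycle
    return None

# Lock-wait edges as two segments might report them to the GDD process.
edges = [(101, 105), (105, 101), (103, 101)]
cycle = find_cycle(edges)
if cycle:
    victim = max(cycle)  # simple illustrative policy: abort the youngest transaction
    print(f"global deadlock {cycle}, abort GXID {victim}")
```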
DDL Support
DDL operations are coordinated via 2PC across masters and segments. Secondary masters forward DDL requests to the main master, ensuring catalog synchronization.
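A minimal sketch of that forwarding rule, under assumed function names and participant lists: DDL that arrives at a secondary master is sent to the main master, which then runs 2PC over the other masters and the segments so every catalog copy stays in sync.

```python
def run_ddl(statement, received_on, main_master, all_masters, segments):
    """Illustrative DDL path: forward to the main master, then 2PC to all nodes."""
    if received_on != main_master:
        print(f"{received_on}: forward DDL to {main_master}")
    participants = [m for m in all_masters if m != main_master] + segments
    print(f"{main_master}: PREPARE '{statement}' on {participants}")
    print(f"{main_master}: COMMIT PREPARED on {participants}")

run_ddl("CREATE TABLE t (a int)",
        received_on="secondary-1",
        main_master="main-master",
        all_masters=["main-master", "secondary-1", "secondary-2"],
        segments=["seg0", "seg1"])
```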
Distributed Table Locks
A unified lock protocol defines the order in which locks are acquired across masters and segments, and it supports PostgreSQL's eight table-lock levels (1-8) to preserve PostgreSQL locking semantics.
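One way to picture the protocol is a fixed, global acquisition order paired with those eight lock modes. The node names, the specific ordering rule, and the lock_table_everywhere helper below are hypothetical; they only illustrate why a deterministic order prevents two masters from locking the same table in opposite orders.

```python
LOCK_MODES = {  # PostgreSQL's eight table-lock levels
    1: "AccessShareLock",
    2: "RowShareLock",
    3: "RowExclusiveLock",
    4: "ShareUpdateExclusiveLock",
    5: "ShareLock",
    6: "ShareRowExclusiveLock",
    7: "ExclusiveLock",
    8: "AccessExclusiveLock",
}

def lock_table_everywhere(table, level, coordinating_master, main_master, segments):
    """Acquire the same lock level on all nodes in one deterministic order."""
    assert level in LOCK_MODES
    order = []
    for node in [coordinating_master, main_master] + sorted(segments):
        if node not in order:  # avoid locking twice when the coordinator is the main master
            order.append(node)
    for node in order:
        print(f"{node}: LOCK TABLE {table} IN {LOCK_MODES[level]}")

lock_table_everywhere("orders", 8,
                      coordinating_master="secondary-1",
                      main_master="main-master",
                      segments=["seg1", "seg0"])
```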
Cluster Fault Tolerance and High Availability
Replication (master‑to‑standby, segment‑to‑mirror) and FTS probes monitor node health, perform failover, and keep configuration consistent across masters.
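A toy probe-and-promote pass illustrates the FTS role described above; the node names, probe stub, and promotion message are mocked assumptions rather than FTS internals.

```python
NODES = {
    "main-master": {"replica": "standby-master", "healthy": True},
    "seg0":        {"replica": "seg0-mirror",    "healthy": True},
    "seg1":        {"replica": "seg1-mirror",    "healthy": False},  # simulate a failure
}

def probe(name):
    """Stand-in for an FTS health probe; a real probe contacts the node."""
    return NODES[name]["healthy"]

def fts_tick():
    """One pass of the probe loop; a real FTS repeats this on an interval and
    updates the shared configuration so every master sees the new primary."""
    for name, info in NODES.items():
        if probe(name):
            print(f"FTS: {name} ok")
        else:
            print(f"FTS: {name} failed, promoting {info['replica']}")

fts_tick()
```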
Performance Evaluation
Benchmarks on Alibaba Cloud ECS instances show linear scalability of HTAP workloads when multiple masters are deployed. In TPC‑C and TPC‑B tests, a single master saturates around 64 concurrent connections, while four masters continue to scale linearly with increasing concurrency.
Conclusion
The Multi‑Master extension of ADB PG breaks the single‑master bottleneck, improves connection capacity and read/write performance, and better serves real‑time data‑warehouse and HTAP scenarios.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
