
From Middleware to Distributed Database: The Evolution of PolarDB‑X

PolarDB‑X evolved from Alibaba’s internal sharding middleware TDDL to the DRDS cloud service and finally to a fully distributed MySQL‑compatible database, introducing a full SQL optimizer, an MPP engine, global secondary indexes, strongly consistent distributed transactions via TSO + 2PC, and full binlog compatibility.

dbaplus Community

Background and Early Middleware (2012‑2019)

PolarDB‑X originated from Alibaba’s internal sharding middleware TDDL (2007) and later DRDS (2012‑2019), which combined sharding with a MySQL proxy. The early design aimed to overcome RDS MySQL instance storage limits, lack of write scalability, and cumbersome manual scaling.

RDS MySQL single‑instance storage was limited (early versions only 2 TB).

Share‑Storage solved disk capacity but could not break CPU/memory limits, leaving write scalability unsolved.

Open‑source middleware could address the above issues but made scaling and operations extremely complex.

To meet these demands, a MySQL proxy based on Cobar’s network layer was added on top of TDDL, creating the first DRDS service deployed on Alibaba Cloud.

Cloud‑Native Adoption

DRDS ran on Alibaba Cloud as an ordinary tenant: it held an Alibaba Cloud account and used an AK/SK pair to call OpenAPI for resources such as ECS, SLB, and SLS. Because its infrastructure was managed like any other cloud resource, the development team could focus on product capabilities.

Technical capabilities accumulated during this period include:

Full compatibility with every MySQL built‑in function (see https://github.com/ApsaraDB/galaxysql/tree/main/polardbx-optimizer/src/main/java/com/alibaba/polardbx/optimizer/core/function).

Support for MySQL charset and collation systems, e.g., the utf8mb4_general_ci implementation (https://github.com/ApsaraDB/galaxysql/blob/main/polardbx-common/src/main/java/com/alibaba/polardbx/common/collation/Utf8mb4GeneralCiCollationHandler.java).

Extensive work on type system, sql_mode, time‑zone handling, default values, etc., all carried forward to PolarDB‑X.

SQL Compatibility and Optimizer

Unlike the original TDDL, which merely forwarded simple SQL, DRDS needed a complete SQL engine. Two key components were added:

A full‑featured optimizer with a rich operator set capable of understanding complex SQL semantics.

An execution engine that can correctly execute the chosen plan.

The optimizer must enumerate possible execution plans and select the lowest‑cost one, even when global indexes dramatically increase the plan space.
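The selection step can be pictured with a toy cost model. This is only a sketch under stated assumptions: the plan names, row counts, and cost weights are hypothetical and are not taken from PolarDB‑X. It shows why each extra global index matters to the optimizer: every index adds at least one more candidate access path that has to be costed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    shards_touched: int   # how many shards the plan must visit
    rows_scanned: int     # estimated rows read across those shards


# Hypothetical weights: one network round trip per shard plus per-row work.
NET_COST_PER_SHARD = 100.0
CPU_COST_PER_ROW = 0.01


def cost(plan):
    """Estimated cost of a plan under the toy model above."""
    return (plan.shards_touched * NET_COST_PER_SHARD
            + plan.rows_scanned * CPU_COST_PER_ROW)


def choose(plans):
    """Enumerate every candidate and keep the cheapest one."""
    return min(plans, key=cost)


candidates = [
    Plan("scan all 16 shards of the primary table", 16, 1_000_000),
    Plan("global index lookup, then primary-table fetch", 2, 200),
]
best = choose(candidates)
print(best.name)
```

A real optimizer estimates these numbers from statistics and prunes the search space rather than costing every plan exhaustively.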

Push‑Down Optimization

The guiding principle is to push computation as close to the data as possible. MySQL itself already provides strong computation capabilities (joins, sub‑queries, aggregations). By pushing these operations down to each shard, DRDS achieves higher performance than KV‑based databases that only support filter push‑down.

Push‑down has to be balanced: pushing too much down can produce incorrect results, while pushing too little hurts performance (the original article illustrates this with a comparison table). DRDS’s optimizer has accumulated many push‑down strategies derived from real‑world cases.
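AVG is a convenient way to see the correctness risk. The sketch below assumes two shards held as in-memory lists (hypothetical data, not from the article). Pushing AVG down unchanged and averaging the per-shard results is wrong when shards hold different row counts; pushing down SUM and COUNT and merging them at the coordinator gives the right answer without shipping every row.

```python
shards = [
    [10, 20, 30, 40],   # shard 0: four rows
    [100],              # shard 1: one row
]


def avg_naive_pushdown(shards):
    # Too much push-down: each shard returns AVG(x) and the
    # coordinator averages the averages. This is incorrect.
    per_shard = [sum(rows) / len(rows) for rows in shards if rows]
    return sum(per_shard) / len(per_shard)


def avg_partial_pushdown(shards):
    # Correct push-down: each shard returns (SUM(x), COUNT(x));
    # the coordinator merges the partial aggregates.
    partials = [(sum(rows), len(rows)) for rows in shards]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


def avg_no_pushdown(shards):
    # Correct but slow: every row travels to the coordinator first.
    all_rows = [x for rows in shards for x in rows]
    return sum(all_rows) / len(all_rows)
```

With this data, the naive version returns 62.5 while the other two return 40.0.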

Physical Operators and MPP Engine

DRDS supports a variety of join algorithms, including HybridHashJoin, LookupJoin, NestedLoopJoin, SortMergeJoin, and MaterializedSemiJoin. Initially single‑threaded, the engine evolved to a parallel SMP engine and finally to a full MPP engine. Spill‑to‑disk support enables running TPC‑H at the 1 GB scale on a node with only 15 MB of memory (https://zhuanlan.zhihu.com/p/363435372).
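To illustrate how spilling lets a join run in far less memory than its input, here is a Grace‑style partitioned hash join. It is a simpler relative of a hybrid hash join and is not PolarDB‑X's operator; the function names, the pickle file format, and the partition count are all assumptions made for the sketch.

```python
import os
import pickle
import tempfile


def spill_partitions(rows, key, n_parts, tmpdir, tag):
    """Hash-partition rows by row[key] into n_parts files; return the paths."""
    paths = [os.path.join(tmpdir, f"{tag}-{i}.bin") for i in range(n_parts)]
    files = [open(p, "wb") for p in paths]
    try:
        for row in rows:
            pickle.dump(row, files[hash(row[key]) % n_parts])
    finally:
        for f in files:
            f.close()
    return paths


def read_partition(path):
    """Stream rows back from one spilled partition file."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def grace_hash_join(left, right, key, n_parts=4):
    """Equi-join two row iterables, holding only one partition's
    build side in memory at a time."""
    out = []  # a real engine would stream results instead of collecting them
    with tempfile.TemporaryDirectory() as tmpdir:
        lparts = spill_partitions(left, key, n_parts, tmpdir, "L")
        rparts = spill_partitions(right, key, n_parts, tmpdir, "R")
        for lp, rp in zip(lparts, rparts):
            table = {}
            for row in read_partition(lp):       # build phase
                table.setdefault(row[key], []).append(row)
            for row in read_partition(rp):       # probe phase
                for match in table.get(row[key], []):
                    out.append({**match, **row})
    return out


customers = [{"cid": 1, "name": "a"}, {"cid": 3, "name": "c"}]
orders = [{"cid": 1, "oid": 10}, {"cid": 2, "oid": 11}, {"cid": 1, "oid": 12}]
result = grace_hash_join(customers, orders, "cid")
```

Rows with equal keys always land in the same partition number on both sides, so each pair of partition files can be joined independently.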

Distributed Transaction Challenges

Providing strong consistency without invasive MySQL modifications required a global MVCC (TSO) combined with 2PC (XA). Early attempts using flexible transactions (GTS/TXC), GTM, and native XA showed serious limitations such as poor performance, weak isolation, and incompatibility with MySQL semantics. The TSO‑based design involves three changes:

Introduce a global timestamp generator (TSO) – https://zhuanlan.zhihu.com/p/360160666.

Replace MySQL’s local trx_id with the global timestamp.

Add a commit_timestamp (also from TSO) to determine visibility efficiently.

The transaction flow is illustrated below:

InnoDB record format modifications (the “Lizard” transaction system) are described at https://developer.aliyun.com/article/795058.
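The interplay of the TSO, two‑phase commit, and commit_timestamp visibility can be sketched in a single process. Everything here is a simplified assumption: the class and function names are hypothetical, the TSO is a local counter, and locking, write conflicts, and failure recovery are left out. It is not the Lizard implementation.

```python
import itertools


class TSO:
    """Hypothetical stand-in for a global timestamp oracle."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next(self):
        return next(self._counter)


class Shard:
    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value), committed only
        self.prepared = {}   # txn_id -> {key: value}, staged in phase one

    def prepare(self, txn_id, writes):
        self.prepared[txn_id] = dict(writes)
        return True          # a real participant could vote "no" here

    def commit(self, txn_id, commit_ts):
        for key, value in self.prepared.pop(txn_id).items():
            self.versions.setdefault(key, []).append((commit_ts, value))

    def abort(self, txn_id):
        self.prepared.pop(txn_id, None)

    def read(self, key, snapshot_ts):
        # A version is visible iff its commit_ts <= the reader's snapshot_ts.
        visible = [v for v in self.versions.get(key, []) if v[0] <= snapshot_ts]
        return max(visible, key=lambda v: v[0])[1] if visible else None


def run_transaction(tso, txn_id, shard_writes):
    """Two-phase commit over (shard, writes) pairs with one commit timestamp."""
    # Phase one: every participant stages its writes and votes.
    if not all(shard.prepare(txn_id, w) for shard, w in shard_writes):
        for shard, _ in shard_writes:
            shard.abort(txn_id)
        return None
    # Phase two: one TSO timestamp stamps the whole transaction.
    commit_ts = tso.next()
    for shard, _ in shard_writes:
        shard.commit(txn_id, commit_ts)
    return commit_ts


tso = TSO()
a, b = Shard(), Shard()
snapshot_before = tso.next()
commit_ts = run_transaction(tso, "t1", [(a, {"x": 1}), (b, {"y": 2})])
snapshot_after = tso.next()
```

Because both shards stamp their rows with the same commit timestamp, a reader with an older snapshot sees neither write and a reader with a newer snapshot sees both, which is the cross‑shard consistency the global timestamp is meant to provide.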

Global Secondary Indexes

To hide the partition‑key concept, PolarDB‑X implements global secondary indexes that are strongly consistent and behave like MySQL’s local indexes. An INSERT writes to both the primary table and the global index within a distributed transaction.

INSERT INTO t1 (id, name, addr) VALUES (1, 'meng', 'hz');

Creating a global index uses the same DDL syntax as MySQL and is performed online:

CREATE INDEX idx_seller_id ON orders (seller_id);

Because every index is global, applications no longer need to consider partition keys when creating tables.
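The double write behind the INSERT into t1 can be sketched as follows. This assumes, purely for illustration, that t1 is partitioned by id and carries a global index on name; the routing function, shard count, and data structures are hypothetical, and the two writes are simply performed back to back rather than inside a real distributed transaction.

```python
N_SHARDS = 4


def shard_of(value):
    # Hypothetical routing rule: hash the value to pick a shard.
    return hash(value) % N_SHARDS


primary = [dict() for _ in range(N_SHARDS)]    # partitioned by id: id -> row
idx_name = [dict() for _ in range(N_SHARDS)]   # partitioned by name: name -> ids


def insert(row):
    # In the real system both writes belong to one distributed transaction,
    # so the index can never disagree with the primary table.
    primary[shard_of(row["id"])][row["id"]] = row
    idx_name[shard_of(row["name"])].setdefault(row["name"], set()).add(row["id"])


def find_by_name(name):
    # Visit one index shard, then one primary shard per matching id,
    # instead of scanning every primary shard.
    ids = idx_name[shard_of(name)].get(name, set())
    return [primary[shard_of(i)][i] for i in sorted(ids)]


insert({"id": 1, "name": "meng", "addr": "hz"})
```

The index is partitioned by the indexed column rather than by the table's own key, which is why a lookup on name no longer needs to know how the primary table is sharded.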

Transparent vs Manual Distribution

Transparent mode makes all indexes global, allowing users to ignore partition keys entirely. Manual mode lets users define partition keys for performance‑critical tables. PolarDB‑X uniquely supports both, recommending transparent mode for most workloads and manual tuning for hot paths.

Typical transparent databases (TiDB, CockroachDB) offer low migration cost but may struggle with heavy distributed‑transaction workloads. Manual databases (OceanBase, YugabyteDB) achieve optimal performance with well‑designed partition keys but raise the design burden. PolarDB‑X aims to combine the best of both worlds.

Consensus Protocol and Binlog Compatibility

PolarDB‑X adopts Alibaba’s X‑Paxos, a production‑tested Paxos implementation used in thousands of MySQL clusters, providing zero data loss and high reliability.

The system fully implements the MySQL binlog protocol, enabling any open‑source MySQL CDC tool (e.g., Canal) to consume changes directly.

Future Directions

Reduce global‑index latency through a thin‑MySQL RPC path and by eliminating redundant MySQL‑Server layers.

Enhance HTAP capabilities while keeping strong isolation and reasonable cost.

Provide OSS archiving with SQL‑compatible access (https://zhuanlan.zhihu.com/p/477664175).

Expand multi‑region active‑active support based on extensive internal experience.

Continue open‑source development and keep the code bases of the commercial and community editions aligned.

PolarDB‑X’s roadmap reflects a commitment to remain MySQL‑compatible in both functionality and performance, while advancing distributed‑transaction, global‑index, and cloud‑native capabilities.
