Design and Implementation of TDSQL: A Distributed High‑Consistency MySQL‑Based Storage System

The article describes the evolution, architecture, automatic scaling, sharding, disaster‑recovery mechanisms, and strong synchronous replication strategy of TDSQL, a MySQL‑engine based distributed database built to meet the high‑availability and high‑consistency requirements of Tencent's billing platform.

Art of Distributed System Architecture Design

Jul 5, 2015

Tencent's billing platform manages over 90% of the company's virtual accounts and requires extremely high availability and consistency across disaster scenarios; therefore, the team has focused on building a high‑consistency storage system.

The solution has progressed through three stages and now adopts a MySQL‑engine based distributed SQL system.

Key motivations for moving away from the previous in‑memory NoSQL HOLD platform include limited query capabilities, inefficient use of memory for cold data, and the need for a more powerful, general‑purpose storage system for real‑time payment data.

The new design retains the MySQL protocol, provides automatic cross‑IDC disaster‑recovery with guaranteed transaction durability, offers flexible capacity scaling, and targets OLTP workloads.

Overall Architecture

The system consists of three modules (Scheduler, Agent, and Gateway) that communicate via ZooKeeper. The Scheduler manages sets, DDL distribution, node health monitoring, automatic failover, and resource‑aware scaling. The Agent periodically checks local MySQL instances for read/write health, replication lag, and resource usage, and executes scaling or failover tasks. The Gateway, built on MySQL Proxy, parses SQL, routes requests, performs authentication, records execution metrics, and merges results for aggregation queries.
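As a rough illustration of how Agent health reports could feed the Scheduler's failover decision, here is a minimal sketch. The article does not describe the actual selection rule, so the `NodeReport` fields and the "promote the healthy replica with the least lag" policy are assumptions made for this example, and ZooKeeper is replaced by a plain list of reports.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    """Hypothetical health report an Agent might publish for one MySQL node."""
    node: str
    role: str            # "primary" or "replica"
    healthy: bool        # outcome of the Agent's read/write probe
    lag_seconds: float   # replication lag; 0 for the primary


def pick_new_primary(reports):
    """Return the node to promote, or None when the primary is healthy.

    Assumed policy: promote the healthy replica with the smallest lag.
    """
    primary = next(r for r in reports if r.role == "primary")
    if primary.healthy:
        return None
    candidates = [r for r in reports if r.role == "replica" and r.healthy]
    if not candidates:
        raise RuntimeError("no healthy replica available for promotion")
    return min(candidates, key=lambda r: r.lag_seconds).node
```

With a failed primary and two healthy replicas, the function returns the replica that is closest to the primary's state under this assumed policy.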

Automatic Scaling

Horizontal scaling is preferred; the system shards logical tables into up to 10,000 physical tables (shards) based on a designated shard column. The Gateway extracts the shard value from each query to route it to the appropriate set. If a query contains no shard value, the Gateway sends it to all relevant sets and aggregates the results.
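The routing rule can be sketched as follows. The hash function (CRC32), the set names, and the range‑based routing table are assumptions for illustration; the article only states that a shard value selects one of up to 10,000 shards and that queries without one are fanned out.

```python
import zlib

NUM_SHARDS = 10_000  # upper bound on physical tables given in the text

# Hypothetical routing table: each set owns a contiguous range of shards.
ROUTES = [
    (range(0, 5_000), "set_a"),
    (range(5_000, 10_000), "set_b"),
]


def shard_of(shard_value: str) -> int:
    """Map a shard-column value to a shard number (CRC32 is an assumption)."""
    return zlib.crc32(shard_value.encode("utf-8")) % NUM_SHARDS


def set_for_shard(shard: int) -> str:
    """Look up which set currently owns a shard."""
    for shard_range, set_name in ROUTES:
        if shard in shard_range:
            return set_name
    raise KeyError(shard)


def route(shard_value):
    """Return the list of sets a query must visit."""
    if shard_value is None:
        # No shard value in the query: visit every set and merge the results.
        return sorted({set_name for _, set_name in ROUTES})
    return [set_for_shard(shard_of(shard_value))]
```

Calling `route("user_42")` yields a single set, while `route(None)` yields every set, which is the case where the Gateway has to aggregate.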

The scaling workflow (Strategy 2: move‑then‑switch) first creates the target tables, then uses mysqldump with a WHERE clause to copy the relevant rows and streams the inserts to the new set while tracking binlog positions. Once the copy has caught up, it renames the old table, replays the remaining binlog entries, and finally updates the routing information via ZooKeeper.
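The order of these steps matters, so a small in‑memory simulation may help. Dicts stand in for tables, a list stands in for the binlog, and all function and parameter names are hypothetical; this is a sketch of the sequencing only, not of the real tooling.

```python
def replay(binlog, start, target, belongs):
    """Apply binlog events from index `start` onward; return the new position."""
    for key, value in binlog[start:]:
        if belongs(key):
            target[key] = value
    return len(binlog)


def move_then_switch(source, binlog, routing, shard, new_set, belongs,
                     concurrent_writes):
    target = {}                                   # 1. create the target table
    position = len(binlog)                        # 2. record the binlog position
    # 3. dump only the rows that belong to the migrating shard (the WHERE clause)
    snapshot = {k: v for k, v in source.items() if belongs(k)}
    # Writes keep landing on the old set while the bulk copy is in flight.
    for key, value in concurrent_writes:
        source[key] = value
        binlog.append((key, value))
    target.update(snapshot)                       # 4. stream inserts to the new set
    position = replay(binlog, position, target, belongs)  # 5. catch up from binlog
    # 6. the old table is renamed here, so no further writes can reach it
    position = replay(binlog, position, target, belongs)  # 7. replay the remainder
    routing[shard] = new_set                      # 8. publish the new route
    return target
```

Because the binlog position is recorded before the dump starts, any write that arrives during the copy is replayed afterwards, and the rename bounds the set of events that still need replaying before the route changes.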

Disaster Recovery

Three replication models were evaluated: asynchronous, semi‑synchronous, and various Galera‑based clusters. Due to cross‑IDC latency, semi‑synchronous and Galera showed unacceptable performance loss, so a custom strong‑synchronous solution was implemented.

The strong‑synchronous design splits the SQL execution path in two. In the first half, the worker thread writes the binlog and saves the session, which frees it to serve other requests, while a separate dump thread sends the binlog to the replicas. Replicas acknowledge via UDP, and dedicated threads on the primary commit the transaction once the acknowledgment arrives, eliminating blocking waits.
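The split execution path can be illustrated with a toy model. This is a sketch under stated assumptions: a `queue.Queue` stands in for the UDP acknowledgments, the `Primary` class and its methods are hypothetical, and no real MySQL internals are modeled.

```python
import queue
import threading


class Primary:
    def __init__(self):
        self.binlog = []
        self.pending = {}          # txn id -> saved session state
        self.committed = []
        self.acks = queue.Queue()  # stand-in for UDP acks from replicas
        self.lock = threading.Lock()

    def execute(self, txn_id, statement):
        """First half: write the binlog, park the session, return immediately."""
        with self.lock:
            self.binlog.append((txn_id, statement))
            self.pending[txn_id] = statement

    def ack_loop(self):
        """Second half: a dedicated thread commits once an ack arrives."""
        while True:
            txn_id = self.acks.get()
            if txn_id is None:     # shutdown signal for the demo
                break
            with self.lock:
                if self.pending.pop(txn_id, None) is not None:
                    self.committed.append(txn_id)


def demo():
    primary = Primary()
    committer = threading.Thread(target=primary.ack_loop)
    committer.start()
    for txn_id in (1, 2, 3):
        primary.execute(txn_id, f"stmt-{txn_id}")  # never blocks on a replica
    parked = len(primary.binlog)
    for txn_id, _ in list(primary.binlog):
        primary.acks.put(txn_id)                   # replica acknowledges each entry
    primary.acks.put(None)
    committer.join()
    return parked, primary.committed, primary.pending
```

In `demo()`, all three transactions reach the binlog before any acknowledgment is processed, which is the point of the design: the thread that accepts statements does not wait on the cross‑IDC round trip.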

Performance tests show significant latency reduction compared to the original model.

High‑Availability Enhancements

Each set should contain at least three cross‑IDC nodes, with optional read‑only replicas. Nodes can be added dynamically using XtraBackup for full and incremental data copy, at a transfer rate of roughly 500 GB per hour. Additional safeguards handle abrupt primary crashes by replaying row‑based binlogs and, if necessary, performing full data reconstruction.

Future Directions

Planned work includes tighter integration with Docker for resource scheduling, deeper exploration of Galera clusters, combining thread pools with coroutines for higher performance, and extending support for limited cross‑shard joins and distributed transactions via offline OLAP pipelines.
