WeBank’s Deployment of Tencent TDSQL: Distributed Database Architecture, High Availability, and Scaling
WeBank adopted Tencent TDSQL, a MySQL‑based distributed database with strong‑consistency replication, auto‑sharding, read/write separation, automatic failover, and a management platform; deployed across two IDC regions using a 3+2 replica model, supporting petabyte data, billions of daily transactions, and high availability.
When WeBank was founded in 2014, it deliberately chose a distributed, internet-style architecture over the traditional centralized IT stack, and partnered with Tencent to adapt Tencent's financial-grade distributed database product TDSQL for core banking workloads.
TDSQL is built on the MySQL/MariaDB community code base but adds kernel‑level replication optimizations that provide strong consistency across master‑slave replicas. The system integrates modules such as TDSQL Agent, SQLEngine, and Scheduler, delivering read/write separation, auto‑sharding, automatic failover, real‑time monitoring, and cold‑backup capabilities.
Two deployment modes are offered: noshard (single‑instance, fully MySQL‑compatible, vertical scaling only) and shard (auto‑sharding with distributed transactions, enabling horizontal scaling). The shard mode uses the SQLEngine to present a single logical database view while distributing data across multiple physical instances.
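To make the shard mode concrete, here is a minimal sketch of how a proxy layer like the SQLEngine might map a shard key to a physical instance. The instance names and the CRC32-modulo hash scheme are illustrative assumptions, not TDSQL's actual routing algorithm.

```python
# Hypothetical sketch of shard routing: hash the shard key (e.g. a customer
# ID) and map it onto one of the physical database groups. TDSQL's real
# routing is internal to the SQLEngine; this only illustrates the idea.
import zlib

SHARD_INSTANCES = ["set_1", "set_2", "set_3"]  # physical groups (hypothetical names)

def route(shard_key: str) -> str:
    """Map a shard key to one physical instance deterministically."""
    bucket = zlib.crc32(shard_key.encode()) % len(SHARD_INSTANCES)
    return SHARD_INSTANCES[bucket]
```

Because the mapping is deterministic, every query carrying the same shard key lands on the same physical instance, which is what lets the proxy present one logical database.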
Strong‑consistency replication underpins automatic master‑slave failover with zero data loss (RPO = 0) and recovery within roughly 30 seconds (RTO ≈ 30 s). A special watch node acts as a read‑only observer that does not participate in elections, allowing flexible role adjustments during failures.
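The watch-node rule above can be sketched as a filter over replica roles: only master and slave replicas are eligible to vote in a failover election, while watch nodes observe without affecting quorum. The data model and election rule here are illustrative assumptions, not TDSQL internals.

```python
# Sketch of failover eligibility: watch nodes are read-only observers and
# never participate in elections (an assumption matching the description
# above, not TDSQL's actual election protocol).
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    role: str  # "master", "slave", or "watch"

def electable(replicas):
    """Return only the replicas that may vote in a failover election."""
    return [r for r in replicas if r.role in ("master", "slave")]

cluster = [Replica("a", "master"), Replica("b", "slave"), Replica("c", "watch")]
```

Excluding watch nodes from the quorum is what makes it safe to add or move them (for example, next to a batch workload) without changing failover behavior.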
Operational automation is provided by the "Chitu" (赤兔) management platform, which offers visual monitoring, alert configuration, routine operations (master‑slave switchover, node replacement, configuration changes), backup and recovery, slow‑query analysis, and performance diagnostics.
The architecture introduces the concept of a Data Center Node (DCN), a logical unit that contains the full application, access, and database layers. A Global Name Service (GNS), backed by a Redis cache with TDSQL persistence, routes client requests to the appropriate DCN.
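The GNS lookup path described above is a classic cache-aside pattern: check the fast cache first, fall back to the persistent store on a miss, and populate the cache on the way out. The sketch below stands in plain dictionaries for Redis and TDSQL; the class and key names are hypothetical.

```python
# Illustrative cache-aside lookup for routing a customer to a DCN.
# self.cache stands in for Redis; self.store stands in for the
# TDSQL-persisted routing table (both assumptions for illustration).
class NameService:
    def __init__(self, persistent_routes):
        self.cache = {}                     # fast path (Redis in production)
        self.store = persistent_routes      # authoritative routes (TDSQL)

    def resolve(self, customer_id: str) -> str:
        """Return the DCN that owns this customer's data."""
        dcn = self.cache.get(customer_id)
        if dcn is None:
            dcn = self.store[customer_id]   # cache miss: authoritative lookup
            self.cache[customer_id] = dcn   # warm the cache for next time
        return dcn
```

Keeping the authoritative copy in TDSQL means a Redis flush only costs extra lookups, never wrong routing.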
WeBank’s IDC layout consists of two regions (Shenzhen and Shanghai). Shenzhen hosts five same‑city IDC rooms, while Shanghai provides a cross‑city disaster‑recovery site. Dedicated high‑speed links keep inter‑IDC latency around 2 ms, satisfying TDSQL’s strong‑sync requirements.
For production, a 3 + 2 replica model is used: three same‑city replicas (1 master + 2 slaves) with strong synchronous replication, plus two cross‑city asynchronous replicas for disaster recovery. This enables true multi‑active deployment—traffic can be served from any of the three same‑city IDC sites, and a failure of any site does not cause data loss.
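The commit rule implied by this 3 + 2 layout can be sketched as follows: a transaction commits only after at least one same-city slave acknowledges the log, while the two cross-city replicas replicate asynchronously and never gate the commit. The ack threshold and replica tags are assumptions for illustration, not TDSQL's exact protocol.

```python
# Sketch of strong-sync commit under the 3+2 model: same-city acks gate the
# commit (RPO = 0 within the city); cross-city asynchronous replicas do not.
def commit_ok(acks):
    """acks: list of (tag, acked) pairs, tag is 'same_city' or 'cross_city'.

    Commit succeeds once at least one same-city slave has acknowledged
    (an illustrative threshold, not TDSQL's configured value).
    """
    same_city_acks = sum(1 for tag, acked in acks if tag == "same_city" and acked)
    return same_city_acks >= 1
```

This is why losing an entire same-city IDC loses no committed data: every commit was already durable on a slave in another site before it was acknowledged to the client.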
Performance testing shows that cross‑IDC strong sync adds only ~10 % latency to OLTP workloads. For batch jobs, a watch node is co‑located with the master to avoid cross‑IDC latency penalties.
At present, the TDSQL cluster comprises over 350 SETs (replication groups), more than 1,700 database instances, and stores petabytes of data, handling peak transaction volumes of >3.6 billion per day and >100 k TPS. The system has operated for four years without major outages.
Future plans include integrating TiDB for ultra‑large‑scale workloads, expanding Redis and MongoDB usage, and deeper collaboration with the Tencent Cloud TDSQL team on intelligent operations (e.g., the "Bianque" (扁鹊) diagnostics project) and MySQL 8.0/MGR features.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.