Databases 20 min read

How We Migrated a Billion‑Transaction Payment Platform from MySQL to TiDB

This article details the challenges and solutions encountered when migrating Fengchao's high‑throughput payment platform—handling over a hundred‑billion rows and millions of QPS—from a sharded MySQL architecture to TiDB, covering performance bottlenecks, TiDB advantages, migration steps, tooling, pitfalls, and optimization recommendations.

21CTO
21CTO
21CTO
How We Migrated a Billion‑Transaction Payment Platform from MySQL to TiDB

1. Payment Platform Status and Problems

Before migration, the payment platform ran entirely on MySQL with four physical machines (four masters), more than 40 databases, over 500 sharded tables, and a total data volume of hundreds of billions of rows. Peak QPS (outside of Double‑11) exceeded 7K. The classic sharding solution caused several bottlenecks:

Secondary scaling of sharding : Expanding from 21 to 41 databases and from 21 to 480 tables required costly data migration tools and extensive business code changes.

Cross‑database transaction consistency : MySQL required custom eventual‑consistency logic and manual intervention in extreme cases.

Non‑sharding dimension queries : Queries that did not use the user‑ID sharding key could not be satisfied by MySQL and needed asynchronous sync to Elasticsearch.

Heterogeneous data stores : Synchronizing MySQL to Elasticsearch introduced manual effort for schema changes.

DDL on massive tables : MySQL 5.6 performed DDL on huge tables very slowly, wasting DBA time.

Complex DDL changes : Thousands of sharded tables made DDL verification error‑prone and time‑consuming.

These issues are common for companies using MySQL sharding for OLTP workloads.

2. Advantages of TiDB

TiDB offers several benefits that directly address the above problems:

Online DDL without locks : Fast schema changes without table‑level locking.

Elastic horizontal scaling : Adding new nodes instantly expands capacity.

High MySQL compatibility : Most applications migrate without code changes.

Strong distributed ACID transactions : Raft‑based consensus provides 100% data consistency.

Financial‑grade high availability : Raft majority election ensures automatic failover with sub‑minute impact.

These advantages motivated the decision to use TiDB as the core solution for the payment platform.

3. Migration Journey

3.1 Standard Migration Steps

The team defined a standardized migration workflow, repeating only the necessary steps while keeping the overall process consistent.

3.2 Data Synchronization Strategy

Three synchronization mechanisms were employed:

Full data dump to TiDB : Using mydumper on four MySQL slaves and loader to import data into TiDB. The process took dozens of hours due to >100 billion rows.

Real‑time incremental sync : PingCAP's sync tool read MySQL binlogs and streamed changes to TiDB.

Bidirectional sync (TiDB → MySQL) : Two custom tools were built to ensure reliable reverse replication.

3.3 Traffic Replay (DBReplay)

A custom tool, DBReplay , records production traffic at the network layer and replays it on TiDB to uncover issues such as TiDB bugs, load‑balancer problems, SQL incompatibilities, and slow queries before the final cut‑over.

3.4 Gray‑Release Plan

The team opted for simultaneous read‑write gray release to TiDB, configuring dynamic data‑source switches via a configuration center and using dbmove for reverse sync when needed. Four gray‑release stages gradually increased the number of sharded databases routed to TiDB.

3.5 Fast Rollback (DBSwitch)

DBSwitch monitors TiDB health via periodic API calls and decides whether to switch back to MySQL. It also checks pending messages in dbmove, toggles database switches in the configuration center, and restarts the payment platform to release connection pools.

4. Migration Pitfalls

4.1 Full‑load Freeze

Two root causes were identified: a TiKV 2.1.9 bug in leader transfer causing temporary slow queries, and an Nginx TCP proxy issue that was resolved by upgrading to Nginx 1.17.1 with the official stream module.

4.2 Sudden SQL Slowdown

Missing timely ANALYZE TABLE caused query performance degradation. The team scheduled three daily analyze jobs per table and highlighted the need for a custom auto‑analyze service.

4.3 Transaction Result Inconsistency

TiDB’s automatic transaction retry could lead to unexpected results. Disabling it via the tidb_disable_txn_auto_retry variable (default ON in TiDB 3.0) resolves the issue.

4.4 Scheduled Task Failures

Large‑batch deletes exceeded TiDB’s transaction size limit (default 300 k rows), triggering “9500 – transaction is too large”. The recommendation is to split large operations into smaller batches.

5. Optimization Recommendations

5.1 Task Scheduling Mechanism

Running a dedicated Java service for periodic ANALYZE TABLE introduces programming skill requirements, monitoring overhead, and risk of omission. A more integrated solution is suggested.

5.2 Reverse Sync Mechanism

TiDB’s drainer supports MySQL as a target but not multiple MySQL instances. The team relies on dbmove as a fallback for reverse synchronization.

6. Conclusion

TiDB’s active community, rapid releases, and mature ecosystem make it a reliable choice for large‑scale OLTP workloads. Fengchao’s experience demonstrates that all of its MySQL‑based services can be confidently migrated to TiDB, enabling long‑term database strategy consolidation.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

distributed databasemysqlTiDBdatabase migrationpayment platform
21CTO
Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.