Splitting a Massive MySQL Financial Transaction Table: Challenges, Design, and Implementation
The article describes how a finance team tackled a 50‑million‑row MySQL transaction table by analyzing the pain points, defining split goals, selecting sharding‑jdbc, designing multi‑source transaction handling, pagination across shards, data migration strategies, and a staged rollout to ensure stability and performance.
The author inherited a financial system with a single MySQL table exceeding 50 million rows and growing by 600 k rows per month, causing timeouts, slow inserts, large storage consumption, and locking issues.
To keep the system maintainable, the team set two main objectives: split the large table into multiple sub‑tables each holding roughly 10 million rows, and optimize query conditions to eliminate slow MySQL queries.
Key challenges included the table’s critical role across many business scenarios (26 scenarios, 32 mapper methods), the need for reliable migration without downtime, coordination with downstream systems, and ensuring data consistency during the split.
The solution adopted sharding‑jdbc as the sharding middleware, chosen for its lightweight Maven integration and support for multiple sharding strategies, while discarding an Elasticsearch approach that did not fit the use case.
Horizontal sharding based on the "transaction time" field was selected because it is always present, distributes data evenly, and satisfies the 10 million‑row per table target; each month’s data roughly fits this size.
Technical difficulties such as multi‑data‑source transaction management were addressed with custom annotations and AOP‑based transaction control; pagination across shards was solved by converting a global offset/pageSize into per‑shard offsets and sizes using a multi‑threaded counting and allocation algorithm.
Data migration was planned in two parts: cold data (older than three months) migrated incrementally via custom code, and hot data (last three months) migrated in a short maintenance window by the DBA, minimizing impact on the live system.
The rollout was divided into three phases: (1) create shard tables, migrate data, enable dual‑write, and route all queries to shards; (2) stop writes to the old table and switch external interfaces to the new shards; (3) decommission the original large table.
Final recommendations emphasize further middleware research, careful thread‑pool sizing, exhaustive scenario mapping, robust migration and rollback plans, and the importance of soft skills such as communication and coordination for backend engineers.
Architect's Guide
Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.