Splitting a Massive MySQL Financial Transaction Table: Challenges, Strategies, and Implementation
The article details how a finance team tackled a 50‑million‑row MySQL transaction table by analyzing the pre‑split problems, defining split goals, choosing sharding‑jdbc, addressing multi‑datasource transaction and pagination issues, designing a phased migration and rollout plan, and summarizing lessons learned.
# Introduction
The author took over a company's financial system two years ago and discovered a single MySQL table exceeding 50 million rows, storing transaction flows and serving many downstream systems. The table was growing by over 600 k rows per month and would reach 100 million rows within six months, making it unsustainable.
# System State Before Splitting
Frequent timeouts on transaction‑flow APIs, some interfaces almost unusable.
Slow daily inserts due to heavy write load.
Table occupied excessive disk space, triggering DBA alerts.
Any ALTER operation caused high replication lag and long table locks.
# Splitting Goals
Partition the large transaction table into multiple tables, each holding roughly 10 million rows (a size MySQL handles comfortably).
Optimize query conditions for each interface to eliminate slow queries and ensure both internal and external APIs remain performant.
# Difficulty Analysis
The table is the core of the finance system; many features and downstream services depend on it, requiring extremely careful development, testing, and release processes.
26 business scenarios involve 32 mapper methods, leading to a large number of code changes.
Data volume is huge; migration must keep the system stable.
High‑traffic, critical functionality demands minimal downtime and robust rollback/rollback strategies.
Schema changes affect many other systems, requiring coordinated cross‑team cooperation.
# Overall Process
The team visualized the workflow (image omitted) and proceeded through research, design, implementation, and rollout phases.
# Middleware Research
Sharding‑JDBC was selected as the sharding plugin because it supports multiple sharding strategies, automatic detection of = or IN conditions, and is lightweight (added as a Maven dependency with minimal intrusion).
Although Elasticsearch was considered to accelerate queries, the provided ES service did not match the business scenario, so the team abandoned it and instead used per‑table query threads to improve performance.
# Choosing the Sharding Key
Horizontal sharding was chosen because vertical sharding or fixed‑table splitting cannot prevent unbounded growth of transaction data. The "transaction time" field was selected as the sharding key because it is always present, distributes data evenly (≈600‑700 k rows per month), and appears in 70 % of queries.
# Technical Challenges
Multi‑DataSource Transaction Issue : Sharding‑JDBC requires an independent data source, leading to transaction problems across multiple data sources. The team solved this by creating a custom annotation and AOP aspect to manage transaction boundaries manually.
Cross‑Table Pagination : After sharding, the original LIMIT‑based pagination no longer works because each shard returns a different row count. The solution involves:
Querying each shard in parallel to get the count of matching rows.
Accumulating counts to build a global offset axis.
Determining which shards contain the first and last rows of the requested page.
Setting offset/pageSize for the first and last shards accordingly, while other shards use offset = 0 and pageSize = total rows.
An illustration (image omitted) shows how a global request (offset = 8, pageSize = 20) is transformed into three shard‑specific requests.
# Data Migration Plan
Two migration approaches were evaluated: DBA‑driven migration and custom code migration. The final strategy combined both:
Cold data (transactions older than three months) are migrated incrementally via custom code (“ants moving”).
Hot data (last three months) is migrated by the DBA after a brief write‑stop window before go‑live, limiting downtime to about two hours.
Both approaches limit the amount of data moved per batch to avoid high latency on the primary instance.
# Rollout Process
Phase 1: Create sharded tables, migrate data, enable dual‑write (old + new tables), route all queries to sharded tables for validation.
Phase 2: Stop writes to the old table, switch business services to use the new sharded tables via a unified API, continue validation.
Phase 3: Decommission the original large table.
# Summary
Further research on sharding middleware is needed; sharding‑JDBC’s features were under‑utilized due to the specific sharding key.
Thread‑pool sizing must be carefully analyzed to avoid exhausting CPU resources.
When refactoring an existing project, enumerate all business scenarios down to each class and method to ensure complete coverage.
Data migration plans must include consistency checks and fallback procedures.
Design robust rollback and degradation strategies before releasing complex changes.
# Side Note
The author reflects on the importance of communication and soft skills for backend engineers, emphasizing that technical competence alone is insufficient for successful project delivery.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
