
Splitting a Massive MySQL Financial Transaction Table: Challenges, Sharding Strategy, and Migration Process

This article describes how a finance team split a MySQL transaction table of more than 500 million rows: the problems before the split, the sharding goals, the choice of sharding‑jdbc, the multi‑datasource transaction and cross‑shard pagination challenges, a hybrid data‑migration plan, and a three‑stage rollout designed to keep the system stable and performant.

Architecture Digest

The author took over a company's financial system and discovered a single transaction table exceeding 500 million rows and growing by 600 k rows per month, causing timeouts, slow inserts, large storage usage, and locking issues.

Pre‑split system state:

Frequent interface timeouts related to the transaction table.

Very slow daily inserts.

Table occupied excessive disk space, triggering DBA alerts.

Any ALTER TABLE operation caused high replication latency and long table locks.

Split goals:

Divide the large table into multiple shards, each around 10 million rows (a comfortable size for MySQL).

Optimize query conditions for each interface to eliminate slow queries and maintain availability.

Middleware research: The team evaluated sharding‑jdbc, noting its support for multiple sharding strategies, its lightweight integration as a Maven dependency, and its low intrusiveness into existing code. An initial plan to use Elasticsearch for faster queries was abandoned after compatibility tests.

Sharding basis selection: After analyzing 26 usage scenarios and 32 mapper methods, the team chose horizontal sharding based on the "transaction time" field because it appears in 70 % of queries, distributes data evenly (≈600‑700 k rows per month), and is always present.
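To make the time-based routing concrete, here is a minimal plain-Java sketch of how a transaction time could map to a shard table. It assumes one table per calendar month and a hypothetical naming scheme `transaction_yyyyMM`; neither detail is stated in the article, and in a sharding‑jdbc setup this mapping would live inside the middleware's sharding configuration rather than in hand-written code.

```java
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

final class MonthlyShardRouter {
    // Assumption: one physical table per calendar month, named transaction_yyyyMM.
    private static final String PREFIX = "transaction_";
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("uuuuMM");

    /** Table that would hold a row with the given transaction time. */
    static String tableFor(LocalDateTime transactionTime) {
        return PREFIX + YearMonth.from(transactionTime).format(SUFFIX);
    }

    /** Tables a query must visit for an inclusive time range, oldest first. */
    static List<String> tablesFor(LocalDateTime from, LocalDateTime to) {
        if (to.isBefore(from)) {
            throw new IllegalArgumentException("'to' must not be before 'from'");
        }
        List<String> tables = new ArrayList<>();
        YearMonth last = YearMonth.from(to);
        for (YearMonth m = YearMonth.from(from); !m.isAfter(last); m = m.plusMonths(1)) {
            tables.add(PREFIX + m.format(SUFFIX));
        }
        return tables;
    }

    public static void main(String[] args) {
        System.out.println(tableFor(LocalDateTime.of(2023, 11, 30, 23, 59)));
        System.out.println(tablesFor(
                LocalDateTime.of(2023, 11, 15, 0, 0),
                LocalDateTime.of(2024, 1, 2, 0, 0)));
    }
}
```

A query that carries a transaction-time range touches only the tables returned by `tablesFor`, which is why a field present in most queries makes a good sharding key. Queries without that field would have to scan every shard.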

Technical challenges:

Multi‑datasource transaction issue: sharding‑jdbc requires an independent datasource, leading to transaction coordination problems. The team solved this with custom annotations and AOP‑based transaction management (code omitted for confidentiality).
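The team's annotation and AOP code is not published, so the following is only an illustrative sketch of the kind of coordination such an aspect might wrap, not the author's implementation. `TxResource` is a hypothetical abstraction over one datasource's local transaction. The approach shown is best-effort: it is not atomic, because a failure between two commits can leave one datasource committed and the other rolled back.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

final class MultiDataSourceTx {
    /** Hypothetical wrapper around one datasource's local transaction. */
    interface TxResource {
        void begin();
        void commit();
        void rollback();
    }

    /**
     * Begins a transaction on every resource, runs the work, then commits in
     * reverse order. On any failure, every transaction still open is rolled back.
     */
    static <T> T inTransaction(List<? extends TxResource> resources, Supplier<T> work) {
        Deque<TxResource> open = new ArrayDeque<>();
        try {
            for (TxResource r : resources) {
                r.begin();
                open.push(r); // most recently begun resource sits on top
            }
            T result = work.get();
            while (!open.isEmpty()) {
                open.peek().commit(); // stays on the stack if commit throws
                open.pop();
            }
            return result;
        } catch (RuntimeException e) {
            while (!open.isEmpty()) {
                try {
                    open.pop().rollback();
                } catch (RuntimeException rollbackFailure) {
                    e.addSuppressed(rollbackFailure);
                }
            }
            throw e;
        }
    }
}
```

In a real service, each `TxResource` would delegate to the transaction facilities of the original datasource and of the sharding datasource, and the custom annotation would mark which methods need both.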

Cross‑table pagination: A single LIMIT clause cannot page over rows that are spread across several shard tables. The solution translates the global offset and pageSize into a per‑shard offset and page size, then queries the affected shards in parallel, one thread per shard.
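The offset arithmetic can be sketched as follows. This is a minimal illustration, not the team's code. It assumes results are sorted by transaction time, so that the global order is simply the shards' rows concatenated in shard order, and that the number of matching rows per shard is already known (for example from a `COUNT(*)` with the same filter). `ShardPagePlanner` and `Slice` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class ShardPagePlanner {
    /** One per-shard query: ... ORDER BY transaction time LIMIT offset, limit. */
    record Slice(int shardIndex, long offset, long limit) {}

    /**
     * @param matchingRowsPerShard rows matching the filter in each shard, in global sort order
     * @param globalOffset         rows to skip across all shards
     * @param pageSize             rows wanted for the page
     */
    static List<Slice> plan(long[] matchingRowsPerShard, long globalOffset, long pageSize) {
        if (globalOffset < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and pageSize > 0");
        }
        List<Slice> slices = new ArrayList<>();
        long skip = globalOffset;   // rows still to skip before the page starts
        long remaining = pageSize;  // rows still needed to fill the page
        for (int i = 0; i < matchingRowsPerShard.length && remaining > 0; i++) {
            long rows = matchingRowsPerShard[i];
            if (skip >= rows) {
                skip -= rows;       // the page starts after this shard
                continue;
            }
            long take = Math.min(rows - skip, remaining);
            slices.add(new Slice(i, skip, take));
            remaining -= take;
            skip = 0;               // later shards are read from their first row
        }
        return slices;
    }

    public static void main(String[] args) {
        // Three shards with 30, 5 and 50 matching rows; page of 10 starting at row 28.
        System.out.println(plan(new long[] {30, 5, 50}, 28, 10));
    }
}
```

With 30, 5, and 50 matching rows and a global offset of 28, a page of 10 is served by 2 rows from the first shard, 5 from the second, and 3 from the third. Each slice can then run as an independent query, and the results are concatenated in shard order.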

Data migration plan: Two approaches were considered – DBA‑driven migration and custom code migration. The final hybrid strategy migrates "cold" data (older than three months) via controlled batch scripts, while "hot" data (last three months) is migrated by the DBA after a brief write‑stop window before go‑live.
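A rough sketch of how the cold-data batches could be planned is shown below. The article does not describe the batch scripts, so the id-window approach, the class name `ColdDataBatchPlanner`, and the batch size are all assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

final class ColdDataBatchPlanner {
    /** Half-open primary-key window: fromInclusive <= id < toExclusive. */
    record IdRange(long fromInclusive, long toExclusive) {}

    /** Rows with a transaction time before this instant count as "cold". */
    static LocalDateTime coldCutoff(LocalDate goLiveDate) {
        return goLiveDate.minusMonths(3).atStartOfDay();
    }

    /** Splits the id span [minId, maxId] into windows of at most batchSize ids. */
    static List<IdRange> plan(long minId, long maxId, long batchSize) {
        if (batchSize <= 0 || maxId < minId) {
            throw new IllegalArgumentException("need batchSize > 0 and maxId >= minId");
        }
        List<IdRange> windows = new ArrayList<>();
        for (long start = minId; start <= maxId; start += batchSize) {
            windows.add(new IdRange(start, Math.min(start + batchSize, maxId + 1)));
        }
        return windows;
    }

    public static void main(String[] args) {
        System.out.println(coldCutoff(LocalDate.of(2024, 6, 15)));
        System.out.println(plan(1, 25, 10));
    }
}
```

A migration script could copy one window at a time, filter on the cold cutoff, and pause between windows to limit load on the primary and on replication. Small windows also make it easier to retry or roll back a single failed batch.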

Overall rollout process (three stages):

Stage 1 – Create shards, migrate historical data, enable dual‑write (old and new tables) and route all queries to shards for validation.

Stage 2 – Stop writes to the old table, switch business services to the new sharded tables, and continue monitoring.

Stage 3 – Decommission the original large table.
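One simple validation that fits the dual-write period of Stage 1 is to compare row counts per shard against counts computed from the old table. The sketch below assumes both sets of counts have already been collected into maps keyed by shard name; the class and method names are hypothetical, and the article does not say which checks the team actually ran.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class ShardCountCheck {
    /**
     * Returns the shard names, sorted, whose row count differs between the
     * old table (grouped per shard) and the new shard tables.
     * A shard missing from either map is treated as having zero rows.
     */
    static List<String> mismatchedShards(Map<String, Long> expectedCounts,
                                         Map<String, Long> actualCounts) {
        TreeSet<String> names = new TreeSet<>(expectedCounts.keySet());
        names.addAll(actualCounts.keySet());
        List<String> mismatched = new ArrayList<>();
        for (String name : names) {
            long expected = expectedCounts.getOrDefault(name, 0L);
            long actual = actualCounts.getOrDefault(name, 0L);
            if (expected != actual) {
                mismatched.add(name);
            }
        }
        return mismatched;
    }
}
```

Matching counts do not prove that row contents are identical, so a stricter check would also compare per-shard checksums or sampled rows before the old table is decommissioned.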

Summary:

Further research on sharding middleware is needed; sharding‑jdbc’s features were under‑utilized and its independent datasource introduced extra transaction complexity.

Thread‑pool sizing must be carefully tuned to avoid exhausting server threads.

Comprehensive scenario mapping is essential when refactoring an existing project.

Data‑migration plans must include consistency checks and rollback strategies.

Robust rollback and degradation measures are critical for complex releases.
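On the thread-pool point, one common way to keep per-shard fan-out from exhausting server threads is a pool with a fixed thread count and a bounded queue. The following is a minimal sketch using the standard `java.util.concurrent` classes; the sizes passed in are placeholders rather than values from the article, and `ShardQueryExecutor` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ShardQueryExecutor {
    /** Fixed-size pool with a bounded queue; overflow work runs on the caller's thread. */
    static ThreadPoolExecutor boundedPool(int maxThreads, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads,
                30, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
        pool.allowCoreThreadTimeOut(true); // idle threads are released after 30 s
        return pool;
    }

    /** Runs one query per shard and returns the results in the order of the input list. */
    static <T> List<T> queryAll(ExecutorService pool, List<Callable<T>> shardQueries) {
        try {
            List<T> results = new ArrayList<>();
            for (Future<T> future : pool.invokeAll(shardQueries)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while querying shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("shard query failed", e.getCause());
        }
    }
}
```

With `CallerRunsPolicy`, a request that arrives when the pool and queue are full executes its shard queries on its own thread, which slows that request down instead of spawning more threads. The pool should be shared across requests and sized against the database connection pool, since each running shard query holds a connection.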

Additionally, the author reflects on the importance of communication and soft skills for backend engineers, who must balance business understanding, technical depth, and coordination across teams.

