Databases 19 min read

How to Efficiently Shard Billion‑Row Tables with ShardingSphere and Spring Boot

This article walks through the end‑to‑end design, configuration, and implementation of splitting massive loan and repayment tables into 50 sharded MySQL tables using ShardingSphere, Spring Boot, and a custom suffix algorithm, while covering data migration, DBA coordination, dynamic switches, and scheduled consistency checks.

Top Architect
Top Architect
Top Architect
How to Efficiently Shard Billion‑Row Tables with ShardingSphere and Spring Boot

Preface

After a year of writing business code, the author received a request to split the loan and repayment application tables to improve query efficiency for tables containing tens of millions of rows.

Design Plan

The design includes creating a new database instance for the split tables, using 50 tables named CashRepayApplySplit0${0..9} and CashRepayApplySplit${10..49}, with the sharding column memberId and a suffix algorithm that takes the last two digits of the member ID modulo 50, padding with zero when needed.

Historical Data Synchronization

Existing full data resides in CashRepayInfo. The migration first copies missing early data from CashRepayInfo to the split tables, then synchronizes the remaining data, and finally uses a nightly batch to copy any newly created rows.

Diagram of data relationship
Diagram of data relationship

Backend Query Refactor

Because the split tables are in a separate data source, the original single‑table join queries must be rewritten; recent two‑three years of data stay in the original table and are queried normally, while older data are accessed via the memberId filter on the split tables.

Dynamic Switch

A runtime switch controls whether new business writes are duplicated to the split tables, allowing a staged migration.

Scheduled Consistency Check

A daily task scans today’s CashRepayApply and CashRepayApplySplit rows, compares them, and raises alerts if differences are found.

Implementation Details

The project uses Spring Boot 3.2.4, JDK 19, MySQL 8, ShardingSphere 5.4.1 and MyBatis‑Plus 3.5.5. Configuration files application.yml, sharding-config.yaml and Maven dependencies are provided. The sharding algorithm expression is written in Groovy as Long.parseLong(member_id) % 50 with zero‑padding.

@Transactional(propagation = Propagation.REQUIRES_NEW, transactionManager = "transactionManagerSplit")

The author also shares a simple batch method that processes 500 rows at a time, handling null checks, upserts to the split tables, and a Redis flag to terminate the loop safely.

Conclusion

The article demonstrates that sharding a multi‑billion‑row table can be achieved with careful design, coordination with DBAs, and incremental migration, turning a seemingly daunting task into a manageable engineering effort.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Data MigrationshardingSpring BootmysqlShardingSpheredatabase partitioning
Top Architect
Written by

Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.