Design and Implementation of Table Sharding for Loan Repayment Applications Using ShardingSphere and Spring Boot
This article describes how a senior architect designed, configured, and implemented a 50‑table sharding solution for loan and repayment request data using ShardingSphere, Spring Boot 3, MySQL, and custom synchronization scripts, while also addressing historical data migration, backend query changes, and operational safeguards.
Preface
After a year of writing business code, the team received a request to split the loan and repayment request tables to improve query performance. The loan table contains tens of millions of rows, while the repayment table exceeds two hundred million rows.
Design Plan
Before coding, the system design was aligned with leadership on several points.
Sharding Database Basic Information
Use a new instance and independent data source for the sharding database.
No separate database split is needed; 50 tables within a single database are sufficient.
Database name: tcbiz_ins_cash_mas_split (named by leadership).
Total number of tables: 50 .
Table naming pattern: CashRepayApplySplit0${0..9} , CashRepayApplySplit${10..49} (camel‑case names).
Sharding column: memberId . Sharding algorithm: memberId % 50 , pad left with 0 to keep a two‑digit suffix.
Table structure identical to the original table, but the uptTime column (which lacked an index) must be indexed in the sharding tables.
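The suffix rule above (memberId % 50, left-padded to two digits) can be sketched in plain Java; the helper name `tableSuffix` is mine, not from the project:

```java
public class ShardSuffix {
    // Compute the two-digit table suffix for a memberId,
    // mirroring the rule: memberId % 50, left-padded with 0.
    static String tableSuffix(String memberId) {
        long mod = Long.parseLong(memberId) % 50;
        return String.format("%02d", mod);
    }
}
```

For example, memberId 1007 maps to suffix 07, i.e. table CashRepayApplySplit07.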
Historical Data Synchronization
After creating 50 tables, both new business data and existing data must be written according to the sharding strategy. The original data resides in CashRepayInfo , which was previously migrated to CashRepayApply . The goal is to synchronize all full‑history data from CashRepayInfo into the sharded tables.
Synchronize the missing early data from CashRepayInfo to CashRepayApplySplit .
Synchronize the remaining data from CashRepayApply to CashRepayApplySplit .
FAQ
Why not sync directly from CashRepayInfo to the sharded tables? Because the two tables have different schemas; the DBA needs identical structures, so a two‑step approach is required.
Existing Backend Query Refactor
The current admin/operations backend joins the single table directly. After sharding, the original join will no longer work. The interim solution keeps the last two‑to‑three years of data in the original table for normal joins, while older data (>3 years) must be queried by memberId against the sharded tables.
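The interim routing rule can be sketched as a pure function; the class and enum names here are illustrative stand-ins, not the project's real code:

```java
import java.time.LocalDate;

public class RepayQueryRouter {
    // Interim rule from the plan: rows created within the last 3 years
    // stay queryable in the original table (existing joins keep working);
    // older rows must be fetched by memberId from the sharded tables.
    enum Source { ORIGINAL_TABLE, SHARDED_TABLES }

    static Source route(LocalDate createDate, LocalDate today) {
        return createDate.isBefore(today.minusYears(3))
                ? Source.SHARDED_TABLES
                : Source.ORIGINAL_TABLE;
    }
}
```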
External Department Notification
After full data migration, the legacy CashRepayInfo table will be phased out. Other departments (risk control, big data, etc.) must be informed to switch their reports to the new sharded tables.
DBA Synchronization Process for New Business Data
The synchronization is split into three phases:
Developers sync the earliest missing data.
DBA syncs the remaining bulk data up to a deterministic id (using operatorDate - 5 days as the cutoff).
Developers sync the newest data generated during the DBA window (e.g., after 23:00).
After all three phases, a three‑write switch is enabled to complete the migration.
Data Three‑Write Synchronization
During the transition period, all three tables ( CashRepayInfo , CashRepayApply , CashRepayApplySplit ) must stay consistent. Writes occur first to CashRepayApply , then to CashRepayInfo , and finally to the sharded tables.
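The write order above can be sketched with stand-in DAO interfaces (the `Dao` type here is illustrative, not the project's real DAO layer):

```java
import java.util.ArrayList;
import java.util.List;

public class TripleWrite {
    // Single-method stand-in for the real DAO classes.
    interface Dao { void insert(String row); }

    // Writes one row to all three tables in the order used during
    // the transition window, returning the order for verification.
    static List<String> writeAll(String row, Dao applyDao, Dao infoDao, Dao splitDao) {
        List<String> order = new ArrayList<>();
        applyDao.insert(row); order.add("CashRepayApply");      // 1. primary table
        infoDao.insert(row);  order.add("CashRepayInfo");       // 2. legacy table
        splitDao.insert(row); order.add("CashRepayApplySplit"); // 3. sharded tables
        return order;
    }
}
```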
Transactional Configuration
@Transactional(propagation = Propagation.REQUIRES_NEW, transactionManager = "transactionManagerSplit")
Real‑time, code‑based synchronization is used instead of middleware such as MySQL binlog because DBA resources are limited.
Dynamic Switch for Synchronization
A runtime switch controls whether new business data is written to the sharded tables, allowing a staged rollout.
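A minimal sketch of such a switch, assuming the flag lives in a shared store; here an AtomicBoolean stands in for the Redis key the article implies:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SplitWriteSwitch {
    // In production this flag would be read from Redis or a config
    // center so it can be flipped without a redeploy.
    private final AtomicBoolean enabled = new AtomicBoolean(false);

    public void turnOn()  { enabled.set(true); }
    public void turnOff() { enabled.set(false); }

    // Checked on every business write: mirror the row into the
    // sharded tables only once the rollout switch is on.
    public boolean shouldWriteToSplitTables() {
        return enabled.get();
    }
}
```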
Scheduled Task Scanning
A daily scheduled task compares data between the original and sharded tables, raising alerts if discrepancies are found.
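The core of such a comparison job can be sketched as follows; the maps stand in for the two query result sets keyed by repayNo, and the method name is mine:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ConsistencyScan {
    // Compares rows from the original table against the sharded tables,
    // both keyed by repayNo; returns the keys that are missing or differ
    // in the sharded tables, which the daily job would then alert on.
    static Set<String> findDiscrepancies(Map<String, String> original, Map<String, String> sharded) {
        Set<String> bad = new HashSet<>();
        original.forEach((repayNo, row) -> {
            if (!row.equals(sharded.get(repayNo))) {
                bad.add(repayNo); // absent or inconsistent in the sharded tables
            }
        });
        return bad;
    }
}
```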
Demo & Implementation Details
The demo uses Spring Boot 3.2.4, JDK 19, MySQL 8, ShardingSphere 5.4.1, and MyBatis‑Plus 3.5.5.
Maven Dependencies
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.19</version>
</dependency>
<dependency>
<groupId>org.apache.shardingsphere</groupId>
<artifactId>shardingsphere-jdbc-core</artifactId>
<version>5.4.1</version>
</dependency>
<!-- MyBatis‑Plus 3 -->
<dependency>
<groupId>com.baomidou</groupId>
<artifactId>mybatis-plus-spring-boot3-starter</artifactId>
<version>3.5.5</version>
</dependency>
<dependency>
<groupId>com.sun.xml.bind</groupId>
<artifactId>jaxb-impl</artifactId>
<version>2.3.3</version>
</dependency>application.yml
spring:
application:
name: sharding-jdbc-demo
datasource:
driver-class-name: org.apache.shardingsphere.driver.ShardingSphereDriver
url: jdbc:shardingsphere:classpath:sharding-config.yaml # sharding config file
profiles:
active: default
mybatis-plus:
configuration:
    log-impl: org.apache.ibatis.logging.stdout.StdOutImpl
sharding-config.yaml
dataSources:
ds0:
dataSourceClassName: com.zaxxer.hikari.HikariDataSource
driverClassName: com.mysql.cj.jdbc.Driver
jdbcUrl: jdbc:mysql://localhost:3306/tcbiz_ins?characterEncoding=utf8&useSSL=false&serverTimezone=Asia/Shanghai&allowPublicKeyRetrieval=true
username: root
password: your_password
rules:
- !SHARDING
tables:
cash_repay_apply:
actualDataNodes: ds0.cash_repay_apply_0$->{0..9},ds0.cash_repay_apply_$->{10..49}
tableStrategy:
standard:
shardingColumn: member_id
shardingAlgorithmName: memberId-suffix-mode-50
shardingAlgorithms:
memberId-suffix-mode-50:
type: INLINE
props:
        algorithm-expression: cash_repay_apply_$->{(Long.parseLong(member_id) % 50).toString().padLeft(2, '0')}
props:
  sql-show: true
Mapper Example
@Mapper
public interface CashRepayApplyMapper extends BaseMapper<CashRepayApply> {
    @Select("select * from cash_repay_apply where member_id = #{memberId}")
    List<CashRepayApply> testSharding(@Param("memberId") String memberId);
}
Calling this method lets ShardingSphere route the query to the correct table automatically.
Pitfalls Encountered
Spring Boot 3.2.4 requires the jaxb-impl dependency.
The org.yaml.snakeyaml.representer.Representer class lacks a no‑arg constructor; a custom wrapper class was added to fix the NoSuchMethodError .
Groovy inline sharding expressions originally used member_id % 50 , which caused a MissingMethodException . Converting the value to Long resolved the issue.
Data Synchronization Code (Developer Side)
/**
 * Synchronize data example code
 */
public void dataHandle(Long startId, Long endId) throws AppBizException {
    log.info("CashRepayApplyServiceImpl#dataHandle start startId-{},endId-{}", startId, endId);
    if (endId <= 0L) {
        endId = null; // no upper bound: sync to the end of the table
    } else if (startId > endId) {
        throw new AppBizException(ErrorCodeEnum.PARAM_CHECK_ERROR.getCode(), "Start ID cannot be greater than end ID");
    }
    QueryCashRepayInfoCondition condition = new QueryCashRepayInfoCondition();
    condition.setGteId(startId);
    condition.setLteId(endId);
    condition.setOrders("id+"); // ascending order
    List<CashRepayInfo> infoList = cashRepayInfoDao.query(condition, 0, -1); // max 500 rows per batch
    long lastId;
    while (CollectionUtil.isNotEmpty(infoList)) {
        lastId = infoList.get(infoList.size() - 1).getId() + 1;
        infoList.forEach(history -> {
            try {
                if (StringUtil.isBlank(history.getMemberId()) || StringUtil.isBlank(history.getRepayNo())) {
                    log.error("CashRepayApplyServiceImpl#dataHandle error memberId or repayNo is null id-{}", history.getId());
                    return;
                }
                QueryCashRepayApplySplitCond splitCond = new QueryCashRepayApplySplitCond();
                splitCond.setMemberId(history.getMemberId());
                splitCond.setRepayApplyNo(history.getRepayNo());
                CashRepayApplySplit exist = cashRepayApplySplitDao.getUnique(splitCond);
                CashRepayApplySplit splitData = buildCashRepayApplySplit(history);
                if (exist == null) {
                    cashRepayApplySplitDao.add(splitData);
                } else {
                    splitData.setId(exist.getId());
                    cashRepayApplySplitDao.update(splitData);
                }
            } catch (Exception e) {
                log.error("CashRepayApplyServiceImpl#dataHandle error id-{},repayNo-{}", history.getId(), history.getRepayNo(), e);
                throw new RuntimeException(e);
            }
        });
        log.info("dataHandle finish startId-{},endId-{}", condition.getGteId(), endId);
        String redisCache = RedisCacheUtils.getRedisCache(CashApplyRedisKeyConsts.TERMINATE_SPLIT_SYNC_DATA);
        if (StringUtil.isNotEmpty(redisCache)) {
            log.info("CashRepayApplyServiceImpl#dataHandle need terminate loop startId-{}", condition.getGteId());
            break;
        }
        condition.setGteId(lastId);
        infoList = cashRepayInfoDao.query(condition, 0, -1);
    }
}
private CashRepayApplySplit buildCashRepayApplySplit(CashRepayInfo history) {
    CashRepayApplySplit split = new CashRepayApplySplit();
    CashRepayApply apply = cashRepayApplyDao.get(history.getRepayNo());
    if (apply != null) {
        BeanUtils.copyProperties(apply, split, "id");
        return split;
    }
    // ... populate fields from CashRepayInfo ...
    return split;
}
Conclusion
The author reflects that sharding is not as daunting as interview questions suggest; it mainly involves reading documentation, configuring the framework, and invoking the appropriate APIs. The real challenge lies in designing a migration plan that coordinates multiple teams, preserves existing functionality, and ensures data consistency.
With this experience, the team now has practical knowledge of handling billion‑level data through table sharding.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.