Design and Implementation of Table Sharding for Loan Repayment Applications Using ShardingSphere and Spring Boot
This article describes how a senior architect designed, configured, and implemented a 50‑table sharding solution for loan and repayment request data using ShardingSphere, Spring Boot 3, MySQL, and custom synchronization scripts, while also addressing historical data migration, backend query changes, and operational safeguards.
Preface
After a year of writing business code, the team received a request to split the loan and repayment request tables to improve query performance. The loan table contains tens of millions of rows, while the repayment table exceeds two hundred million rows.
Design Plan
Before coding, the system design was aligned with leadership on several points.
Sharding Database Basic Information
Use a new instance and independent data source for the sharding database.
No separate database split is needed; 50 tables within a single database are sufficient.
Database name: tcbiz_ins_cash_mas_split (named by leadership).
Total number of tables: 50 .
Table naming pattern: CashRepayApplySplit0${0..9} , CashRepayApplySplit${10..49} (camel‑case names).
Sharding column: memberId . Sharding algorithm: memberId % 50 , pad left with 0 to keep a two‑digit suffix.
Table structure identical to the original table, but the uptTime column (which lacked an index) must be indexed in the sharding tables.
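The suffix rule above (memberId % 50, left-padded to two digits) can be sketched in plain Java; the helper name `tableSuffix` is mine, not from the project:

```java
public class ShardSuffix {
    // Compute the two-digit table suffix for a memberId,
    // mirroring the rule: memberId % 50, left-padded with 0.
    static String tableSuffix(String memberId) {
        long mod = Long.parseLong(memberId) % 50;
        return String.format("%02d", mod);
    }
}
```

For example, memberId 1007 maps to suffix 07, i.e. table CashRepayApplySplit07.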
Historical Data Synchronization
After creating 50 tables, both new business data and existing data must be written according to the sharding strategy. The original data resides in CashRepayInfo , which was previously migrated to CashRepayApply . The goal is to synchronize all full‑history data from CashRepayInfo into the sharded tables.
Synchronize the missing early data from CashRepayInfo to CashRepayApplySplit .
Synchronize the remaining data from CashRepayApply to CashRepayApplySplit .
FAQ
Why not sync directly from CashRepayInfo to the sharded tables? Because the two tables have different schemas; the DBA needs identical structures, so a two‑step approach is required.
Existing Backend Query Refactor
The current admin/operations backend joins the single table directly. After sharding, the original join will no longer work. The interim solution keeps the last two‑to‑three years of data in the original table for normal joins, while older data (>3 years) must be queried by memberId against the sharded tables.
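The interim routing rule can be sketched as a pure function; the class and enum names here are illustrative stand-ins, not the project's real code:

```java
import java.time.LocalDate;

public class RepayQueryRouter {
    // Interim rule from the plan: rows created within the last 3 years
    // stay queryable in the original table (existing joins keep working);
    // older rows must be fetched by memberId from the sharded tables.
    enum Source { ORIGINAL_TABLE, SHARDED_TABLES }

    static Source route(LocalDate createDate, LocalDate today) {
        return createDate.isBefore(today.minusYears(3))
                ? Source.SHARDED_TABLES
                : Source.ORIGINAL_TABLE;
    }
}
```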
External Department Notification
After full data migration, the legacy CashRepayInfo table will be phased out. Other departments (risk control, big data, etc.) must be informed to switch their reports to the new sharded tables.
DBA Synchronization Process for New Business Data
The synchronization is split into three phases:
Developers sync the earliest missing data.
DBA syncs the remaining bulk data up to a deterministic id (using operatorDate - 5 days as the cutoff).
Developers sync the newest data generated during the DBA window (e.g., after 23:00).
After all three phases, a three‑write switch is enabled to complete the migration.
Data Three‑Write Synchronization
During the transition period, all three tables ( CashRepayInfo , CashRepayApply , CashRepayApplySplit ) must stay consistent. Writes occur first to CashRepayApply , then to CashRepayInfo , and finally to the sharded tables.
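The write order above can be sketched with stand-in DAO interfaces (the `Dao` type here is illustrative, not the project's real DAO layer):

```java
import java.util.ArrayList;
import java.util.List;

public class TripleWrite {
    // Single-method stand-in for the real DAO classes.
    interface Dao { void insert(String row); }

    // Writes one row to all three tables in the order used during
    // the transition window, returning the order for verification.
    static List<String> writeAll(String row, Dao applyDao, Dao infoDao, Dao splitDao) {
        List<String> order = new ArrayList<>();
        applyDao.insert(row); order.add("CashRepayApply");      // 1. primary table
        infoDao.insert(row);  order.add("CashRepayInfo");       // 2. legacy table
        splitDao.insert(row); order.add("CashRepayApplySplit"); // 3. sharded tables
        return order;
    }
}
```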
Transactional Configuration
@Transactional(propagation = Propagation.REQUIRES_NEW, transactionManager = "transactionManagerSplit")
Real‑time, code‑based synchronization is used instead of middleware such as MySQL binlog because DBA resources are limited.
Dynamic Switch for Synchronization
A runtime switch controls whether new business data is written to the sharded tables, allowing a staged rollout.
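A minimal sketch of such a switch, assuming the flag lives in a shared store; here an AtomicBoolean stands in for the Redis key the article implies:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SplitWriteSwitch {
    // In production this flag would be read from Redis or a config
    // center so it can be flipped without a redeploy.
    private final AtomicBoolean enabled = new AtomicBoolean(false);

    public void turnOn()  { enabled.set(true); }
    public void turnOff() { enabled.set(false); }

    // Checked on every business write: mirror the row into the
    // sharded tables only once the rollout switch is on.
    public boolean shouldWriteToSplitTables() {
        return enabled.get();
    }
}
```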
Scheduled Task Scanning
A daily scheduled task compares data between the original and sharded tables, raising alerts if discrepancies are found.
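The core of such a comparison job can be sketched as follows; the maps stand in for the two query result sets keyed by repayNo, and the method name is mine:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ConsistencyScan {
    // Compares rows from the original table against the sharded tables,
    // both keyed by repayNo; returns the keys that are missing or differ
    // in the sharded tables, which the daily job would then alert on.
    static Set<String> findDiscrepancies(Map<String, String> original, Map<String, String> sharded) {
        Set<String> bad = new HashSet<>();
        original.forEach((repayNo, row) -> {
            if (!row.equals(sharded.get(repayNo))) {
                bad.add(repayNo); // absent or inconsistent in the sharded tables
            }
        });
        return bad;
    }
}
```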
Demo & Implementation Details
The demo uses Spring Boot 3.2.4, JDK 19, MySQL 8, ShardingSphere 5.4.1, and MyBatis‑Plus 3.5.5.
Maven Dependencies
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.19</version>
</dependency>
<dependency>
<groupId>org.apache.shardingsphere</groupId>
<artifactId>shardingsphere-jdbc-core</artifactId>
<version>5.4.1</version>
</dependency>
<!-- MyBatis‑Plus 3 -->
<dependency>
<groupId>com.baomidou</groupId>
<artifactId>mybatis-plus-spring-boot3-starter</artifactId>
<version>3.5.5</version>
</dependency>
<dependency>
<groupId>com.sun.xml.bind</groupId>
<artifactId>jaxb-impl</artifactId>
<version>2.3.3</version>
</dependency>application.yml
spring:
application:
name: sharding-jdbc-demo
datasource:
driver-class-name: org.apache.shardingsphere.driver.ShardingSphereDriver
url: jdbc:shardingsphere:classpath:sharding-config.yaml # sharding config file
profiles:
active: default
mybatis-plus:
configuration:
    log-impl: org.apache.ibatis.logging.stdout.StdOutImpl
sharding-config.yaml
dataSources:
ds0:
dataSourceClassName: com.zaxxer.hikari.HikariDataSource
driverClassName: com.mysql.cj.jdbc.Driver
jdbcUrl: jdbc:mysql://localhost:3306/tcbiz_ins?characterEncoding=utf8&useSSL=false&serverTimezone=Asia/Shanghai&allowPublicKeyRetrieval=true
username: root
password: your_password
rules:
- !SHARDING
tables:
cash_repay_apply:
actualDataNodes: ds0.cash_repay_apply_0$->{0..9},ds0.cash_repay_apply_$->{10..49}
tableStrategy:
standard:
shardingColumn: member_id
shardingAlgorithmName: memberId-suffix-mode-50
shardingAlgorithms:
memberId-suffix-mode-50:
type: INLINE
props:
        algorithm-expression: cash_repay_apply_$->{(Long.parseLong(member_id) % 50).toString().padLeft(2, '0')}
props:
  sql-show: true
Mapper Example
@Mapper
public interface CashRepayApplyMapper extends BaseMapper<CashRepayApply> {
    @Select("select * from cash_repay_apply where member_id = #{memberId}")
    List<CashRepayApply> testSharding(@Param("memberId") String memberId);
}
Calling this method lets ShardingSphere route the query to the correct table automatically.
Pitfalls Encountered
Spring Boot 3.2.4 requires the jaxb-impl dependency.
The org.yaml.snakeyaml.representer.Representer class lacks a no‑arg constructor; a custom wrapper class was added to fix the NoSuchMethodError .
Groovy inline sharding expressions originally used member_id % 50 , which caused a MissingMethodException . Converting the value to Long resolved the issue.
Data Synchronization Code (Developer Side)
/**
 * Synchronize data example code
 */
public void dataHandle(Long startId, Long endId) throws AppBizException {
    log.info("CashRepayApplyServiceImpl#dataHandle start startId-{},endId-{}", startId, endId);
    if (endId <= 0L) {
        endId = null; // no upper bound: sync to the end of the table
    } else if (startId > endId) {
        throw new AppBizException(ErrorCodeEnum.PARAM_CHECK_ERROR.getCode(), "Start ID cannot be greater than end ID");
    }
    QueryCashRepayInfoCondition condition = new QueryCashRepayInfoCondition();
    condition.setGteId(startId);
    condition.setLteId(endId);
    condition.setOrders("id+"); // ascending order
    List<CashRepayInfo> infoList = cashRepayInfoDao.query(condition, 0, -1); // max 500 rows per batch
    long lastId;
    while (CollectionUtil.isNotEmpty(infoList)) {
        lastId = infoList.get(infoList.size() - 1).getId() + 1;
        infoList.forEach(history -> {
            try {
                if (StringUtil.isBlank(history.getMemberId()) || StringUtil.isBlank(history.getRepayNo())) {
                    log.error("CashRepayApplyServiceImpl#dataHandle error memberId or repayNo is null id-{}", history.getId());
                    return;
                }
                QueryCashRepayApplySplitCond splitCond = new QueryCashRepayApplySplitCond();
                splitCond.setMemberId(history.getMemberId());
                splitCond.setRepayApplyNo(history.getRepayNo());
                CashRepayApplySplit exist = cashRepayApplySplitDao.getUnique(splitCond);
                CashRepayApplySplit splitData = buildCashRepayApplySplit(history);
                if (exist == null) {
                    cashRepayApplySplitDao.add(splitData);
                } else {
                    splitData.setId(exist.getId());
                    cashRepayApplySplitDao.update(splitData);
                }
            } catch (Exception e) {
                log.error("CashRepayApplyServiceImpl#dataHandle error id-{},repayNo-{}", history.getId(), history.getRepayNo(), e);
                throw new RuntimeException(e);
            }
        });
        log.info("dataHandle finish startId-{},endId-{}", condition.getGteId(), endId);
        String redisCache = RedisCacheUtils.getRedisCache(CashApplyRedisKeyConsts.TERMINATE_SPLIT_SYNC_DATA);
        if (StringUtil.isNotEmpty(redisCache)) {
            log.info("CashRepayApplyServiceImpl#dataHandle need terminate loop startId-{}", condition.getGteId());
            break;
        }
        condition.setGteId(lastId);
        infoList = cashRepayInfoDao.query(condition, 0, -1);
    }
}
private CashRepayApplySplit buildCashRepayApplySplit(CashRepayInfo history) {
    CashRepayApplySplit split = new CashRepayApplySplit();
    CashRepayApply apply = cashRepayApplyDao.get(history.getRepayNo());
    if (apply != null) {
        BeanUtils.copyProperties(apply, split, "id");
        return split;
    }
    // ... populate fields from CashRepayInfo ...
    return split;
}
Conclusion
The author reflects that sharding is not as daunting as interview questions suggest; it mainly involves reading documentation, configuring the framework, and invoking the appropriate APIs. The real challenge lies in designing a migration plan that coordinates multiple teams, preserves existing functionality, and ensures data consistency.
With this experience, the team now has practical knowledge of handling billion‑level data through table sharding.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.