Design and Implementation of Table Sharding for Cash Repayment Applications Using ShardingSphere and Spring Boot
This article details the end‑to‑end design, configuration, and code implementation of a 50‑table sharding solution for cash repayment and loan application data, covering database schema planning, historical data migration, backend query adaptation, dynamic switches, scheduled consistency checks, and practical pitfalls encountered with Spring Boot 3 and ShardingSphere.
Introduction
After a year of writing business code, I received a technical request from leadership to split the loan and repayment application tables to improve query efficiency and reduce database pressure. The loan table contains tens of millions of rows, while the repayment table exceeds two hundred million, making sharding a viable solution.
Design Plan
Before coding, we need to align on several design aspects with leadership.
Basic Information of Sharded Database
Use a new instance with an independent data source.
No need for separate databases, only tables; 50 tables are sufficient.
Database name: tcbiz_ins_cash_mas_split (named by leadership).
Total number of tables: 50 .
Table naming pattern: CashRepayApplySplit0${0..9} , CashRepayApplySplit${10..49} (camel‑case naming).
Sharding column: memberId , sharding strategy: memberId % 50 , with remainders below 10 left‑padded with 0 so the result always matches the two‑digit table suffix.
Table structure identical to the original, including fields and indexes. Notably, the original uptTime column lacks an index, which must be added in the sharded tables.
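The routing rule above can be sketched in plain Java. This is illustrative only — in production the routing is performed by ShardingSphere's inline algorithm in the configuration shown later, not by hand‑written code:

```java
// Compute the two-digit table suffix for a memberId under the
// "memberId % 50, zero-padded" strategy, e.g. suffix "07" -> CashRepayApplySplit07.
class ShardSuffix {
    static String of(String memberId) {
        long remainder = Long.parseLong(memberId) % 50;
        // Pad remainders below 10 with a leading zero: 7 -> "07"
        return String.format("%02d", remainder);
    }
}
```

For example, memberId 12345 yields remainder 45 and routes to CashRepayApplySplit45.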
Historical Data Synchronization
After creating fifty tables, both new business data and existing data must be written to the appropriate sharded tables according to the sharding strategy.
Historically, repayment data was migrated from CashRepayInfo to CashRepayApply . The current double‑write flow writes to CashRepayApply first, then synchronizes to CashRepayInfo to keep a complete historical copy.
We need to:
Synchronize the missing early CashRepayInfo data to the new sharded tables CashRepayApplySplit .
Synchronize the remaining CashRepayApply data to CashRepayApplySplit .
FAQ
Why not sync directly from CashRepayInfo to the sharded tables? The table holds over two hundred million rows, so a full code‑driven sync is impractical and the bulk copy must be done by the DBA; but schema differences between CashRepayInfo and CashRepayApply mean the DBA cannot migrate it directly either, hence the two‑step plan above.
Specific Details
To locate the starting id for the migration, we find where the vertical black dashed line (from the timeline diagram in the original post, not reproduced here) intersects CashRepayInfo.id . This id marks the earliest row that must be copied to the sharded tables.
Using this id , we can locate the corresponding primary key in CashRepayApply and let the DBA synchronize data up to that point.
Backend Query Refactoring
The current admin system queries the single table directly. After sharding, such queries are no longer possible across all the data, so we keep the most recent two‑to‑three years in CashRepayApply and continue to query it by memberId . Older data (more than three years) must be accessed via the sharded tables, which requires a new query page where the memberId parameter is mandatory.
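The memberId requirement exists because a query without the sharding key would fan out to all 50 tables. The mandatory‑parameter rule can be sketched as a small guard; this helper is an illustration, not part of the original code:

```java
// Queries against the sharded tables must carry the sharding key memberId;
// without it, ShardingSphere would broadcast the query to all 50 tables.
class SplitQueryGuard {
    static void requireMemberId(String memberId) {
        if (memberId == null || memberId.isBlank()) {
            throw new IllegalArgumentException("memberId is required when querying sharded tables");
        }
    }
}
```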
External Department Notification
Once all data is fully synchronized to the sharded tables, the legacy CashRepayInfo will be gradually decommissioned. Other departments (risk control, big data, etc.) must be informed to switch their reports to query CashRepayApplySplit .
DBA‑Driven Synchronization of New Business Data
We first let developers sync a portion of data, then provide the DBA with a start id . The DBA syncs up to a deterministic end id based on uptTime < operationDate - 5 days . Any data generated during the DBA operation is later synced by developers at night (e.g., 23:00) after the system is closed for repayments.
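The DBA's end boundary can be sketched as a small date helper, assuming the five‑day window is measured from the start of the operation day (the article only states the rule as uptTime < operationDate - 5 days ; the exact timestamp semantics are an assumption):

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Rows with uptTime strictly before this cutoff are stable enough for the
// DBA to bulk-copy; anything newer is left to the developers' night sync.
class DbaCutoff {
    static LocalDateTime cutoff(LocalDate operationDate) {
        return operationDate.minusDays(5).atStartOfDay();
    }
}
```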
Three‑Write Synchronization
During the transition period, we maintain three copies: write to CashRepayApply , then to CashRepayInfo , and finally to CashRepayApplySplit . Because the sharded database uses a different data source, a specific transaction manager must be declared:
@Transactional(propagation = Propagation.REQUIRES_NEW, transactionManager = "transactionManagerSplit")

We avoid using MySQL binlog listeners for real‑time sync due to DBA constraints.
Dynamic Switch for Data Sync
A dynamic switch controls whether new business data is written to the sharded tables. The migration proceeds in three phases: (1) developers sync historical data, (2) DBA syncs a bounded range, (3) developers sync the remaining new data before finally turning on the switch for seamless operation.
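The switch can be sketched as a guard around the split‑table write. The value source (a config center, a Redis key, etc.) is an assumption and is abstracted behind a Supplier here so the guard logic stands alone:

```java
import java.util.function.Supplier;

// Guards the write to CashRepayApplySplit behind a runtime-toggleable switch,
// so each migration phase can be enabled without a redeploy.
class SplitWriteSwitch {
    private final Supplier<Boolean> switchOn;

    SplitWriteSwitch(Supplier<Boolean> switchOn) {
        this.switchOn = switchOn;
    }

    // Runs the split write only when the switch is on; returns true if it ran.
    boolean writeIfEnabled(Runnable splitWrite) {
        if (Boolean.TRUE.equals(switchOn.get())) {
            splitWrite.run();
            return true;
        }
        return false;
    }
}
```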
Scheduled Task Scanning
Since double‑write operations span different data sources, occasional failures may cause mismatches. A daily scheduled task compares today's data between CashRepayApply and CashRepayApplySplit , raising alerts if discrepancies are found.
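The core of the daily check is a set difference over business keys. A sketch, with the DAO queries assumed and replaced by plain Sets of repayNo values:

```java
import java.util.HashSet;
import java.util.Set;

// Compares today's repayNos from CashRepayApply against CashRepayApplySplit;
// a non-empty result means the double-write diverged and should raise an alert.
class ConsistencyCheck {
    static Set<String> missingInSplit(Set<String> sourceRepayNos, Set<String> splitRepayNos) {
        Set<String> missing = new HashSet<>(sourceRepayNos);
        missing.removeAll(splitRepayNos);
        return missing;
    }
}
```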
Demo Journey
Our project used an older ShardingSphere version, so we upgraded to a release compatible with Spring Boot 3. The official Spring Boot starter was dropped starting with 5.3.0, so we integrate ShardingSphere through its JDBC driver instead.
Component Versions
Spring Boot 3.2.4 + JDK 19 + MySQL 8 + ShardingSphere 5.4.1 + MyBatis‑Plus 3.5.5
Maven Dependencies
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.19</version>
</dependency>
<dependency>
<groupId>org.apache.shardingsphere</groupId>
<artifactId>shardingsphere-jdbc-core</artifactId>
<version>5.4.1</version>
</dependency>
<!--mybatisplus3-->
<dependency>
<groupId>com.baomidou</groupId>
<artifactId>mybatis-plus-spring-boot3-starter</artifactId>
<version>3.5.5</version>
</dependency>
<dependency>
<groupId>com.sun.xml.bind</groupId>
<artifactId>jaxb-impl</artifactId>
<version>2.3.3</version>
</dependency>application.yml
spring:
  application:
    name: sharding-jdbc-demo
  datasource:
    driver-class-name: org.apache.shardingsphere.driver.ShardingSphereDriver
    url: jdbc:shardingsphere:classpath:sharding-config.yaml # sharding config file
  profiles:
    active: default

mybatis-plus:
  configuration:
    log-impl: org.apache.ibatis.logging.stdout.StdOutImpl

sharding-config.yaml
dataSources:
  ds0:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    driverClassName: com.mysql.cj.jdbc.Driver
    jdbcUrl: jdbc:mysql://localhost:3306/tcbiz_ins?characterEncoding=utf8&useAffectedRows=true&useSSL=false&serverTimezone=Asia/Shanghai&allowPublicKeyRetrieval=true
    username: root
    password: your_db_password
rules:
- !SHARDING
  tables:
    cash_repay_apply:
      actualDataNodes: ds0.cash_repay_apply_0${0..9},ds0.cash_repay_apply_${10..49}
      tableStrategy:
        standard:
          shardingColumn: member_id
          shardingAlgorithmName: memberId-suffix-mode-50
  shardingAlgorithms:
    memberId-suffix-mode-50:
      type: INLINE
      props:
        algorithm-expression: cash_repay_apply_${(Long.parseLong(member_id) % 50).toString().padLeft(2, '0')}
props:
  sql-show: true

After configuration, we can access the database via a mapper:
@Mapper
public interface CashRepayApplyMapper extends BaseMapper<CashRepayApply> {

    @Select("select * from cash_repay_apply where member_id = #{memberId}")
    List<CashRepayApply> testSharding(@Param("memberId") String memberId);
}

Running with Spring Boot 3.2.4 initially caused a NoSuchMethodError in org.yaml.snakeyaml.representer.Representer : Spring Boot 3.2 manages SnakeYAML 2.x, which removed the no‑arg Representer constructor that ShardingSphere still calls. The fix was to add a custom class with a no‑arg constructor that shadows the missing one.
Another issue was that the inline sharding expression did not match the column type: member_id is a string column, so it must be parsed with Long.parseLong(member_id) before applying % 50 . Changing the expression accordingly resolved the error.
Demo Source Code Download
The complete demo is available on GitHub: https://github.com/yanzhisishui/sharding-jdbc-demo
Developer Data Synchronization Code
For the portion of data that developers must sync themselves, we use a single‑threaded loop that migrates data in batches of 500 rows:
/**
 * Synchronization example code
 */
public void dataHandle(Long startId, Long endId) throws AppBizException {
    log.info("CashRepayApplyServiceImpl#dataHandle start startId-{},endId-{}", startId, endId);
    if (endId == null || endId <= 0L) {
        endId = null; // no upper bound: sync to the end of the table
    } else if (startId > endId) {
        throw new AppBizException(ErrorCodeEnum.PARAM_CHECK_ERROR.getCode(), "Start ID cannot be greater than end ID");
    }
    QueryCashRepayInfoCondition condition = new QueryCashRepayInfoCondition();
    condition.setGteId(startId);
    condition.setLteId(endId);
    condition.setOrders("id+"); // ascending order
    List<CashRepayInfo> infoList = cashRepayInfoDao.query(condition, 0, -1); // max 500 rows per batch
    long lastId;
    while (CollectionUtil.isNotEmpty(infoList)) {
        lastId = infoList.get(infoList.size() - 1).getId() + 1;
        infoList.forEach(history -> {
            try {
                if (StringUtil.isBlank(history.getMemberId()) || StringUtil.isBlank(history.getRepayNo())) {
                    log.error("CashRepayApplyServiceImpl#dataHandle error memberId or repayNo is null id-{}", history.getId());
                    return; // skip rows missing the sharding key or business key
                }
                QueryCashRepayApplySplitCond splitCond = new QueryCashRepayApplySplitCond();
                splitCond.setMemberId(history.getMemberId());
                splitCond.setRepayApplyNo(history.getRepayNo());
                CashRepayApplySplit exist = cashRepayApplySplitDao.getUnique(splitCond);
                CashRepayApplySplit splitData = buildCashRepayApplySplit(history);
                if (exist == null) {
                    cashRepayApplySplitDao.add(splitData);
                } else {
                    // idempotent re-run: update instead of inserting a duplicate
                    splitData.setId(exist.getId());
                    cashRepayApplySplitDao.update(splitData);
                }
            } catch (Exception e) {
                log.error("CashRepayApplyServiceImpl#dataHandle error id-{},repayNo-{}", history.getId(), history.getRepayNo());
                throw new RuntimeException(e);
            }
        });
        log.info("dataHandle finish startId-{},endId-{}", condition.getGteId(), endId);
        String redisCache = RedisCacheUtils.getRedisCache(CashApplyRedisKeyConsts.TERMINATE_SPLIT_SYNC_DATA);
        if (StringUtil.isNotEmpty(redisCache)) {
            log.info("CashRepayApplyServiceImpl#dataHandle need terminate loop startId-{}", condition.getGteId());
            break; // emergency stop flag set in Redis
        }
        condition.setGteId(lastId);
        infoList = cashRepayInfoDao.query(condition, 0, -1);
    }
}

Helper method to build the sharded entity:
/**
 * Convert CashRepayInfo to CashRepayApplySplit entity
 */
private CashRepayApplySplit buildCashRepayApplySplit(CashRepayInfo history) {
    CashRepayApplySplit split = new CashRepayApplySplit();
    CashRepayApply apply = cashRepayApplyDao.get(history.getRepayNo());
    if (apply != null) {
        BeanUtils.copyProperties(apply, split, "id");
        return split;
    }
    // ... omitted: populate fields from CashRepayInfo
    return split;
}

Conclusion
Before joining the company, I often faced interview questions about database sharding without practical experience. After implementing this solution, I realized that sharding is not as daunting as it seems—mostly configuration and API calls. The real challenge lies in designing a migration plan that coordinates multiple teams, preserves existing functionality, and ensures data integrity at the company‑wide level.
Even though our sharding scenario is relatively simple, we now have hands‑on experience partitioning data at the hundred‑million‑row scale.
References
[1] Is ShardingSphere 4.1.1 version compatible with Spring Boot 3.0.5 version? https://github.com/apache/shardingsphere/issues/27597
[2] Demo download: https://github.com/yanzhisishui/sharding-jdbc-demo
[3] Record of sending tens of millions of messages in a scheduled task: https://juejin.cn/post/7125399759589736479