
Mastering Database Sharding in Spring Boot: A Complete Guide with ShardingSphere

This comprehensive tutorial explains database sharding concepts, types, strategies, and implementation in Spring Boot using ShardingSphere, covering configuration, entity and repository code, service and controller layers, integration with pagination, Swagger, ActiveMQ, security, batch processing, FreeMarker, WebSockets, AOP, performance testing, FAQs, real‑world cases, and future trends.

Architect

1. Basics and Core Concepts

1.1 What Is Database Sharding?

Database sharding is an architecture technique that distributes data across multiple databases or tables so that a system can handle high concurrency and large data volumes, improving performance and scalability.

1.2 Types of Sharding

Vertical Sharding

Split databases by business modules (e.g., user database, order database).

Advantages: Clear business boundaries, easy maintenance.

Disadvantages: Cross‑database transactions are complex.

Vertical Partitioning

Split a single table into multiple physical tables (e.g., user_info table, user_extension table).

Advantages: Reduces single table size, optimizes queries.

Disadvantages: Increases development complexity.

Horizontal Sharding

Distribute data across multiple databases based on a sharding key (e.g., user ID).

Advantages: Supports high concurrency and large data volumes.

Disadvantages: Sharding algorithm design is complex.

Horizontal Partitioning

Distribute a single table's data across multiple tables based on a sharding key.

Advantages: Optimizes performance within a single database.

Disadvantages: Table structure duplication, higher maintenance cost.

1.3 Sharding Strategies

Range Sharding: Split by key range (e.g., ID 0‑1000 → table1, 1001‑2000 → table2).

Hash Sharding: Modulo operation on the sharding key (e.g., user_id % 2).

Consistent Hashing: Reduces data migration, suitable for dynamic scaling.

Time Sharding: Split by time period (e.g., monthly tables).

Geographic Sharding: Split by region (e.g., by city).
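The first three strategies can be sketched as plain functions that map a sharding key to a physical table name. This is an illustration only: the class ShardingStrategies, its method names, and the order_ table prefix are hypothetical, and the range boundaries follow the example above (IDs 0-1000 in table1, 1001-2000 in table2).

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: each method maps a sharding key to a physical table name.
final class ShardingStrategies {
    private static final DateTimeFormatter MONTH = DateTimeFormatter.ofPattern("yyyyMM");

    // Range sharding: IDs 0-1000 -> table1, 1001-2000 -> table2, and so on.
    static String byRange(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        long index = id <= 1000 ? 1 : (id - 1) / 1000 + 1;
        return "table" + index;
    }

    // Hash sharding: modulo on the key; floorMod keeps negative keys in range.
    static String byHash(long userId, int shardCount) {
        return "user_" + Math.floorMod(userId, (long) shardCount);
    }

    // Time sharding: one table per calendar month (table prefix is an assumption).
    static String byMonth(LocalDate date) {
        return "order_" + date.format(MONTH);
    }
}
```

Range and time sharding keep neighbouring keys together, which helps range scans but can concentrate writes on the newest shard; hash sharding spreads writes evenly at the cost of scattering range queries.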

1.4 Implementation Methods

Manual Implementation

Custom sharding logic controlled by code.

Advantages: Flexible, low cost.

Disadvantages: Development and maintenance are complex.

Middleware

Use sharding middleware such as ShardingSphere, MyCat, etc.

Advantages: Powerful features, transparent sharding.

Disadvantages: Learning curve and deployment cost.

Cloud Services

Use cloud databases (e.g., AWS Aurora, Alibaba Cloud PolarDB).

Advantages: Ready‑to‑use, automatic scaling.

Disadvantages: Higher cost, vendor lock‑in.

1.5 Advantages and Challenges

Advantages

Performance Improvement: Distribute data to reduce single‑point pressure.

High Scalability: Dynamically add databases or tables.

High Availability: Fault isolation; partial failures do not affect the whole system.

Challenges

Sharding Algorithm Design: Must balance data distribution and query efficiency.

Cross‑Database Transactions: Distributed transactions are complex (e.g., XA or Saga).

Data Migration: Adding new shards requires re‑sharding.

Query Complexity: Cross‑database/table queries need aggregation.

Integration Complexity: Must coordinate with Spring Boot features such as Security, WebSockets, etc.

2. Implementing Sharding in Spring Boot

2.1 Environment Setup

Add the following dependencies to your pom.xml:

<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.2.0</version>
  </parent>
  <groupId>com.example</groupId>
  <artifactId>sharding-demo</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <!-- Current MySQL driver coordinates; the version is managed by the Spring Boot parent -->
    <dependency>
      <groupId>com.mysql</groupId>
      <artifactId>mysql-connector-j</artifactId>
      <scope>runtime</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.shardingsphere</groupId>
      <artifactId>shardingsphere-jdbc-core</artifactId>
      <version>5.4.0</version>
    </dependency>
    <!-- Additional dependencies for ActiveMQ, Swagger, Security, Batch, FreeMarker, WebSocket, AOP, etc. -->
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-activemq</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springdoc</groupId>
      <artifactId>springdoc-openapi-starter-webmvc-ui</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-security</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-freemarker</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-websocket</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>
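    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-batch</artifactId>
    </dependency>
    <!-- Needed by the /actuator endpoints, DevTools restart, and the test class used later in this guide -->
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-devtools</artifactId>
      <scope>runtime</scope>
      <optional>true</optional>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-test</artifactId>
      <scope>test</scope>
    </dependency>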
  </dependencies>
</project>

Database Creation

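Create two MySQL databases, user_db_0 and user_db_1 (the names used in the JDBC URLs below), and run the following statements in each of them: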

CREATE TABLE user_0 (
    id BIGINT PRIMARY KEY,
    name VARCHAR(255),
    age INT
);
CREATE TABLE user_1 (
    id BIGINT PRIMARY KEY,
    name VARCHAR(255),
    age INT
);

ShardingSphere Configuration (sharding.yaml)

ShardingSphere 5.3.0 removed the Spring Boot starter, so shardingsphere-jdbc-core 5.4.0 does not read spring.shardingsphere.* properties. Put the data sources and sharding rules in src/main/resources/sharding.yaml and load that file through the ShardingSphere JDBC driver:

dataSources:
  db0:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    driverClassName: com.mysql.cj.jdbc.Driver
    jdbcUrl: jdbc:mysql://localhost:3306/user_db_0?useSSL=false&serverTimezone=UTC
    username: root
    password: root
  db1:
    dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    driverClassName: com.mysql.cj.jdbc.Driver
    jdbcUrl: jdbc:mysql://localhost:3306/user_db_1?useSSL=false&serverTimezone=UTC
    username: root
    password: root
rules:
- !SHARDING
  tables:
    user:
      actualDataNodes: db${0..1}.user_${0..1}
      tableStrategy:
        standard:
          shardingColumn: id
          shardingAlgorithmName: user-table-algo
      databaseStrategy:
        standard:
          shardingColumn: id
          shardingAlgorithmName: user-db-algo
  shardingAlgorithms:
    user-table-algo:
      type: INLINE
      props:
        algorithm-expression: user_${id % 2}
    user-db-algo:
      type: INLINE
      props:
        algorithm-expression: db${id % 2}
props:
  sql-show: true

application.yml Configuration

spring:
  profiles:
    active: dev
  datasource:
    driver-class-name: org.apache.shardingsphere.driver.ShardingSphereDriver
    url: jdbc:shardingsphere:classpath:sharding.yaml
  jpa:
    hibernate:
      ddl-auto: none
    show-sql: true
  freemarker:
    template-loader-path: classpath:/templates/
    suffix: .ftl
    cache: false
  activemq:
    broker-url: tcp://localhost:61616
    user: admin
    password: admin
  batch:
    job:
      enabled: false
    jdbc:
      initialize-schema: always
  devtools:
    restart:
      enabled: true
# The keys below are top-level properties, not children of "spring"
server:
  port: 8081
  compression:
    enabled: true
    mime-types: text/html,text/css,application/javascript
management:
  endpoints:
    web:
      exposure:
        include: health,metrics
springdoc:
  api-docs:
    path: /api-docs
  swagger-ui:
    path: /swagger-ui.html
logging:
  level:
    root: INFO
    com.example.demo: DEBUG

2.2 Entity, Repository, Service, and Controller

Entity (User.java)

package com.example.demo.entity;

import jakarta.persistence.Entity;
import jakarta.persistence.Id;

@Entity
public class User {
    @Id
    private Long id;
    private String name;
    private int age;
    // Getters and Setters
    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}

Repository (UserRepository.java)

package com.example.demo.repository;

import com.example.demo.entity.User;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.JpaRepository;

public interface UserRepository extends JpaRepository<User, Long> {
    Page<User> findByNameContaining(String name, Pageable pageable);
}

Service (UserService.java)

package com.example.demo.service;

import com.example.demo.entity.User;
import com.example.demo.exception.BusinessException;
import com.example.demo.repository.UserRepository;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Sort;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.stereotype.Service;

@Service
public class UserService {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();
    @Autowired
    private UserRepository userRepository;
    @Autowired
    private JmsTemplate jmsTemplate;

    public User saveUser(User user) {
        try {
            CONTEXT.set("Save-" + Thread.currentThread().getName());
            User saved = userRepository.save(user);
            jmsTemplate.convertAndSend("user-save-log", "Saved user: " + user.getId());
            return saved;
        } finally {
            CONTEXT.remove();
        }
    }

    public Page<User> searchUsers(String name, int page, int size, String sortBy, String direction) {
        try {
            CONTEXT.set("Query-" + Thread.currentThread().getName());
            if (page < 0) {
                throw new BusinessException("INVALID_PAGE", "Page number must not be negative");
            }
            Sort sort = Sort.by(Sort.Direction.fromString(direction), sortBy);
            PageRequest pageable = PageRequest.of(page, size, sort);
            Page<User> result = userRepository.findByNameContaining(name, pageable);
            jmsTemplate.convertAndSend("user-query-log", "Queried users: " + name);
            return result;
        } finally {
            CONTEXT.remove();
        }
    }
}

Controller (UserController.java)

package com.example.demo.controller;

import com.example.demo.entity.User;
import com.example.demo.service.UserService;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.web.bind.annotation.*;

@RestController
@Tag(name = "User Management", description = "APIs related to user operations")
public class UserController {
    @Autowired
    private UserService userService;

    @Operation(summary = "Save a user")
    @PostMapping("/users")
    public User saveUser(@RequestBody User user) {
        return userService.saveUser(user);
    }

    @Operation(summary = "Paginated user query")
    @GetMapping("/users")
    public Page<User> searchUsers(
            @RequestParam(defaultValue = "") String name,
            @RequestParam(defaultValue = "0") int page,
            @RequestParam(defaultValue = "10") int size,
            @RequestParam(defaultValue = "id") String sortBy,
            @RequestParam(defaultValue = "asc") String direction) {
        return userService.searchUsers(name, page, size, sortBy, direction);
    }
}

AOP Logging Aspect (LoggingAspect.java)

package com.example.demo.aspect;

import org.aspectj.lang.annotation.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class LoggingAspect {
    private static final Logger logger = LoggerFactory.getLogger(LoggingAspect.class);

    @Pointcut("execution(* com.example.demo.service..*.*(..))")
    public void serviceMethods() {}

    @Before("serviceMethods()")
    public void logMethodEntry() {
        logger.info("Entering service method");
    }

    @AfterReturning(pointcut = "serviceMethods()", returning = "result")
    public void logMethodSuccess(Object result) {
        logger.info("Method executed successfully, result: {}", result);
    }
}

2.3 Running and Verification

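Start the application:

mvn spring-boot:run

Save a user. Because both the database expression and the table expression use id % 2, odd IDs go to db1.user_1 and even IDs go to db0.user_0; db0.user_1 and db1.user_0 receive no rows: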

curl -X POST http://localhost:8081/users -H "Content-Type: application/json" -d '{"id":1,"name":"Alice","age":25}'
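The routing of this request can be traced by hand. The sketch below mirrors the two INLINE expressions from the configuration (db${id % 2} and user_${id % 2}) in plain Java and assumes non-negative IDs. The routeSpread variant is a hypothetical alternative, not part of the original configuration, that takes the table index from the next bit of the ID so that all four database/table combinations receive data.

```java
// Mirrors the two INLINE expressions from the configuration; class and method names are hypothetical.
final class InlineRouting {
    // db${id % 2} and user_${id % 2}: both indexes come from the same bit of the ID.
    static String route(long id) {
        long db = id % 2;
        long table = id % 2;
        return "db" + db + ".user_" + table;
    }

    // Hypothetical alternative: use (id / 2) % 2 for the table so all four nodes are used.
    static String routeSpread(long id) {
        long db = id % 2;
        long table = (id / 2) % 2;
        return "db" + db + ".user_" + table;
    }

    public static void main(String[] args) {
        for (long id = 0; id < 4; id++) {
            System.out.println(id + " -> " + route(id) + " | " + routeSpread(id));
        }
    }
}
```

For the request above (id = 1), route returns db1.user_1, which is the shard to inspect when verifying the insert.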

Query users with pagination (cross‑database aggregation):

curl "http://localhost:8081/users?name=Alice&page=0&size=10&sortBy=id&direction=asc"

Check ActiveMQ queues user-save-log and user-query-log for asynchronous logs.

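Log output example (the second line assumes User overrides toString(); without it, the default Object.toString() form is printed):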

Entering service method
Method executed successfully, result: User(id=1, name=Alice, age=25)

3. Principles and Technical Details

3.1 ShardingSphere Principle

SQL Parsing: Parses SQL and extracts sharding keys.

Routing Engine: Chooses target database/table based on sharding algorithm.

Result Merging: Aggregates results from multiple databases/tables.

In ShardingSphere-JDBC 5.x, these three stages sit behind ShardingSphereDataSource, which wraps the physical data sources and routes each statement to the matching shards.
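Result merging for a sorted, paginated query can be pictured as a k-way merge. The sketch below is a simplified model rather than ShardingSphere's implementation: it assumes every shard has already returned its rows sorted ascending by ID, and the class ShardMerge and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Simplified model of merging sorted per-shard results into one global page.
final class ShardMerge {
    // Returns page `page` (0-based) of `size` IDs, given per-shard lists sorted ascending.
    static List<Long> mergePage(List<List<Long>> shardResults, int page, int size) {
        // Each heap entry is {value, shardIndex, positionInShard}, ordered by value.
        PriorityQueue<long[]> heap =
                new PriorityQueue<>(Comparator.comparingLong((long[] e) -> e[0]));
        for (int s = 0; s < shardResults.size(); s++) {
            if (!shardResults.get(s).isEmpty()) {
                heap.add(new long[] {shardResults.get(s).get(0), s, 0});
            }
        }
        long skip = (long) page * size; // rows belonging to earlier pages
        List<Long> out = new ArrayList<>();
        while (!heap.isEmpty() && out.size() < size) {
            long[] head = heap.poll();
            if (skip > 0) {
                skip--;
            } else {
                out.add(head[0]);
            }
            // Advance within the shard that supplied the row just consumed.
            List<Long> shard = shardResults.get((int) head[1]);
            int next = (int) head[2] + 1;
            if (next < shard.size()) {
                heap.add(new long[] {shard.get(next), head[1], next});
            }
        }
        return out;
    }
}
```

Because the merger cannot know in advance which shard holds the rows of a given page, each shard has to supply its first offset + size rows. That is why deep pagination across shards gets slower as the page number grows.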

3.2 Sharding Algorithms

Hash Sharding: id % 2 – simple but requires migration when scaling.

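Consistent Hashing: Reduces data migration when shards are added. In ShardingSphere it is typically plugged in as a custom algorithm (for example, through the CLASS_BASED type) rather than selected as a built-in one.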
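The migration cost of the two algorithms can be compared with a small experiment. The following is a sketch under stated assumptions: ConsistentHashRing is a hypothetical, minimal ring with virtual nodes (it is not a ShardingSphere class and expects a non-empty node list), and the SplitMix64 finalizer is used only to spread keys over the ring.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Minimal consistent-hash ring with virtual nodes; illustrative only.
final class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentHashRing(List<String> nodes, int virtualNodes) {
        for (String node : nodes) {
            for (int v = 0; v < virtualNodes; v++) {
                ring.put(mix((node + "#" + v).hashCode()), node);
            }
        }
    }

    // Walk clockwise to the first virtual node at or after the key's hash, wrapping around.
    String nodeFor(long key) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(mix(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // SplitMix64 finalizer: a cheap bit mixer, not a cryptographic hash.
    static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    // Counts keys in [0, keys) whose shard index changes when key % from becomes key % to.
    static int movedByModulo(int keys, int from, int to) {
        int moved = 0;
        for (long k = 0; k < keys; k++) {
            if (k % from != k % to) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        ConsistentHashRing two = new ConsistentHashRing(List.of("db0", "db1"), 100);
        ConsistentHashRing three = new ConsistentHashRing(List.of("db0", "db1", "db2"), 100);
        int moved = 0;
        for (long k = 0; k < 6000; k++) {
            if (!two.nodeFor(k).equals(three.nodeFor(k))) {
                moved++;
            }
        }
        System.out.println("modulo 2 -> 3 moved: " + movedByModulo(6000, 2, 3) + " of 6000");
        System.out.println("ring   2 -> 3 moved: " + moved + " of 6000");
    }
}
```

With plain modulo, going from 2 to 3 shards relocates two thirds of the keys. On the ring, a key changes owner only if the new node's virtual nodes now claim it, so every relocated key lands on db2 and the moved share is roughly the new node's share of the ring.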

3.3 Distributed Transactions

XA Transactions: ShardingSphere supports XA for strong consistency.

Flexible Transactions: TCC or Saga for high‑availability scenarios.
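The Saga idea can be sketched without any framework: each step carries a compensating action, and a failure triggers the compensations of the steps that already completed, in reverse order. The Saga class and Step interface below are hypothetical names for illustration; a production setup would also need persistence and retries, which are omitted here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Minimal in-memory Saga coordinator; illustrative only.
final class Saga {
    // A step pairs a forward action with the compensation that undoes it.
    interface Step {
        void execute() throws Exception;
        void compensate();
    }

    // Runs steps in order; on failure, compensates completed steps in reverse order.
    static boolean run(List<Step> steps) {
        Deque<Step> done = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.execute();
                done.push(step);
            } catch (Exception e) {
                while (!done.isEmpty()) {
                    done.pop().compensate();
                }
                return false;
            }
        }
        return true;
    }
}
```

Unlike XA, the intermediate state is visible to other transactions until compensation finishes, which is the trade-off accepted in exchange for not holding locks across databases.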

3.4 Hot Reload Support

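Spring Boot DevTools restarts the application context when classpath resources change, so edits to the sharding configuration and templates are picked up without a manual restart.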

3.5 ThreadLocal Cleanup

Ensure ThreadLocal is cleared after each service method to avoid memory leaks:

try {
    CONTEXT.set("Query-" + Thread.currentThread().getName());
    // business logic
} finally {
    CONTEXT.remove();
}
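Why the finally block matters becomes visible once threads are reused, as they are in a servlet container's worker pool. The sketch below uses a single-thread executor as a stand-in for such a pool; the class ThreadLocalLeakDemo and the value "Query-request-1" are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Shows a ThreadLocal value surviving into the next task on a pooled thread.
final class ThreadLocalLeakDemo {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs one task that sets CONTEXT, then a second task on the same pooled thread,
    // and returns what the second task observes.
    static String observeAfter(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                try {
                    CONTEXT.set("Query-request-1");
                    // business logic would run here
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove();
                    }
                }
            };
            pool.submit(first).get();
            Callable<String> second = CONTEXT::get;
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without remove(): " + observeAfter(false));
        System.out.println("with remove():    " + observeAfter(true));
    }
}
```

Without remove(), the second task sees the first request's value, which is both a memory leak and a potential data leak between requests.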

4. Performance and Applicability Analysis

4.1 Performance Impact

Save User: ~10 ms per request.

Paginated Query (1000 users, cross‑shard): ~50 ms.

WebSocket Push: ~2 ms per message.

Batch Processing (1000 users): ~200 ms.

4.2 Performance Test Example

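import com.example.demo.entity.User;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.web.client.TestRestTemplate;

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
public class ShardingPerformanceTest {
    @Autowired
    private TestRestTemplate restTemplate;

    @Test
    public void testShardingPerformance() {
        // User only has a no-arg constructor, so populate it through setters.
        User user = new User();
        user.setId(1L);
        user.setName("Alice");
        user.setAge(25);
        long startTime = System.currentTimeMillis();
        restTemplate.postForEntity("/users", user, User.class);
        long duration = System.currentTimeMillis() - startTime;
        System.out.println("Save user: " + duration + " ms");
    }
}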
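A single timed request includes JIT and connection warm-up, so it gives only a rough figure. A slightly more robust approach is to discard warm-up calls and report percentiles over many runs. The helper below is a sketch with hypothetical names (LatencyProbe, measure, percentile); the Runnable would wrap the restTemplate call from the test above.

```java
import java.util.Arrays;

// Small timing helper: warm-up runs are discarded, timed runs are sorted for percentiles.
final class LatencyProbe {
    // Runs `warmup` untimed calls, then `runs` timed calls; returns latencies in ns, ascending.
    static long[] measure(Runnable call, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            call.run();
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            call.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    // Nearest-rank percentile over an ascending array, with p in (0, 100].
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }
}
```

Reporting the median and the 95th percentile side by side shows whether the cross-shard cases have a long tail that an average would hide.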

4.3 Comparison Table

| Method | Configuration Complexity | Performance | Applicable Scenarios |
| --- | --- | --- | --- |
| Manual Sharding | High | Medium | Small applications |
| ShardingSphere | Medium | High | High concurrency, large data |
| Cloud Database | Low | High | Cloud-native applications |

5. Frequently Asked Questions

Problem 1: Data Skew

Scenario: Certain tables become too large.

Solution: Use consistent hashing; regularly monitor data distribution.
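Monitoring data distribution can start with a single number per check. The sketch below assumes the row counts have already been collected, for example with a COUNT(*) per physical table; the class SkewCheck and the shard names in the usage note are hypothetical.

```java
import java.util.Map;

// Computes how far the largest shard deviates from an even distribution.
final class SkewCheck {
    // Ratio of the largest shard's row count to the mean row count; 1.0 means perfectly even.
    static double skewRatio(Map<String, Long> rowsPerShard) {
        long total = 0;
        long max = 0;
        for (long rows : rowsPerShard.values()) {
            total += rows;
            max = Math.max(max, rows);
        }
        if (total == 0) {
            return 1.0; // no data yet, nothing to flag
        }
        double mean = (double) total / rowsPerShard.size();
        return max / mean;
    }
}
```

For example, 900 rows in db0.user_0 and 100 rows in db1.user_1 give a ratio of 1.8. An alert threshold for this ratio is a judgment call that depends on the workload.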

Problem 2: Slow Cross‑Database Queries

Scenario: Pagination across shards is slow.

Solution: Optimize sharding key; use caching (e.g., Redis).

Problem 3: ThreadLocal Leak

Scenario: Thread dump shows lingering ThreadLocal values.

Solution: Ensure CONTEXT.remove() in finally blocks (see Service code).

Problem 4: Distributed Transaction Failure

Scenario: Cross‑database save fails.

Solution: Configure XA transactions or adopt Saga pattern.

6. Real‑World Cases

Case 1: User Management

Scenario: Millions of users require high‑concurrency queries.

Solution: ShardingSphere sharding with AOP performance logging.

Result: Query performance improved by ~70%.

Lesson: Choosing the right sharding key is critical.

Case 2: Batch Processing

Scenario: Bulk import of user data.

Solution: Spring Batch integrated with ShardingSphere.

Result: Processing time reduced by 50%.

Lesson: Optimize sharding for batch writes.

Case 3: Real‑Time Push

Scenario: Real‑time user data updates.

Solution: WebSocket pushes sharded data.

Result: Latency lowered to 2 ms.

Lesson: Combine AOP monitoring for observability.

7. Future Trends

Cloud‑Native Sharding: Kubernetes dynamically manages shards; learn Spring Cloud and K8s.

AI‑Optimized Sharding: Spring AI analyzes data distribution to suggest optimal sharding; experiment with Spring AI.

Serverless Databases: Services like Aurora simplify sharding; explore AWS or Alibaba Cloud serverless options.

8. Implementation Guide

Quick Start

Configure ShardingSphere with sharding rules.

Test single‑user save and query.

Optimization Steps

Integrate ActiveMQ, Swagger, Security, Batch.

Add AOP monitoring and WebSocket push.

Monitoring and Maintenance

Use /actuator/metrics to monitor sharding performance.

Check /actuator/threaddump to prevent ThreadLocal leaks.

9. Conclusion

Database sharding distributes data to boost performance and scalability, and ShardingSphere‑JDBC provides transparent sharding support for Spring Boot. The example demonstrates a user‑management system with pagination, Swagger, ActiveMQ, profiles, security, batch processing, FreeMarker rendering, hot reload, ThreadLocal handling, actuator security, CSRF exemption, WebSocket push, exception handling, web standards, and AOP. Performance tests show significant gains in concurrency. Future directions include cloud‑native sharding, AI‑driven optimization, and serverless databases.
