Databases 39 min read

Unlock Sharding-JDBC: Guide to Sharding, Replication & Distributed Databases

This comprehensive tutorial walks through the evolution of Sharding-JDBC into ShardingSphere, explains core concepts like sharding, data nodes, and algorithms, demonstrates MySQL master‑slave setup with Docker, and provides step‑by‑step code examples for implementing read‑write separation, horizontal and vertical sharding, and distributed primary keys.

Java High-Performance Architecture
Java High-Performance Architecture
Java High-Performance Architecture
Unlock Sharding-JDBC: Guide to Sharding, Replication & Distributed Databases

Preface

Sharding-JDBC

was originally an internal sharding framework at Dangdang and open‑sourced in 2017. It later evolved into ShardingSphere, becoming an Apache top‑level project in April 2020. ShardingSphere-Jdbc is a lightweight Java framework that provides extra services at the JDBC layer. It works as an enhanced JDBC driver, fully compatible with JDBC and various ORM frameworks.

Core Concepts

Sharding (分库分表)

Sharding splits a database into multiple databases (horizontal) and tables (vertical) deployed on different machines.

Shard (分片)

Data is partitioned using a horizontal sharding mode, creating multiple physical tables (e.g., t_order_0, t_order_1, …). Routing rules based on 分库策略 and 分片策略 determine which database and table store a given row.

Data Node

A data node is the smallest indivisible unit in sharding, consisting of a data source name and a physical table (e.g., order_db_1.t_order_0).

Logical Table

A logical table represents a group of tables with the same structure. For example, t_order may be split into t_order_0t_order_9. Application code still queries t_order, while the framework routes to the actual physical tables.

Physical Table

The actual tables that exist in the database, such as t_order_0, t_order_1, etc.

Sharding Key

The column used for sharding. For t_order, order_id determines the target database and table via modulo operation.

Sharding Algorithms

Four built‑in algorithms are provided:

Precise Sharding Algorithm – for = and IN conditions.

Range Sharding Algorithm – for BETWEEN, >, <, >=, <= conditions.

Complex Keys Sharding Algorithm – for multiple sharding keys.

Hint Sharding Algorithm – forces routing without extracting a sharding key.

Sharding Strategies

Strategies combine algorithms and sharding keys.

Standard Sharding Strategy – supports single sharding key with precise and optional range algorithms.

Complex Sharding Strategy – supports multiple sharding keys.

Inline (Row Expression) Strategy – uses Groovy expressions, e.g., order_id % 4.

Hint Sharding Strategy – routing is specified manually.

Distributed Primary Key

ShardingSphere provides UUID and Snowflake generators; Snowflake is the default 64‑bit ID generator.

Broadcast Table

Tables that exist in every shard with identical data, typically used for dictionary or configuration data.

Binding Table

Tables that share the same sharding rule, enabling efficient join queries without cross‑shard cartesian products.

SELECT * FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id;

MySQL Master‑Slave Replication

Docker Configuration

Create Master Directories

mkdir -p /usr/local/mysqlData/master/cnf
mkdir -p /usr/local/mysqlData/master/data

Master my.cnf

[mysqld]
server-id=1
log-bin=mysql-bin
binlog_cache_size=1M
binlog_format=mixed

Start Master Container

docker run -itd -p 3306:3306 --name master \
  -v /usr/local/mysqlData/master/cnf:/etc/mysql/conf.d \
  -v /usr/local/mysqlData/master/data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=123456 mysql:5.7

Create Replication User

GRANT REPLICATION SLAVE ON *.* TO 'reader'@'%' IDENTIFIED BY 'reader';
FLUSH PRIVILEGES;

Create Slave Directories and Config

mkdir -p /usr/local/mysqlData/slave/cnf
mkdir -p /usr/local/mysqlData/slave/data
[mysqld]
server-id=2
log-bin=mysql-slave-bin
relay_log=edu-mysql-relay-bin
log_bin_trust_function_creators=true
binlog_cache_size=1M
binlog_format=mixed
slave_skip_errors=1062

Start Slave Container

docker run -itd -p 3307:3306 --name slaver \
  -v /usr/local/mysqlData/slave/cnf:/etc/mysql/conf.d \
  -v /usr/local/mysqlData/slave/data:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=123456 mysql:5.7

Configure Slave

CHANGE MASTER TO MASTER_HOST='172.17.0.2',
MASTER_USER='reader', MASTER_PASSWORD='reader',
MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=591;
START SLAVE;

Binlog and Redo Log Review

Redo Log

InnoDB writes redo logs to a buffer and flushes them to disk periodically, on transaction commit, or when the buffer is half full.

Binlog

Binlog records all data‑changing statements (excluding SELECT/SHOW) and is used for replication and incremental recovery. Formats: STATEMENT, ROW, MIXED.

Two‑Phase Commit

CREATE TABLE T(ID int primary key, c int);
UPDATE T SET c=c+1 WHERE ID=2;

The executor fetches the row, modifies it, writes to redo log (prepare), generates binlog, then commits the redo log (commit).

Why Sharding?

When a single database cannot handle massive data (tens of millions of rows), sharding reduces disk usage, improves query performance, and avoids index height growth.

Sharding‑JDBC vs MyCat

MyCat intercepts SQL, performs analysis (sharding, routing, read‑write splitting, caching) and forwards to real databases. Sharding‑JDBC follows the pipeline: parse → optimize → route → rewrite → execute → merge.

Sharding‑JDBC Read‑Write Separation Example

Docker‑based MySQL master‑slave setup, Spring Boot project with dependencies, and configuration for master‑slave routing.

spring:
  shardingsphere:
    datasource:
      names: ds1,ds2,ds3
      ds1:
        driver-class-name: com.mysql.jdbc.Driver
        url: jdbc:mysql://192.168.56.111:3306/sharding-jdbc-db
        username: root
        password: 123456
      ds2:
        url: jdbc:mysql://192.168.56.111:3307/sharding-jdbc-db
    masterslave:
      master-data-source-name: ds1
      slave-data-source-names: ds2,ds3
      load-balance-algorithm-type: round_robin
    props:
      sql:
        show: true

Controller, Mapper, and Entity code demonstrate inserting users (writes go to master) and querying users (reads round‑robin across slaves).

Sharding‑JDBC Horizontal Sharding Example

Single‑database configuration with inline algorithm order_$->{order_id % 2 + 1} splits order table into order_1 and order_2 based on even/odd order_id.

spring:
  shardingsphere:
    datasource:
      names: db1
      db1:
        url: jdbc:mysql://127.0.0.1:3306/order
    sharding:
      tables:
        order:
          actual-data-nodes: db1.order_$->{1..2}
          table-strategy:
            inline:
              sharding-column: order_id
              algorithm-expression: order_$->{order_id % 2 + 1}
    props:
      sql:
        show: true

Sharding‑JDBC Vertical Sharding (Database) Example

Data is routed to order1 or order2 databases based on user_id % 2 + 1.

spring:
  shardingsphere:
    datasource:
      names: db1,db2
      db1:
        url: jdbc:mysql://localhost:3306/order1
      db2:
        url: jdbc:mysql://localhost:3306/order2
    sharding:
      tables:
        order_info:
          actual-data-nodes: db$->{1..2}.order_info
          database-strategy:
            inline:
              sharding-column: user_id
              algorithm-expression: db$->{user_id % 2 + 1}
    props:
      sql:
        show: true

Sharding‑JDBC Execution Flow

SQL → Parse → Route → Rewrite → Execute → Merge. The inline algorithm order_$->{order_id % 2 + 1} routes even order_id to order_1 and odd to order_2.

Sharding Strategies Overview

Standard: single‑key sharding with precise and optional range algorithms.

Complex: supports multiple sharding keys.

Inline: simple Groovy expression, single‑key only.

Hint: routing specified manually.

None: no sharding.

Official documentation: https://shardingsphere.apache.org/document/current/en/overview/

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Spring BootMySQL replicationSharding-JDBC
Java High-Performance Architecture
Written by

Java High-Performance Architecture

Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.