Databases 6 min read

Comprehensive Guide to Database Sharding (分库分表) with MyCAT

This article provides a detailed tutorial on database sharding (分库分表), covering concepts, common sharding strategies, a step‑by‑step MyCAT configuration, Java JDBC implementation, and key considerations such as data distribution, complexity, transaction handling, and development cost.

Mike Chen's Internet Architecture
Mike Chen's Internet Architecture
Mike Chen's Internet Architecture
Comprehensive Guide to Database Sharding (分库分表) with MyCAT

Mike Chen, an experienced architect, introduces database sharding (分库分表) as a horizontal scaling technique to overcome performance and capacity limits of a single database.

Sharding splits a logical database into multiple physical databases (分库) and a large table into several smaller tables (分表). Common sharding strategies include range‑based, hash‑based, and list‑based partitioning.

Example 1: Range sharding divides data according to a field range, such as user ID intervals.

Example 2: Hash sharding distributes rows based on the hash value of a field, e.g., user ID.

Example 3: List sharding assigns rows to databases according to a predefined list, such as city names.

Practical implementation using MyCAT:

1. Create a user table:

CREATE TABLE `user` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `name` VARCHAR(50) NOT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB;

2. Configure sharding rules in MyCAT's XML configuration, defining data nodes (dn1, dn2) and a table rule (user_rule) that uses the mod‑long algorithm on the id column.

id
mod-long

3. Java code to connect to MyCAT, insert a user, and query all users:

import java.sql.*;

public class MyCatShardingExample {
    public static void main(String[] args) {
        try {
            // Connect to MyCAT
            String url = "jdbc:mysql://localhost:8066/sharding_db";
            String username = "mycat_user";
            String password = "mycat_password";
            Connection conn = DriverManager.getConnection(url, username, password);

            // Insert data
            String sql = "INSERT INTO user (name) VALUES (?)";
            PreparedStatement statement = conn.prepareStatement(sql);
            statement.setString(1, "John");
            statement.executeUpdate();

            // Query data
            String querySql = "SELECT * FROM user";
            Statement queryStatement = conn.createStatement();
            ResultSet resultSet = queryStatement.executeQuery(querySql);
            while (resultSet.next()) {
                int id = resultSet.getInt("id");
                String name = resultSet.getString("name");
                System.out.println("User ID: " + id + ", Name: " + name);
            }

            // Close connection
            conn.close();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

Key considerations after sharding include choosing an appropriate data distribution strategy to avoid skew, handling increased operational complexity, managing distributed transactions, and accounting for higher development effort, often requiring middleware such as ShardingSphere or MyCAT.

The article also promotes additional resources, including a 300,000‑word architecture collection and a comprehensive Java interview question set, with links for readers to obtain them.

distributed systemsJavaSQLDatabaseShardingMycat
Mike Chen's Internet Architecture
Written by

Mike Chen's Internet Architecture

Over ten years of BAT architecture experience, shared generously!

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.