Introduction to Sharding-JDBC and ShardingSphere: Architecture, Sharding Strategies, and Quick Integration
This article introduces Sharding-JDBC and ShardingSphere, explains their history, core architecture, vertical and horizontal sharding concepts, sharding rule configuration, SQL parsing and rewriting, routing and result merging, and provides a quick‑start guide with Maven dependencies and Spring Boot configuration for Java applications.
1. Introduction
Sharding-JDBC originated in ddframe, the application framework developed at Dangdang. When the project entered incubation at the Apache Software Foundation, it was renamed ShardingSphere. It is a lightweight Java framework that provides transparent horizontal database sharding, distributed transactions, and database governance.
2. ShardingSphere Overview
ShardingSphere is an open‑source ecosystem composed of three independent products: Sharding-JDBC, Sharding‑Proxy, and the planned Sharding‑Sidecar. All products support standardized data sharding, distributed transactions, and database governance, and can be used in Java‑centric, heterogeneous language, container, or cloud‑native environments.
3. Sharding-JDBC Overview
Sharding-JDBC acts as an enhanced JDBC driver that runs on the client side. It ships as a JAR and needs no additional deployment or services. It works with any Java ORM framework (JPA, Hibernate, MyBatis, Spring JDBC Template) and any connection pool (DBCP, C3P0, Druid, HikariCP). It supports MySQL, Oracle, SQL Server, and PostgreSQL.
4. Database and Table Partitioning
Partitioning addresses two common scenarios in large‑scale internet applications: massive data volume and high concurrency.
Vertical Partitioning (Database Splitting)
When a single database contains tables for multiple business domains, vertical splitting separates them into multiple databases to reduce I/O bottlenecks.
Advantages: clearer business boundaries, easier subsystem integration, simpler data management.
Disadvantages: increased system complexity, cross‑database transaction consistency issues, and limited benefit for a single high‑traffic business.
Horizontal Partitioning (Table Splitting)
Horizontal splitting divides a large table (e.g., orders) into multiple tables based on a sharding key, improving I/O performance and system stability.
Advantages: keeps each table size manageable, improves query efficiency and system load capacity.
Disadvantages: cross‑database joins become complex, data consistency is harder to guarantee, and migration/expansion requires significant effort.
5. Core Principles
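Key concepts include real databases (e.g., ordercenter_0 through ordercenter_7), real tables (e.g., orders_0 through orders_1023), data nodes, which pair a real database with a real table (e.g., ordercenter_0.orders_0), and logical tables, the single name the application uses for all real tables of the same structure (e.g., orders).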
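To make the relationship between a logical table and its data nodes concrete, here is a minimal sketch that maps a sharding key to a data node name. The class name DataNodeResolver is hypothetical, and the layout (1024 real tables spread evenly over 8 real databases, 128 tables each) is an assumption for illustration; the article does not state how tables are distributed across databases.

```java
/** Hypothetical sketch: maps a sharding key to a data node such as "ordercenter_0.orders_0". */
final class DataNodeResolver {
    // Assumed layout: 8 real databases, 1024 real tables, 128 consecutive tables per database.
    static final int DATABASE_COUNT = 8;
    static final int TABLE_COUNT = 1024;
    static final int TABLES_PER_DATABASE = TABLE_COUNT / DATABASE_COUNT;

    /** Index of the real table for a sharding key (modulo over the table count). */
    static int tableIndex(long shardingKey) {
        return (int) Math.floorMod(shardingKey, (long) TABLE_COUNT);
    }

    /** Index of the real database that holds the given real table. */
    static int databaseIndex(int tableIndex) {
        return tableIndex / TABLES_PER_DATABASE;
    }

    /** Resolves the logical table "orders" to a data node in "database.table" form. */
    static String resolve(long shardingKey) {
        int table = tableIndex(shardingKey);
        return "ordercenter_" + databaseIndex(table) + ".orders_" + table;
    }

    public static void main(String[] args) {
        System.out.println(resolve(0L));    // ordercenter_0.orders_0
        System.out.println(resolve(130L));  // ordercenter_1.orders_130
        System.out.println(resolve(1023L)); // ordercenter_7.orders_1023
    }
}
```

The application only ever names the logical table orders; the resolver decides which real database and real table a given key lands in.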
6. Architecture Diagram
7. Sharding Rule Configuration
Sharding-JDBC supports custom sharding strategies, multiple sharding keys, and various operators (equality, IN, BETWEEN). It can combine database‑level and table‑level sharding, such as sharding by user ID or by year/month.
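The effect of the three operators on routing can be sketched as follows. This is not the Sharding-JDBC API: ShardingConditionSketch and its methods are hypothetical names, and the rule (a logical table orders split into 4 real tables by user_id modulo 4) is assumed for illustration only.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch of how a sharding condition narrows the set of target tables. */
final class ShardingConditionSketch {
    // Assumed rule: logical table "orders" is split into 4 real tables by user_id % 4.
    static final int TABLE_COUNT = 4;

    static String tableFor(long userId) {
        return "orders_" + Math.floorMod(userId, (long) TABLE_COUNT);
    }

    /** user_id = ? : exactly one target table. */
    static SortedSet<String> forEquality(long userId) {
        SortedSet<String> targets = new TreeSet<>();
        targets.add(tableFor(userId));
        return targets;
    }

    /** user_id IN (...) : one target per distinct table hit by the listed values. */
    static SortedSet<String> forIn(Collection<Long> userIds) {
        SortedSet<String> targets = new TreeSet<>();
        for (long id : userIds) {
            targets.add(tableFor(id));
        }
        return targets;
    }

    /** user_id BETWEEN low AND high : walk the range, stopping once every table is hit. */
    static SortedSet<String> forBetween(long low, long high) {
        SortedSet<String> targets = new TreeSet<>();
        for (long id = low; id <= high && targets.size() < TABLE_COUNT; id++) {
            targets.add(tableFor(id));
        }
        return targets;
    }

    public static void main(String[] args) {
        System.out.println(forEquality(5L));               // [orders_1]
        System.out.println(forIn(Arrays.asList(1L, 2L, 5L))); // [orders_1, orders_2]
        System.out.println(forBetween(6L, 7L));            // [orders_2, orders_3]
    }
}
```

With a modulo rule, a wide BETWEEN range quickly touches every table, which is why range conditions tend to be the most expensive of the three to route.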
8. JDBC Specification Re‑implementation
The framework wraps the five core JDBC interfaces (DataSource, Connection, Statement, PreparedStatement, ResultSet) and manages multiple underlying JDBC drivers.
9. SQL Parsing
SQL parsing is critical for both performance and compatibility. Sharding-JDBC adopted Druid as its SQL parser, which is reported to be dozens of times faster than alternatives such as fdb or jsqlparser.
select id, name from t_user where status = 'active' and age > 18;
10. SQL Rewrite
SQL rewrite replaces logical table names with real table names and adjusts statements that are incorrect in a sharding environment (e.g., average calculations and pagination).
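Both kinds of rewrite can be sketched at the string level. The class SqlRewriteSketch is hypothetical and uses regular expressions purely for brevity; it is not how the framework itself rewrites SQL, and the real table name t_score_0 is an assumed example. The pagination case shows why a statement that is correct on a single database is incorrect per shard: to return rows 2 and 3 of the global ordering, each shard must return its top 3 rows, and the offset is applied only after merging.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical string-level sketch of table-name and pagination rewriting. */
final class SqlRewriteSketch {
    private static final Pattern LIMIT =
        Pattern.compile("limit\\s+(\\d+)\\s*,\\s*(\\d+)", Pattern.CASE_INSENSITIVE);

    /** Replaces the logical table name with a real table name (whole-word match). */
    static String rewriteTable(String sql, String logicTable, String actualTable) {
        return sql.replaceAll("\\b" + Pattern.quote(logicTable) + "\\b",
                              Matcher.quoteReplacement(actualTable));
    }

    /** Rewrites "limit offset, count" to "limit 0, offset + count" for execution on one shard. */
    static String rewriteLimit(String sql) {
        Matcher m = LIMIT.matcher(sql);
        if (!m.find()) {
            return sql; // no pagination clause, nothing to adjust
        }
        long offset = Long.parseLong(m.group(1));
        long count = Long.parseLong(m.group(2));
        return m.replaceFirst("limit 0, " + (offset + count));
    }

    public static void main(String[] args) {
        String logical = "select score from t_score order by score desc limit 1, 2";
        String perShard = rewriteLimit(rewriteTable(logical, "t_score", "t_score_0"));
        System.out.println(perShard);
        // select score from t_score_0 order by score desc limit 0, 3
    }
}
```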
select score from t_score order by score desc limit 1, 2;
11. SQL Routing
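Routing types include standard routing, direct routing, and Cartesian‑product routing. When the joined tables are bound (sharded by the same rule, so matching rows always sit in shards with the same suffix), standard routing generates only a few SQL statements. When they are not bound, every shard of one table must be joined with every shard of the other, and Cartesian routing may produce thousands of statements.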
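The difference in statement count can be sketched by enumerating the table pairs each routing type would have to query. RoutingSketch is a hypothetical class, and the shard indexes passed in are assumed inputs rather than the output of a real routing engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch comparing bound-table routing with Cartesian-product routing. */
final class RoutingSketch {
    /** Bound tables: orders_i is only ever joined with order_goods_i. */
    static List<String> bindingRoute(List<Integer> shards) {
        List<String> pairs = new ArrayList<>();
        for (int s : shards) {
            pairs.add("orders_" + s + " JOIN order_goods_" + s);
        }
        return pairs;
    }

    /** Unbound tables: every orders shard is joined with every order_goods shard. */
    static List<String> cartesianRoute(List<Integer> orderShards, List<Integer> goodsShards) {
        List<String> pairs = new ArrayList<>();
        for (int o : orderShards) {
            for (int g : goodsShards) {
                pairs.add("orders_" + o + " JOIN order_goods_" + g);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Assumed: user_id in (1, 2) routes to shards 1 and 0 under a modulo-2 rule.
        List<Integer> shards = Arrays.asList(0, 1);
        System.out.println(bindingRoute(shards).size());           // 2
        System.out.println(cartesianRoute(shards, shards).size()); // 4
    }
}
```

The bound case grows linearly with the number of routed shards, while the Cartesian case grows with the product of the two shard counts.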
select o.* FROM orders o left join order_goods g on o.id = g.order_id where o.user_id in (1, 2);
12. Result Merging
Four merging categories exist: simple iteration, sorting (using merge‑sort), aggregation (max/min, sum/count, average), and grouping (map‑reduce style, most memory‑intensive).
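Two of these categories can be sketched in a few lines. ResultMergeSketch is a hypothetical class that works on in-memory lists rather than JDBC result sets. The sorted merge keeps one cursor per shard in a priority queue, so only the current head row of each shard is compared at any time. The average is recomputed from per-shard sums and counts, because averaging the per-shard averages gives the wrong answer when shards hold different numbers of rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of sorted merging and average merging across shards. */
final class ResultMergeSketch {
    /** Cursor over one shard's already-sorted rows, holding its current head value. */
    private static final class Cursor {
        final Iterator<Integer> rows;
        int head;

        Cursor(Iterator<Integer> rows) {
            this.rows = rows;
            this.head = rows.next();
        }
    }

    /** Merges per-shard lists, each sorted in descending order, into one descending list. */
    static List<Integer> mergeDescending(List<List<Integer>> shardResults) {
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>(Comparator.comparingInt((Cursor c) -> c.head).reversed());
        for (List<Integer> shard : shardResults) {
            if (!shard.isEmpty()) {
                queue.add(new Cursor(shard.iterator()));
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor top = queue.poll();
            merged.add(top.head);
            if (top.rows.hasNext()) {
                top.head = top.rows.next();
                queue.add(top); // re-insert so the queue reorders by the new head
            }
        }
        return merged;
    }

    /** Recomputes a global average from per-shard SUM and COUNT values. */
    static double mergeAverage(long[] sums, long[] counts) {
        long totalSum = 0;
        long totalCount = 0;
        for (int i = 0; i < sums.length; i++) {
            totalSum += sums[i];
            totalCount += counts[i];
        }
        return totalCount == 0 ? 0.0 : (double) totalSum / totalCount;
    }

    public static void main(String[] args) {
        List<List<Integer>> shards = Arrays.asList(
            Arrays.asList(100, 90, 80),
            Arrays.asList(95, 85));
        System.out.println(mergeDescending(shards)); // [100, 95, 90, 85, 80]
        System.out.println(mergeAverage(new long[] {10, 20}, new long[] {1, 3})); // 7.5
    }
}
```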
13. Standard Sharding Algorithms
Images illustrate the default database and table sharding algorithms (modulo‑based expressions).
14. Quick Start Example
Add the following Maven dependencies:
<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>sharding-jdbc-spring-boot-starter</artifactId>
    <version>4.1.1</version>
</dependency>
<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>sharding-jdbc-spring-namespace</artifactId>
    <version>4.1.1</version>
</dependency>
Configure the data source and sharding rules in application.yml (or application.properties) as shown in the original article, specifying the master datasource, connection pool settings, and sharding table definitions.
After initializing the database tables, the project can be started and the sharding features explored.
15. Official Documentation
For more details, visit the ShardingSphere documentation: https://shardingsphere.apache.org/index_zh.html
政采云技术
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.