Core Functions of Relational Database Middleware and an In‑Depth Look at Sharding‑JDBC Architecture
This article explains why relational database middleware is essential for scaling internet‑level workloads, describes the principles of horizontal sharding and distributed primary‑key generation, and provides a comprehensive overview of Sharding‑JDBC’s architecture, core modules, performance benchmarks, and future roadmap.
Relational databases remain the primary choice for business storage because of their flexible SQL queries and strong transactional guarantees. A single instance, however, cannot handle the data volume and traffic of modern internet applications, so middleware that transparently turns a monolithic database into a distributed one has become widely adopted.
Horizontal sharding, which splits a table's rows across several physical tables or databases according to a sharding algorithm, is the standard way to keep each table below a size threshold. Vertical sharding alone cannot keep up with rapid business changes. Combining sharding with read/write separation alleviates both data-size and access-volume bottlenecks, though it introduces challenges such as cross-database transactions and distributed primary-key generation.
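As a rough illustration of what a sharding algorithm does, the sketch below routes a row to a physical table by taking the sharding key modulo the shard count. The class name ModuloSharding, the method route, and the table naming scheme (logic table name plus a numeric suffix) are assumptions made for this example, not Sharding-JDBC's API.

```java
/** Hypothetical helper: maps a sharding key to a physical table by modulo. */
final class ModuloSharding {
    private ModuloSharding() {}

    /** Returns the physical table name, e.g. t_order_1, for the given key. */
    static String route(String logicTable, long shardingKey, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        // floorMod keeps the index non-negative even for negative keys.
        long index = Math.floorMod(shardingKey, (long) shardCount);
        return logicTable + "_" + index;
    }

    public static void main(String[] args) {
        // With two shards, even keys land in t_order_0 and odd keys in t_order_1.
        System.out.println(route("t_order", 10L, 2));
        System.out.println(route("t_order", 11L, 2));
    }
}
```

Modulo routing spreads sequential keys evenly, but changing the shard count remaps most keys, which is one reason the shard count is usually planned ahead.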
Sharding‑JDBC is an open‑source Java library that implements full sharding, read/write separation, and distributed primary‑key features with near‑zero migration cost. It works as a JDBC driver, supporting MySQL, PostgreSQL, Oracle, and SQL Server, and its core logic consists of sharding rule configuration, JDBC interface rewriting, SQL parsing, routing, rewriting, execution, and result merging.
Performance tests show that Sharding‑JDBC’s query throughput is 99.8% of native JDBC, while insert and update throughput are 90.2% and 93.1% respectively; when a single table is split into two shards, overall throughput can increase by 60‑94% depending on the operation, demonstrating the effectiveness of horizontal scaling.
The sharding rule configuration is flexible: it supports the =, BETWEEN, and IN operators, multi-key sharding strategies, and both programmatic and Inline-expression definitions. SQL rewriting adapts the logical SQL to the actual physical tables, handling both correctness (e.g., rewriting AVG into SUM and COUNT so the average can be recomputed after merging) and optimization (e.g., skipping the pagination rewrite for single-route queries). SQL parsing uses a self-developed engine that extracts only the sharding-relevant context, and routing determines the target data sources via direct, simple, or Cartesian-product strategies.
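The reason AVG has to be rewritten can be shown with a little arithmetic: averaging per-shard averages gives the wrong answer when shards hold different row counts, whereas summing each shard's SUM and COUNT and dividing once gives the correct one. The sketch below is only an illustration; the names AvgMerge, Partial, mergeAvg, and naiveAvgOfAvgs are hypothetical and do not come from Sharding-JDBC.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalDouble;

/** Hypothetical illustration of merging AVG from per-shard SUM and COUNT. */
final class AvgMerge {
    /** Partial aggregate returned by one shard: SUM(col) and COUNT(col). */
    static final class Partial {
        final double sum;
        final long count;

        Partial(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    /** Correct merge: add up sums and counts, then divide once. */
    static OptionalDouble mergeAvg(List<Partial> partials) {
        double totalSum = 0.0;
        long totalCount = 0L;
        for (Partial p : partials) {
            totalSum += p.sum;
            totalCount += p.count;
        }
        // No rows at all: there is no average to report.
        return totalCount == 0
                ? OptionalDouble.empty()
                : OptionalDouble.of(totalSum / totalCount);
    }

    /** Incorrect merge, shown for contrast; assumes every shard has rows. */
    static double naiveAvgOfAvgs(List<Partial> partials) {
        double total = 0.0;
        for (Partial p : partials) {
            total += p.sum / p.count;
        }
        return total / partials.size();
    }

    public static void main(String[] args) {
        // Shard A holds one row (value 10); shard B holds four rows summing to 20.
        List<Partial> partials = Arrays.asList(new Partial(10, 1), new Partial(20, 4));
        System.out.println(mergeAvg(partials).getAsDouble()); // true average over 5 rows
        System.out.println(naiveAvgOfAvgs(partials));         // skewed toward the small shard
    }
}
```

In this example the true average is 30 / 5 = 6.0, while the average of the two per-shard averages (10 and 5) is 7.5.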
Execution is performed by a thread pool tied to the ShardingDataSource, and result merging handles traversal, sorting, grouping, and pagination through stream‑based, in‑memory, or decorator patterns, ensuring correct aggregation without loading all data into memory unless necessary.
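One way to picture stream-based merging for ORDER BY is a k-way merge: each shard returns rows already sorted, and a priority queue holds only the current head row of each shard. The sketch below assumes each shard is represented by an Iterator<Long> of sorted values; the names StreamSortMerge and Head are hypothetical, and this is not Sharding-JDBC's actual merger code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical k-way merge over per-shard cursors that are already sorted. */
final class StreamSortMerge {
    /** Pairs the current head value of one shard with the rest of its cursor. */
    private static final class Head {
        final long value;
        final Iterator<Long> rest;

        Head(long value, Iterator<Long> rest) {
            this.value = value;
            this.rest = rest;
        }
    }

    private StreamSortMerge() {}

    /** Lazily yields all values in ascending order, holding one row per shard. */
    static Iterator<Long> merge(List<Iterator<Long>> shards) {
        final PriorityQueue<Head> queue =
                new PriorityQueue<>((a, b) -> Long.compare(a.value, b.value));
        for (Iterator<Long> shard : shards) {
            if (shard.hasNext()) {
                queue.add(new Head(shard.next(), shard));
            }
        }
        return new Iterator<Long>() {
            @Override
            public boolean hasNext() {
                return !queue.isEmpty();
            }

            @Override
            public Long next() {
                Head head = queue.poll();
                if (head == null) {
                    throw new NoSuchElementException();
                }
                // Refill the queue from the shard that just produced a row.
                if (head.rest.hasNext()) {
                    queue.add(new Head(head.rest.next(), head.rest));
                }
                return head.value;
            }
        };
    }

    public static void main(String[] args) {
        Iterator<Long> merged = merge(Arrays.asList(
                Arrays.asList(1L, 4L, 7L).iterator(),
                Arrays.asList(2L, 3L, 9L).iterator()));
        while (merged.hasNext()) {
            System.out.println(merged.next());
        }
    }
}
```

Because the queue never holds more than one row per shard, memory use is bounded by the number of shards rather than by the result size. Pagination can then be layered on top by skipping and limiting rows drawn from the merged iterator.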
Distributed primary-key generation is built on the Snowflake algorithm and plugs into the standard JDBC getGeneratedKeys flow, for both Statement and PreparedStatement usage.
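For readers unfamiliar with Snowflake, the sketch below shows the general idea under the commonly described layout: a 64-bit ID made of a millisecond timestamp offset, a 10-bit worker ID, and a 12-bit per-millisecond sequence. The class name SnowflakeSketch, the custom epoch, and the injectable clock are assumptions for illustration; this is not Sharding-JDBC's own key generator.

```java
import java.util.function.LongSupplier;

/** Hypothetical Snowflake-style generator: timestamp | worker id | sequence. */
final class SnowflakeSketch {
    // Assumption: an arbitrary custom epoch (2016-11-01T00:00:00Z).
    static final long EPOCH_MILLIS = 1_477_958_400_000L;
    static final int WORKER_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private final LongSupplier clock; // injectable so the sketch is testable
    private long lastMillis = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId, LongSupplier clock) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId out of range");
        }
        this.workerId = workerId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastMillis) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted in this millisecond: wait for the next one.
                while ((now = clock.getAsLong()) <= lastMillis) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0L;
        }
        lastMillis = now;
        return ((now - EPOCH_MILLIS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(3L, System::currentTimeMillis);
        System.out.println(generator.nextId());
        System.out.println(generator.nextId());
    }
}
```

IDs produced this way are roughly time-ordered and unique across workers as long as each worker ID is assigned only once and the clock does not run backwards.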
Looking ahead, Sharding‑JDBC 1.6.x aims to add dynamic configuration and database governance via a registration center, enabling features such as database discovery, traffic steering, fault‑tolerance, and circuit‑breaker behavior, positioning the project as a micro‑service‑oriented OLTP sharding component.
The project’s evolution from version 1.0.x to 1.5.x demonstrates a focus on transparent distributed databases, and the source code is available at https://github.com/dangdangdotcom/sharding-jdbc for anyone interested in contributing.
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
