Sharding-JDBC: A Lightweight Java Database Sharding Framework
Sharding-JDBC is a client‑side Java framework that provides transparent horizontal sharding for relational databases, offering flexible sharding strategies, SQL parsing and rewriting, multi‑threaded execution, result merging, and high performance while requiring no additional deployment or proxy layers.
Sharding-JDBC, developed by Dangdang's architecture team, is a lightweight Java library that implements transparent database sharding by extending the JDBC API, allowing applications to access sharded databases without proxy servers or extra deployment.
Use Cases: Sharding-JDBC addresses two common internet scenarios, large data volumes and high concurrency, through vertical partitioning (separating tables by business domain) and horizontal partitioning (splitting the rows of a table with a sharding algorithm such as modulo). The two are often combined to mitigate performance degradation.
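To make the modulo case concrete, here is a minimal sketch of horizontal sharding by modulo. The class ModuloSharding, the method route, and the table name t_order are hypothetical names chosen for illustration; they are not part of the Sharding-JDBC API.

```java
// Hypothetical helper, not Sharding-JDBC code: maps a sharding key to a physical table.
final class ModuloSharding {
    private ModuloSharding() {}

    /** Returns the physical table for a key, e.g. t_order_1 for key 11 with 2 shards. */
    static String route(String logicTable, long shardingKey, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        // floorMod keeps the index in [0, shardCount) even for negative keys.
        long index = Math.floorMod(shardingKey, (long) shardCount);
        return logicTable + "_" + index;
    }

    public static void main(String[] args) {
        for (long orderId : new long[] {10L, 11L, 12L}) {
            // prints: 10 -> t_order_0, 11 -> t_order_1, 12 -> t_order_0
            System.out.println(orderId + " -> " + route("t_order", orderId, 2));
        }
    }
}
```

Even and odd order IDs land in different physical tables, so each table holds roughly half of the rows.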
Key Features include compatibility with any JDBC-based Java persistence framework (JPA, Hibernate, MyBatis, Spring JDBC Template), support for common connection pools (DBCP, C3P0, Druid, etc.), and a design that can in principle work with any JDBC-compliant database. At present only MySQL is supported, with Oracle and SQL Server planned.
Sharding-JDBC operates as a client‑side jar, requiring no proxy, and supports multi‑dimensional sharding keys, in/between operators, and complex SQL constructs such as joins, aggregations, ordering, limits, and OR conditions. It does not yet support UNION, sub‑queries, or function‑level sharding.
Architecture: The framework follows a modular pipeline of sharding rule configuration, SQL parsing (using Druid), SQL rewriting, routing, execution, and result merging. The overall architecture diagram (Figure 1) illustrates these stages.
Sharding Rule Configuration is highly flexible, allowing custom strategies, multiple sharding keys, and composite operators. It supports both equal‑value and range (IN/BETWEEN) sharding.
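The difference between equal-value and range sharding can be sketched as follows. This is an illustration under stated assumptions, not the framework's strategy interface: the class ShardingRuleSketch and its methods forEqual, forIn, and forBetween are hypothetical, and modulo on a numeric key is assumed as the algorithm.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: resolves target tables for =, IN and BETWEEN conditions.
final class ShardingRuleSketch {
    private ShardingRuleSketch() {}

    /** WHERE key = value resolves to exactly one target. */
    static String forEqual(List<String> targets, long value) {
        int index = (int) Math.floorMod(value, (long) targets.size());
        return targets.get(index);
    }

    /** WHERE key IN (...) resolves to the union of the per-value targets. */
    static Set<String> forIn(List<String> targets, Collection<Long> values) {
        Set<String> result = new LinkedHashSet<>();
        for (long value : values) {
            result.add(forEqual(targets, value));
        }
        return result;
    }

    /** WHERE key BETWEEN lower AND upper: walk the range until every target is hit. */
    static Set<String> forBetween(List<String> targets, long lower, long upper) {
        Set<String> result = new LinkedHashSet<>();
        // With modulo, at most targets.size() consecutive values are needed
        // to cover every target, so the loop is bounded by that count.
        for (long value = lower, seen = 0; value <= upper && seen < targets.size(); value++, seen++) {
            result.add(forEqual(targets, value));
        }
        return result;
    }
}
```

An equality condition always narrows the query to one table, while IN and BETWEEN may fan out to several or all of them.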
JDBC Rewrite wraps the core JDBC interfaces (DataSource, Connection, Statement, PreparedStatement, ResultSet) to manage multiple physical connections while preserving most JDBC functionality, though some newer JDBC 4.1 features are not yet implemented.
SQL Parsing leverages Druid's SQL parser, which is reported to parse dozens of times faster than alternative parsers, and supports joins, aggregations, ORDER BY, GROUP BY, LIMIT, and OR queries.
SQL Rewrite Examples include transforming distributed AVG calculations into SUM/COUNT for correct aggregation and adjusting pagination limits to retrieve sufficient rows before final merging.
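A small sketch shows why both rewrites are needed. The class RewriteSketch and its methods are hypothetical, and the pagination rule shown (each shard fetches from row 0 up to offset + count) is an assumption about the usual approach rather than a confirmed description of the framework's rewriter.

```java
// Hypothetical sketch of the two rewrites; not Sharding-JDBC code.
final class RewriteSketch {
    private RewriteSketch() {}

    /** Combines per-shard SUM and COUNT values into the correct global average. */
    static double mergeAvg(long[] sums, long[] counts) {
        if (sums.length != counts.length) {
            throw new IllegalArgumentException("one sum and one count per shard expected");
        }
        long totalSum = 0;
        long totalCount = 0;
        for (int i = 0; i < sums.length; i++) {
            totalSum += sums[i];
            totalCount += counts[i];
        }
        if (totalCount == 0) {
            throw new ArithmeticException("no rows to average");
        }
        return (double) totalSum / totalCount;
    }

    /** Assumed rule: "LIMIT offset, count" becomes "LIMIT 0, offset + count" on each shard. */
    static String rewriteLimit(int offset, int count) {
        return "LIMIT 0, " + (offset + count);
    }

    public static void main(String[] args) {
        // Shard A holds one row with value 10; shard B holds four rows summing to 100.
        // Averaging the shard averages gives (10 + 25) / 2 = 17.5, which is wrong.
        System.out.println(mergeAvg(new long[] {10, 100}, new long[] {1, 4})); // prints 22.0
        System.out.println(rewriteLimit(10, 5)); // prints LIMIT 0, 15
    }
}
```

The pagination rewrite is needed because the rows of the requested page may all come from a single shard, so every shard has to return enough rows for the merger to skip the offset itself.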
SQL Routing directs queries to the appropriate shards based on the configured rules, handling single‑table, binding‑table, and Cartesian‑product routing scenarios.
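The difference between binding-table and Cartesian-product routing for a two-table join can be sketched as below. The class RoutingSketch and the table names are hypothetical; the sketch assumes binding tables share the same sharding rule, so shards with the same index always hold matching rows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enumerates the physical joins a logical two-table join expands to.
final class RoutingSketch {
    private RoutingSketch() {}

    /** Binding tables: only shards with the same index are joined. */
    static List<String> bindingRoute(List<String> left, List<String> right) {
        if (left.size() != right.size()) {
            throw new IllegalArgumentException("binding tables must have the same shard count");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < left.size(); i++) {
            result.add(left.get(i) + " JOIN " + right.get(i));
        }
        return result;
    }

    /** No binding relationship: every combination of shards has to be joined. */
    static List<String> cartesianRoute(List<String> left, List<String> right) {
        List<String> result = new ArrayList<>();
        for (String l : left) {
            for (String r : right) {
                result.add(l + " JOIN " + r);
            }
        }
        return result;
    }
}
```

With two shards per table, binding routing produces 2 physical joins while Cartesian routing produces 4, and the gap grows quadratically with the shard count.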
Execution runs routed queries concurrently across shards, handling batch operations like addBatch.
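Concurrent execution across shards can be sketched with a plain thread pool. This is a sketch under assumptions, not the framework's executor: the class ParallelExecutionSketch is hypothetical, and the query function stands in for running a rewritten SQL statement against one shard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: runs one task per shard in parallel and collects results in shard order.
final class ParallelExecutionSketch {
    private ParallelExecutionSketch() {}

    static <R> List<R> executeOnAll(List<String> shards, Function<String, R> query) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (String shard : shards) {
                Callable<R> task = () -> query.apply(shard);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // blocks until that shard has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Total latency is then bounded by the slowest shard rather than the sum of all shards. A production executor would reuse a shared pool instead of creating one per call.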
Result Merging consolidates results for four categories: simple iteration, sorting (using merge-sort over the already sorted shard results), aggregation (MAX/MIN, SUM/COUNT, and AVG computed from the rewritten SUM and COUNT), and grouping (using a map-reduce style merge, which is memory-intensive).
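The sorting case can be illustrated with a k-way merge over per-shard results that are already sorted. The class SortedMergeSketch is hypothetical and uses integer lists in place of JDBC result sets; it sketches the merge-sort idea, not the framework's merger.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: k-way merge of ascending per-shard results using a min-heap.
final class SortedMergeSketch {
    private SortedMergeSketch() {}

    /** Tracks the current head value of one shard's result plus the remaining rows. */
    private static final class Cursor implements Comparable<Cursor> {
        int current;
        final Iterator<Integer> rest;

        Cursor(int current, Iterator<Integer> rest) {
            this.current = current;
            this.rest = rest;
        }

        @Override
        public int compareTo(Cursor other) {
            return Integer.compare(current, other.current);
        }
    }

    static List<Integer> merge(List<List<Integer>> sortedShardResults) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (List<Integer> shard : sortedShardResults) {
            Iterator<Integer> it = shard.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor(it.next(), it));
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();
            merged.add(smallest.current);
            if (smallest.rest.hasNext()) {
                // Advance this shard's cursor and put it back into the heap.
                smallest.current = smallest.rest.next();
                heap.add(smallest);
            }
        }
        return merged;
    }
}
```

Because each shard has already applied ORDER BY, the merger only ever holds one row per shard in the heap, which is why sorting is much cheaper in memory than grouping.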
Performance: In single-shard tests, Sharding-JDBC achieves 99.8% of the query TPS, 90.2% of the insert TPS, and 93.1% of the update TPS of plain JDBC. In multi-shard scenarios, TPS improves by roughly 94% for queries, 60% for inserts, and 89% for updates.
Roadmap includes read/write splitting, flexible distributed transactions, distributed primary-key generation, SQL rewrite optimizations, SQL hints, small-table broadcasting, HA, flow control, schema tools, data migration, advanced SQL parsing (sub-queries, stored procedures), Oracle and SQL Server support, and a configuration center.
Open‑Source Philosophy emphasizes simultaneous internal and community support, complete source snapshots on GitHub, and encouraging community contributions to keep the project simple, well‑tested (over 90% coverage), and sustainable.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
