
Unlocking Distributed Databases with Apache ShardingSphere: Features and JD.com Case Study

This article introduces Apache ShardingSphere’s ecosystem, core sharding and governance capabilities, access endpoints such as Sharding-JDBC, Sharding-Proxy and Sharding-Sidecar, and details JD.com’s real‑world implementation, including data‑sharding strategies, distributed primary keys, hint routing, performance optimizations, and future roadmap.

dbaplus Community

Introduction

Apache ShardingSphere is an open‑source distributed database middleware ecosystem that provides horizontal and vertical data partitioning, distributed transactions, database governance, and security for internet‑scale and cloud services. The project started in 2016, entered the Apache Incubator in 2018, and is now driven by a broad community.

ShardingSphere Ecosystem Overview

The ecosystem consists of three main modules: data sharding (including read/write splitting), distributed transaction management, and database governance (configuration, topology, high‑availability, data masking, and permission control). It supports MySQL, Oracle, PostgreSQL, and SQL Server, abstracting the underlying database choice.

Core Access Endpoints

ShardingSphere provides three primary access endpoints:

Sharding‑JDBC: a lightweight Java framework that operates at the JDBC layer and is fully compatible with existing JDBC drivers and ORM frameworks.

Sharding‑Proxy: a server‑side implementation of the MySQL binary protocol, allowing applications to connect as if to a MySQL instance.

Sharding‑Sidecar (planned): a cloud‑native sidecar for Kubernetes/Mesos, acting as a database mesh.

Data Sharding Concepts

ShardingSphere distinguishes read/write splitting (primary‑replica) from horizontal sharding (logical table split into multiple physical tables across nodes). Systems often combine both to balance performance and security.
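The two ideas can be sketched in a few lines of plain Java. This is only an illustration of the concepts, not ShardingSphere's API: the class names `ShardRouter` and `ReadWriteSplitter`, the `ds_N` naming, and the modulo rule are all assumptions made for the example.

```java
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from ShardingSphere.
final class ShardRouter {
    private final int dbCount;
    private final int tablesPerDb;

    ShardRouter(int dbCount, int tablesPerDb) {
        this.dbCount = dbCount;
        this.tablesPerDb = tablesPerDb;
    }

    // Horizontal sharding: map a sharding key to exactly one physical data node.
    String dataNode(String logicTable, long shardingKey) {
        long db = Math.floorMod(shardingKey, (long) dbCount);
        long table = Math.floorMod(shardingKey / dbCount, (long) tablesPerDb);
        return "ds_" + db + "." + logicTable + "_" + table;
    }
}

final class ReadWriteSplitter {
    private final String primary;
    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteSplitter(String primary, List<String> replicas) {
        this.primary = primary;
        this.replicas = replicas;
    }

    // Read/write splitting: SELECTs rotate across replicas, everything else hits the primary.
    String route(String sql) {
        boolean isRead = sql.trim().toUpperCase(Locale.ROOT).startsWith("SELECT");
        if (isRead && !replicas.isEmpty()) {
            return replicas.get(Math.floorMod(next.getAndIncrement(), replicas.size()));
        }
        return primary;
    }
}
```

In a combined setup, the sharding step would pick the data node first and the read/write step would then choose between that node's primary and its replicas.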

Distributed Primary Key Generation

To avoid key collisions after sharding, ShardingSphere provides built‑in generators such as UUID and Snowflake, configurable with a single line. Custom generators can be implemented via the SPI.
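For readers unfamiliar with Snowflake-style keys, the sketch below shows the general idea: a 64-bit value built from a millisecond timestamp, a worker id, and a per-millisecond sequence. It is not ShardingSphere's implementation; the class name, the epoch, and the 41/10/12 bit layout are assumptions taken from the commonly described Snowflake scheme.

```java
// Hypothetical, simplified Snowflake-style generator; not ShardingSphere's code.
final class SnowflakeIdGenerator {
    // Arbitrary custom epoch chosen for this example (2020-01-01T00:00:00Z).
    private static final long EPOCH_MILLIS = 1_577_836_800_000L;
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastMillis = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId out of range: " + workerId);
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMillis) {
            // A real generator needs a policy for clock rollback; this sketch just fails.
            throw new IllegalStateException("Clock moved backwards");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastMillis) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return ((now - EPOCH_MILLIS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

Because the timestamp occupies the high bits, ids from one worker increase over time, and distinct worker ids keep different nodes from colliding without any coordination at insert time.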

Hint Routing (Business Sharding Key Injection)

When a SQL statement lacks a sharding key, developers can set sharding values in HintManager (ThreadLocal) or embed special comments in SQL to force routing to specific data nodes.
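The ThreadLocal mechanism behind hint routing can be illustrated with a small self-contained sketch. The classes `HintContext` and `HintAwareRouter` below are hypothetical stand-ins written for this example and are not the real HintManager API; the modulo rule and the `_*` marker for "all shards" are likewise assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a hint holder; not the actual ShardingSphere HintManager.
final class HintContext implements AutoCloseable {
    private static final ThreadLocal<Map<String, Long>> HINTS =
            ThreadLocal.withInitial(HashMap::new);

    private HintContext() { }

    // Start a fresh hint scope on the current thread.
    static HintContext open() {
        HINTS.get().clear();
        return new HintContext();
    }

    // Record the sharding value to use for a logical table.
    HintContext shardingValue(String logicTable, long value) {
        HINTS.get().put(logicTable, value);
        return this;
    }

    static Optional<Long> lookup(String logicTable) {
        return Optional.ofNullable(HINTS.get().get(logicTable));
    }

    // Clearing the ThreadLocal matters when threads are pooled and reused.
    @Override
    public void close() {
        HINTS.remove();
    }
}

final class HintAwareRouter {
    // With a hint, route to one shard; without one, fall back to all shards ("_*").
    static String route(String logicTable, int shardCount) {
        return HintContext.lookup(logicTable)
                .map(v -> logicTable + "_" + Math.floorMod(v, (long) shardCount))
                .orElse(logicTable + "_*");
    }
}
```

A caller would wrap the statement in try-with-resources, for example `try (HintContext h = HintContext.open().shardingValue("t_order", 10L)) { ... }`, so the hint is always cleared once the statement has run.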

Performance Optimizations

Key optimizations for high‑throughput scenarios include:

SQL parsing result cache

JDBC metadata cache

Binding tables and broadcast tables

Automated execution engine with streaming merge

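Binding tables are tables that share the same sharding rule, which lets the engine route joins between them without producing a Cartesian product of physical tables. For the query below, assuming each logical table is split into two physical tables, routing without a binding relationship generates four SQL statements; with the binding relationship it generates two.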

SELECT i.* FROM t_order o JOIN t_order_item i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
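To make the four-versus-two count concrete, the following sketch enumerates the routed statements for this query. It assumes two physical tables per logical table, named `t_order_0`/`t_order_1` and `t_order_item_0`/`t_order_item_1`, and that the values 10 and 11 fall into different shards; the class `BindingTableDemo` is hypothetical and only mimics the routing outcome, not the engine itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of routing outcomes; table suffixes and shard count are assumptions.
final class BindingTableDemo {
    private static final String TEMPLATE =
            "SELECT i.* FROM %s o JOIN %s i ON o.order_id=i.order_id WHERE o.order_id in (10, 11)";

    // No binding relationship: the router cannot assume matching shards,
    // so it emits every combination of physical tables (Cartesian product).
    static List<String> routeWithoutBinding(int shards) {
        List<String> sqls = new ArrayList<>();
        for (int o = 0; o < shards; o++) {
            for (int i = 0; i < shards; i++) {
                sqls.add(String.format(TEMPLATE, "t_order_" + o, "t_order_item_" + i));
            }
        }
        return sqls;
    }

    // Binding relationship: rows with the same order_id live in same-suffixed tables,
    // so only matching pairs need to be queried.
    static List<String> routeWithBinding(int shards) {
        List<String> sqls = new ArrayList<>();
        for (int s = 0; s < shards; s++) {
            sqls.add(String.format(TEMPLATE, "t_order_" + s, "t_order_item_" + s));
        }
        return sqls;
    }
}
```

With `shards = 2`, the first method returns four statements and the second returns two, which matches the counts described above.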

Broadcast tables exist in every shard, eliminating cross‑shard joins for small reference data such as dictionaries.
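The streaming merge listed among the optimizations can be pictured as a k-way merge: each shard returns rows already sorted, and the merger pulls one row at a time instead of loading every result set into memory. The sketch below is a generic priority-queue merge over integer sort keys, written for illustration; `StreamingMerge` is a hypothetical class and does not reflect the engine's actual code.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical k-way merge over per-shard sorted results; not ShardingSphere's merger.
final class StreamingMerge implements Iterator<Integer> {
    // Tracks the current head row of one shard's result stream.
    private static final class Cursor {
        final Iterator<Integer> source;
        int current;

        Cursor(Iterator<Integer> source) {
            this.source = source;
            this.current = source.next();
        }
    }

    private final PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a.current, b.current));

    StreamingMerge(List<Iterator<Integer>> shardResults) {
        for (Iterator<Integer> it : shardResults) {
            if (it.hasNext()) {
                heap.add(new Cursor(it)); // empty shards are skipped
            }
        }
    }

    @Override
    public boolean hasNext() {
        return !heap.isEmpty();
    }

    @Override
    public Integer next() {
        Cursor top = heap.poll();
        if (top == null) {
            throw new NoSuchElementException();
        }
        int value = top.current;
        if (top.source.hasNext()) {
            // Advance only the shard we consumed from, then re-insert it.
            top.current = top.source.next();
            heap.add(top);
        }
        return value;
    }
}
```

Only one pending row per shard is held at any time, which is why this style of merge suits large ordered result sets.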

SQL Compatibility

ShardingSphere fully supports SQL that routes to a single node. In multi‑node scenarios it supports DQL, DML, DDL, DCL, TCL, and MySQL‑specific DAL statements, including pagination, distinct, sorting, grouping, aggregation, and joins (cross‑database joins are not supported). Detailed compatibility information is available at:

https://shardingsphere.apache.org/document/current/cn/features/sharding/use-norms/sql/

References

Official site: https://shardingsphere.apache.org/

GitHub repository: https://github.com/apache/incubator-shardingsphere
