Databases 17 min read

Understanding Database Sharding and Partitioning with ShardingSphere

This article explains the fundamentals of database sharding and partitioning, why they are needed for massive data volumes, various vertical and horizontal splitting strategies, routing algorithms, client vs. proxy deployment models, and the practical challenges and solutions when using ShardingSphere 5.x.

Architect
Architect
Architect
Understanding Database Sharding and Partitioning with ShardingSphere

Sharding, also known as database partitioning, is a technique used to split large tables and databases into smaller, more manageable pieces to improve performance and scalability when data volumes become massive.

The article first defines sharding as the combination of 分库 (database splitting) and 分表 (table splitting), which are often performed together, and explains that the main goal is to keep each individual database or table fast and responsive.

Reasons for sharding include limited storage capacity, connection limits, and the inability of a single MySQL instance to handle billions of rows efficiently; once data reaches tens of millions of rows, traditional indexing and optimization no longer suffice.

Typical scenarios for sharding are illustrated with a fictional site (fire100.top) where daily uploads grow from dozens to hundreds of thousands, eventually causing query latency and prompting a move to sharding.

Several sharding strategies are described:

Vertical splitting : separate databases by business domain (order, payment, user, etc.) and optionally split wide tables into narrower ones.

Horizontal splitting : distribute rows of the same table across multiple databases or tables using algorithms such as modulo, range, range‑plus‑modulo, geographic, or predefined mappings.

Examples of routing algorithms include:

show variables like '%max_connections%'

and the modulo rule hash(order_no) mod N to decide which database a particular order belongs to.

The article also compares client mode, where the application directly executes sharded SQL against multiple databases, with proxy mode, where a proxy service intercepts MySQL traffic, performs routing, and returns results, discussing trade‑offs in performance, complexity, upgrade impact, and observability.

Common challenges after sharding are highlighted, such as pagination, sorting, cross‑node joins, distributed transactions (using Seata or XA), generating globally unique IDs, managing thousands of physical tables, and migrating historical data.

Finally, the author notes that the series will dive deeper into ShardingSphere 5.x features, source code analysis, and practical demos, and invites readers to follow the public account for more technical shares.

distributed architectureMySQLShardingSpheredatabase shardingpartitioningHorizontal Splittingvertical splitting
Architect
Written by

Architect

Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.