Understanding Database Sharding: Vertical and Horizontal Partitioning Strategies
This article explains why and how to apply vertical (business‑oriented) and horizontal (data‑oriented) sharding, covering single‑database, table splitting, database splitting, combined sharding, read‑write separation, routing algorithms, and real‑world case studies to improve scalability and performance.
In high‑concurrency systems, sharding (splitting databases and tables) is an essential technique and a common interview topic for large tech companies.
1 Vertical Direction
The vertical direction focuses on business separation.
1.1 Single Database
At the early stage, the system is simple, so a single database with multiple business tables is used to reduce dependencies and speed up development.
1.2 Table Splitting
As the system grows, tables become wide and hard to maintain. Splitting a user table into a user_basic table (core fields) and a user_extension table (non‑core fields) improves clarity and query efficiency.
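As a minimal sketch of this split, the snippet below uses an in-memory SQLite database. Only the table names user_basic and user_extension come from the text above; the columns (username, phone, bio, hobbies) and both helper functions are hypothetical and chosen purely for illustration.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Core fields that almost every request needs.
    CREATE TABLE user_basic (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        phone    TEXT
    );
    -- Non-core fields, keyed by the same user_id (hypothetical columns).
    CREATE TABLE user_extension (
        user_id  INTEGER PRIMARY KEY REFERENCES user_basic(user_id),
        bio      TEXT,
        hobbies  TEXT
    );
""")
conn.execute("INSERT INTO user_basic VALUES (1, 'alice', '555-0100')")
conn.execute("INSERT INTO user_extension VALUES (1, 'Likes hiking', 'hiking,chess')")


def get_login_info(conn, user_id):
    """Hot path: touches only the narrow core table."""
    return conn.execute(
        "SELECT username, phone FROM user_basic WHERE user_id = ?",
        (user_id,),
    ).fetchone()


def get_profile(conn, user_id):
    """Cold path: joins in the extension table only when it is needed."""
    return conn.execute(
        """
        SELECT b.username, e.bio
        FROM user_basic b
        LEFT JOIN user_extension e ON e.user_id = b.user_id
        WHERE b.user_id = ?
        """,
        (user_id,),
    ).fetchone()
```

Frequent queries read only the narrow table, and the join is paid only on pages that actually show the extended fields.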
1.3 Database Splitting
After many iterations, the system becomes complex. By grouping related tables into separate databases (e.g., user, product, logistics, order), each domain can focus on its own tables, reducing coupling.
1.4 Combined Database and Table Splitting
Some scenarios (e.g., financial systems) need database splitting and table splitting together, such as one database per year, each holding twelve monthly tables of user fund records.
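One way to picture the yearly-database, monthly-table layout is a small routing function that maps a record's date to its location. The naming scheme (fund_<year>, user_fund_<month>) and the function name are assumptions made for this sketch; the article does not specify them.

```python
from datetime import date
from typing import Tuple


def route_fund_record(record_date: date) -> Tuple[str, str]:
    """Return (database, table) for a fund record dated record_date.

    Hypothetical naming: one database per year, twelve tables per database.
    """
    database = f"fund_{record_date.year}"
    table = f"user_fund_{record_date.month:02d}"  # zero-padded month, 01..12
    return database, table


# A record from March 2023 lands in database fund_2023, table user_fund_03.
location = route_fund_record(date(2023, 3, 15))
```

Because the route depends only on the date, old years can be archived or moved to cheaper storage as whole databases.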
2 Horizontal Direction
The horizontal direction targets data distribution.
2.1 Single Database (Master)
When user count is low, a single master database containing many business tables handles both reads and writes.
2.2 Master‑Slave Read/Write Separation
As traffic grows, read requests dominate. Separating read replicas from the write master (one‑master‑one‑slave or one‑master‑multiple‑slaves) alleviates connection pressure and improves availability.
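The routing decision behind read/write separation can be sketched in a few lines. This is an illustrative router, not the API of any particular middleware: the class name, the node names, and the rule "statements starting with SELECT are reads" are all simplifying assumptions.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over replicas; fall back to the master if there are none.
        self._read_cycle = itertools.cycle(replicas or [master])

    def route(self, sql: str) -> str:
        # Simplified classification: anything that is not a SELECT is a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.master


# Hypothetical node names for a one-master-two-slaves setup.
router = ReadWriteRouter("master", ["replica-1", "replica-2"])
```

With asynchronous replication, a replica may briefly lag behind the master, so reads that must see a just-committed write are usually sent to the master as well.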
2.3 Database Splitting
If a single domain (e.g., user) experiences massive write traffic, multiple identical user databases can be created, each storing a disjoint subset of users.
2.4 Table Splitting
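When a single table grows very large (around 10 million rows is a common rule of thumb, although the practical limit depends on row size, indexes, and hardware), splitting it into multiple tables reduces index size and query latency.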
2.5 Combined Database and Table Splitting
For very large scale, both dimensions are applied. Users are routed to one of several databases, then to one of several tables within that database using routing algorithms such as modulo, range partitioning, or consistent hashing.
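Modulo: id % 4 routes each row to one of four tables. It is simple and spreads data evenly, but changing the table count remaps most keys.
Range: contiguous ID intervals map to specific tables. New tables are easy to append, but the newest IDs can concentrate load on one table.
Consistent hash: places keys and nodes on a hash ring, so adding or removing a node moves only a fraction of the keys.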
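The three routing algorithms can be sketched side by side as follows. The range boundaries, node names, virtual-node count, and the use of SHA-256 are assumptions for illustration; production routers differ in these details.

```python
import bisect
import hashlib


def route_modulo(key: int, shard_count: int = 4) -> int:
    """Modulo routing: id % 4 picks one of four tables."""
    return key % shard_count


# Hypothetical exclusive upper bounds; IDs at or above the last bound
# fall into one final open-ended shard.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]


def route_range(key: int) -> int:
    """Range routing: return the index of the interval containing key."""
    return bisect.bisect_right(RANGE_UPPER_BOUNDS, key)


class ConsistentHashRing:
    """Consistent hashing with virtual nodes on a sorted ring."""

    def __init__(self, nodes, virtual_nodes: int = 100):
        ring = []
        for node in nodes:
            # Each physical node owns many points to even out the distribution.
            for i in range(virtual_nodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.sha256(value.encode()).hexdigest(), 16)

    def route(self, key) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._hashes, self._hash(str(key)))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing(["table_0", "table_1", "table_2", "table_3"])
```

The practical difference shows up when capacity changes: going from four to five shards with modulo reassigns most keys, whereas adding a fifth node to the ring reassigns only the keys that now fall to the new node.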
3 Real‑World Cases
3.1 Database Splitting for Game Platforms
Each game vendor gets its own database because login traffic is high and schema may differ.
3.2 Table Splitting for a Points‑Based Membership System
Points transactions generate a massive number of rows, so a single database with 128 tables is used, with rows routed by userId % 128.
3.3 Combined Sharding for a Restaurant Ordering System
During peak meal times, both concurrency and data volume are high; the solution uses Sharding‑JDBC with four databases, each containing 32 tables (128 tables in total).
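The article does not say how rows are assigned across the four databases and 32 tables, so the following is only one plausible scheme, with hypothetical database and table names. It derives the database from the low part of the ID and the table from the remaining part, so the two choices stay independent.

```python
DB_COUNT = 4
TABLES_PER_DB = 32


def route_order(user_id: int):
    """Map a user ID to (database, table) in a 4 x 32 layout.

    Assumed scheme, not taken from the article: the database comes from
    user_id % 4 and the table from the quotient, so all 128 slots are used.
    """
    db_index = user_id % DB_COUNT
    table_index = (user_id // DB_COUNT) % TABLES_PER_DB
    return f"order_db_{db_index}", f"t_order_{table_index:02d}"


# User 131: 131 % 4 = 3 and (131 // 4) % 32 = 0.
location = route_order(131)
```

Using user_id % 32 for the table would be a mistake here: because 4 divides 32, the table index modulo 4 would always equal the database index, and each database would fill only 8 of its 32 tables.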
4 Summary
Vertical sharding splits by business domain and is the simpler of the two; horizontal sharding splits by data and targets specific performance bottlenecks. Database splitting relieves connection and I/O limits, table splitting reduces the cost of querying large tables, and combined sharding tackles both.
Choose the appropriate strategy based on whether read/write concurrency, data volume, or both are the primary challenge.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column and of the open-source project "Spring Cloud in Practice PassJava", and independent developer of a PMP practice quiz mini-program.