Mastering Database Sharding: When and How to Split Databases and Tables
This article explains why and how to split databases and tables in high‑concurrency systems, covering vertical (business‑domain) and horizontal (data‑volume) sharding, practical architectures, read‑write separation, routing algorithms, and real‑world case studies from gaming, membership points, and restaurant ordering platforms.
Preface
In high‑concurrency systems, database sharding (splitting databases and tables) is often unavoidable, and it is a common interview topic at large companies.
Why do we need sharding? It can be considered from two dimensions: vertical and horizontal.
1 Vertical Direction
Vertical sharding focuses on business domains.
1.1 Single Database
In the early stage, business is simple and modules are few, so a single database with multiple tables is used to speed iteration and reduce complexity.
1.2 Table Splitting
As features grow, a single table becomes large and hard to maintain. Splitting the user table into a basic info table and an extension table separates core and non‑core data, improves clarity, and aligns with access frequency.
The core user table stores username, password, phone, email, age, gender, and similar fields that are queried frequently.
The extension table stores less frequently needed data such as organization and location.
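A minimal sketch of this split, using Python's built-in sqlite3 module with an in-memory database. The table names user_base and user_ext, their columns, and the sample row are illustrative assumptions, not the schema of any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Core table: small, frequently queried columns.
    CREATE TABLE user_base (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        phone    TEXT,
        email    TEXT
    );
    -- Extension table: rarely needed columns, sharing the same key.
    CREATE TABLE user_ext (
        user_id      INTEGER PRIMARY KEY REFERENCES user_base(id),
        organization TEXT,
        location     TEXT
    );
""")
conn.execute("INSERT INTO user_base VALUES (1, 'alice', '555-0100', 'alice@example.com')")
conn.execute("INSERT INTO user_ext VALUES (1, 'Acme', 'Berlin')")


def get_login_info(user_id):
    """Hot path: touches only the core table."""
    return conn.execute(
        "SELECT username, phone FROM user_base WHERE id = ?", (user_id,)
    ).fetchone()


def get_full_profile(user_id):
    """Cold path: joins in the extension table only when it is needed."""
    return conn.execute(
        "SELECT b.username, e.organization, e.location "
        "FROM user_base b LEFT JOIN user_ext e ON e.user_id = b.id "
        "WHERE b.id = ?",
        (user_id,),
    ).fetchone()
```

Frequent queries such as get_login_info read only the narrow core table, while the join is paid only when the full profile is requested.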
1.3 Database Splitting
After a year of development, the system becomes complex. Tables are grouped by domain into separate databases (user, product, logistics, order).
For illustration, only one table per database is shown.
After domain splitting, each domain only concerns its own tables, making maintenance easier.
1.4 Database and Table Splitting
Sometimes neither split is sufficient on its own. For example, a financial system that stores every user's fund records may need to split by both year and month, such as one database per year and one table per month.
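One way to read the financial example is one database per year with one table per month. The sketch below routes a record by its date under that assumption. The names fund_<year> and fund_flow_<month> are hypothetical.

```python
from datetime import date


def route_fund_record(day):
    """Return (database, table) for a fund record dated `day`."""
    database = f"fund_{day.year}"          # one database per year (assumed layout)
    table = f"fund_flow_{day.month:02d}"   # one table per month (assumed layout)
    return database, table
```

With this layout, a monthly report reads a single table and a yearly report stays within a single database.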
2 Horizontal Direction
Horizontal sharding focuses on data volume.
2.1 Single Database
Initially, with few users, a single master database with multiple tables suffices.
2.2 Master‑Slave Read/Write Separation
As the user base grows, read requests dominate. To avoid exhausting database connections, read and write workloads are separated using a master‑slave architecture.
Initially one master and one slave are used. As read traffic keeps growing, more slaves can be added behind the same master.
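The routing decision behind read/write separation can be sketched as follows. ReadWriteRouter and the node names are hypothetical, and the SELECT check is a deliberately naive stand-in for the statement classification that a real driver or proxy would perform.

```python
import itertools


class ReadWriteRouter:
    """Sends writes to the master and spreads reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master if no slave is configured.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        # Naive classification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.master


router = ReadWriteRouter("master", ["slave-1", "slave-2"])
```

Adding a slave only means extending the list passed to the router. Because replication is usually asynchronous, a read that must see a write made just before it may still need to go to the master.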
2.3 Database Splitting
If write load becomes high (e.g., user registration), multiple user databases are created.
2.4 Table Splitting
When a single table reaches tens of millions of rows, performance degrades even with indexes. Splitting tables keeps each under about ten million rows, reducing query time and CPU usage.
2.5 Combined Database and Table Splitting
For large‑scale systems, both are applied, and a routing algorithm determines which database and table each row goes to. Common choices:
Modulo by ID (e.g., id % 4)
Range partition (e.g., 0‑100k in table 0, 100k‑200k in table 1)
Consistent hashing
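The three routing approaches listed above can be sketched as follows. The function and class names, the default shard counts, the use of SHA-256, and the 100 virtual nodes per shard are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def route_modulo(record_id, shard_count=4):
    """Modulo routing: id % 4 spreads IDs evenly over shards 0..3."""
    return record_id % shard_count


def route_range(record_id, range_size=100_000):
    """Range routing: IDs 0..99,999 go to shard 0, 100,000..199,999 to shard 1."""
    return record_id // range_size


class ConsistentHashRing:
    """Consistent hashing with virtual nodes placed on a hash ring."""

    def __init__(self, nodes, replicas=100):
        # Each node contributes `replicas` points to the ring.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)

    def route(self, key):
        # First virtual node clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._points, self._hash(str(key))) % len(self._ring)
        return self._ring[index][1]
```

Modulo routing spreads rows evenly but remaps many IDs when the shard count changes. Range routing is easy to extend, though newly issued IDs all land on the latest shard. With consistent hashing, adding a shard moves only the keys that now fall on the new shard's points.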
3 Real Cases
3.1 Database Splitting
A game platform gives each game its own database, which lets each game use a different schema.
3.2 Table Splitting
The points system of a membership program uses a single points database with 128 tables, routed by a hash of the user ID.
Table count should be a power of two for easier expansion.
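One common reading of this advice: if a power-of-two table count is always expanded by doubling, then under modulo routing each row either stays in its table or moves to exactly one new table, the old index plus the old count. The sketch below checks that for 128 tables growing to 256. The helper names are hypothetical, and plain modulo stands in for whatever hash the real system uses.

```python
def table_index(user_id, table_count):
    """Modulo routing by user ID."""
    return user_id % table_count


def moved_rows(user_ids, old_count, new_count):
    """Return the user IDs whose table changes when the table count grows."""
    return [
        u for u in user_ids
        if table_index(u, old_count) != table_index(u, new_count)
    ]


def stays_or_moves_to_sibling(user_id, old_count):
    """True if doubling keeps the row in place or moves it to index + old_count."""
    old = table_index(user_id, old_count)
    new = table_index(user_id, old_count * 2)
    return new in (old, old + old_count)
```

Doubling moves about half of the rows, and every moved row has a single known destination, so each old table can be split independently. Growing from 128 to 129 tables, by contrast, relocates almost every row.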
3.3 Combined Database and Table Splitting
A restaurant ordering system uses Sharding‑JDBC with four databases, each holding 32 tables.
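A layout of four databases with 32 tables each gives 128 slots in total. The sketch below shows one possible way to map a key onto those slots. It is not Sharding‑JDBC configuration: the choice of order ID as the sharding key, the modulo scheme, and the names order_db_N and t_order_N are all assumptions.

```python
DB_COUNT = 4
TABLES_PER_DB = 32


def route_order(order_id):
    """Map an order ID to a (database, table) pair out of 4 x 32 = 128 slots."""
    slot = order_id % (DB_COUNT * TABLES_PER_DB)   # 0..127
    db_index = slot // TABLES_PER_DB               # 0..3
    table_index = slot % TABLES_PER_DB             # 0..31
    return f"order_db_{db_index}", f"t_order_{table_index}"
```

Computing a single slot first and then deriving both indexes from it ensures that all 128 database and table combinations are used.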
4 Summary
Vertical sharding (by business domain) is the simpler of the two. Horizontal sharding (by data volume) requires choosing what to split, based on concurrency and data volume:
Database splitting solves connection and I/O bottlenecks.
Table splitting solves large‑table query performance and CPU consumption.
Combined splitting solves both.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
