Databases 12 min read

Why and How to Implement Database Sharding: Strategies, Middleware, and Best Practices

Database sharding—splitting data across multiple databases and tables—is essential as applications scale, and this article explains the motivations, differences between sharding and partitioning, horizontal and vertical splits, common middleware options like Cobar, TDDL, Atlas, ShardingSphere, and Mycat, and how to choose the right solution.

21CTO

Mar 12, 2023

Why and How to Implement Database Sharding: Strategies, Middleware, and Best Practices

Why Do We Need Database Sharding?

Sharding (splitting databases) and table partitioning are two distinct concepts; they should not be confused. A system may split only databases, only tables, or both.

Consider a startup that begins with 200,000 registered users, 10,000 daily active users, and 1,000 new rows per day. After rapid growth, the user base can reach tens of millions, daily active users in the millions, and peak concurrency of thousands of requests per second, quickly exhausting a single database’s capacity.

When a single table reaches tens of millions of rows and concurrent requests hit several thousand, the database’s disk usage and query performance degrade sharply.

Table Partitioning

When a table grows to tens of millions of rows, performance suffers; splitting the table is necessary. Table partitioning means storing a table’s data across multiple tables, often by user ID, keeping each table’s row count within a manageable range (e.g., under 2 million rows).

Database Partitioning

One database typically handles up to about 2,000 concurrent requests; to stay healthy, keep per‑database concurrency around 1,000. Splitting data across multiple databases distributes load and storage.

Thus, sharding (both database and table splitting) becomes essential as business scales.

Before Sharding

After Sharding

Concurrency Support

MySQL single‑instance cannot handle high concurrency

MySQL scaled to multiple instances, supporting many times higher concurrency

Disk Usage

Single‑instance disk nearly full

Multiple databases reduce disk usage per server

SQL Performance

Large single table makes SQL slow

Reduced table size improves SQL execution speed

Which Sharding Middleware Have You Used?

Common middleware options include:

Cobar

TDDL

Atlas

Sharding‑jdbc (now ShardingSphere)

Mycat

Cobar

Developed by Alibaba’s B2B team, a proxy‑layer solution that parses SQL and routes it to appropriate MySQL instances. It is now rarely used and lacks support for read/write splitting, stored procedures, cross‑database joins, and pagination.

TDDL

Developed by the Taobao team, a client‑layer solution supporting basic CRUD and read/write splitting, but not joins or multi‑table queries, and depends on Alibaba’s Diamond configuration system.

Atlas

360’s open‑source proxy solution, once used by some companies but no longer actively maintained.

Sharding‑jdbc / ShardingSphere

Originally an open‑source client‑layer solution from Dangdang, now renamed ShardingSphere. It supports sharding, read/write splitting, distributed ID generation, and flexible transactions. The latest version (4.0.0‑RC1) is actively maintained and widely adopted.

Mycat

Based on Cobar, a proxy‑layer solution with comprehensive features. It is popular and actively maintained, though younger than ShardingSphere.

Summary

For most cases, Sharding‑jdbc and Mycat are the recommended choices. Sharding‑jdbc’s client‑layer approach offers low operational cost and high performance but introduces coupling when upgrading. Mycat’s proxy‑layer approach requires deployment and higher maintenance effort but provides transparency to applications.

Generally, small‑to‑medium companies benefit from Sharding‑jdbc, while larger enterprises may prefer Mycat for its scalability and team‑level support.

How to Split a Database?

Horizontal Partitioning distributes rows of a table across multiple databases/tables with identical schemas, balancing data and load.

Vertical Partitioning splits a wide table into multiple tables or databases, each containing a subset of columns, placing frequently accessed columns in a separate table to improve cache efficiency.

Common splitting strategies include:

Range‑based splitting (e.g., by time range) – simple to expand but may cause hotspot traffic on recent data.

Hash‑based splitting – evenly distributes load but requires data migration when scaling.

Middleware can automatically route queries to the appropriate database and table based on a chosen sharding key (e.g., user ID).

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

scalability Middleware MySQL database sharding Horizontal Partitioning

Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.