Database Sharding and Partitioning Strategies with Practical Implementation
This article explains why high‑traffic systems need database sharding, compares table‑level and database‑level partitioning methods, presents concrete MySQL table‑creation scripts and Java service code, and discusses the advantages, drawbacks, and common middleware solutions such as Sharding‑JDBC and Mycat.
Overview
High concurrency and massive data growth require continuous system upgrades; partitioning tables (sharding) is a key technique to handle big data and high QPS scenarios.
Why Partition Tables
When daily active users reach tens of millions and a single table accumulates millions of rows, query performance degrades and storage capacity becomes a bottleneck, making single‑database solutions untenable.
Table Partitioning
3.1 Partitioning Scheme
When a table reaches tens of millions of rows, its size severely impacts SQL execution performance, prompting the need to split the table.
Common approach: use ID modulo to distribute rows across multiple tables (e.g., customer0‑customer3) based on the remainder of cusno % 4 .
create table **customer0**(
id int unsigned primary key,
name varchar(32) not null default '',
pwd varchar(32) not null default ''
)engine=myisam charset utf8;
create table **customer1**(
id int unsigned primary key,
name varchar(32) not null default '',
pwd varchar(32) not null default ''
)engine=myisam charset utf8;
create table **customer2**(
id int unsigned primary key,
name varchar(32) not null default '',
pwd varchar(32) not null default ''
)engine=myisam charset utf8;
create table **customer3**(
id int unsigned primary key,
name varchar(32) not null default '',
pwd varchar(32) not null default ''
)engine=myisam charset utf8;
create table **uuid**(
id int unsigned primary key auto_increment
)engine=myisam charset utf8;Java service demonstrates how to generate a unique ID, determine the target table, and insert or query data:
@Service
public class CustomerService {
@Autowired
private JdbcTemplate jdbcTemplate;
/**
* Register a user
*/
public String register(String name, String pwd) {
String insertUUidSql = "insert into uuid values(null)";
jdbcTemplate.update(insertUUidSql);
Long cusno = jdbcTemplate.queryForObject("select last_insert_id()", Long.class);
String tableName = "customer" + cusno % 3;
String insertUserSql = "INSERT INTO " + tableName + " VALUES ('" + cusno + "','" + name + "','" + pwd + "');";
System.out.println("insertUserSql:" + insertUserSql);
jdbcTemplate.update(insertUserSql);
return "success";
}
/**
* Get name by cusno
*/
public String get(Long cusno) {
String tableName = "customer" + cusno % 3;
String sql = "select name from " + tableName + " where id=" + cusno;
System.out.println("SQL:" + sql);
return jdbcTemplate.queryForObject(sql, String.class);
}
}Pros and Cons of Table Partitioning
Advantages: Data is evenly distributed across tables, reducing pressure on a single DB during high concurrency.
Disadvantages: Data migration is required when scaling, and cross‑table queries become more complex.
Database Partitioning (Sharding)
When a single database can only handle ~2000 QPS, horizontal scaling by splitting data across multiple databases becomes necessary.
Vertical Splitting: Separate tables with low coupling into different databases, often aligned with micro‑service boundaries.
Horizontal Splitting: Distribute rows of a large table across multiple databases based on range or modulo, enabling scale‑out.
Typical strategies include:
Range‑based sharding (e.g., user IDs 1‑9999 go to DB1, 10000‑19999 to DB2).
Modulo‑based sharding (e.g., userId % N).
Challenges: cross‑database transactions, join queries, data routing, and assembly.
Solutions
Open‑source middleware such as Sharding‑JDBC (client‑side) and Mycat (proxy‑side) are commonly used.
Sharding‑JDBC: No deployment needed, low operational cost, high performance, but introduces tight coupling to the client.
Mycat: Requires a dedicated proxy deployment, higher operational overhead, but provides transparent sharding for all services.
Recommendation: Small‑to‑medium companies may prefer Sharding‑JDBC for simplicity, while large enterprises often choose Mycat for better isolation and manageability.
Additional Topics
Load balancing via Nginx, primary‑key generation strategies, and read‑write separation are also discussed as complementary techniques for handling massive traffic.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.