Mastering MySQL Scaling: Sharding, Replication, and Read/Write Separation

This article explains how to scale MySQL for large‑scale, high‑concurrency applications by using business splitting, master‑slave replication, database sharding and partitioning, and provides practical implementation strategies, code examples, and references to useful tools and middleware.

Java Backend Technology

Jan 2, 2017

1. MySQL Scaling Approaches

As business scale grows, appropriate solutions are needed to handle increasing data volume and access pressure. Database scaling mainly includes business splitting, master‑slave replication, and database sharding/partitioning. This article focuses on sharding and partitioning.

(1) Business Splitting

Early-stage systems often adopt a centralized architecture for rapid development. As the system expands, the single database becomes a bottleneck, and development efficiency drops while hardware costs rise. Splitting each business module (e.g., users, shops, comments, orders) into its own database turns one database dependency into several and spreads the load across them, improving overall throughput and capacity.

(2) Master‑Slave Replication

Replication works by having the slave read the master's binary log (binlog) and replay the recorded operations locally. Because replication is asynchronous by default, the slave may lag behind the master. It converges to the master's state once it has replayed the outstanding binlog events, so reads served from a slave can briefly return stale data.

Figure: the data synchronization process between a MySQL master and slave.
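Read/write separation builds directly on replication: writes go to the master, and reads are spread across the slaves. The following is a minimal sketch of that routing decision, not a production implementation. The class name ReadWriteRouter, its method names, and the use of plain strings to stand in for data sources are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical router: writes go to the master, reads rotate over the slaves. */
final class ReadWriteRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = new ArrayList<>(slaves); // defensive copy
    }

    /** All writes must hit the master, because only the master produces the binlog. */
    String forWrite() {
        return master;
    }

    /** Ordinary reads are spread round-robin over the slaves (master if there are none). */
    String forRead() {
        if (slaves.isEmpty()) {
            return master;
        }
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(index);
    }

    /** Reads that must see the caller's own latest write avoid replication lag by using the master. */
    String forReadAfterWrite() {
        return master;
    }
}
```

With one master and two slaves, successive calls to forRead() alternate between the two slaves, while forWrite() and forReadAfterWrite() always return the master.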

(3) Database Sharding and Partitioning

When a single machine reaches its physical limits, adding more machines distributes the load. Partitioning (splitting one large table into several smaller tables) improves query performance on massive tables, while sharding (splitting data across database instances on separate machines) relieves the I/O bottleneck of a single server.

2. Partitioning Strategy

Keywords: user ID, table capacity

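Most business data is related to a user ID, making it a natural routing key. Taking user_id % N (where N is the number of tables) as the table index spreads rows roughly evenly across the tables, provided the user IDs themselves are evenly distributed (for example, auto-increment IDs).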

Example: an order table in an e‑commerce platform.

If we assume 100 tables, the routing logic is:

For user_id = 101, the target table is determined by 101 % 100 = 1, i.e., order_1.

select * from order_1 where user_id = 101
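The table lookup above can be sketched in Java as follows. TableRouter and tableFor are hypothetical names introduced here, and the "order_" prefix simply follows the example. Math.floorMod is used instead of the % operator so that a negative ID still yields a valid index, which is an extra safeguard the original does not discuss.

```java
/** Hypothetical helper: picks the physical order table for a user. */
final class TableRouter {
    private TableRouter() {}

    /** Returns "order_<index>", where index = userId mod tableCount. */
    static String tableFor(long userId, int tableCount) {
        if (tableCount <= 0) {
            throw new IllegalArgumentException("tableCount must be positive");
        }
        // floorMod keeps the index in [0, tableCount) even for negative IDs.
        long index = Math.floorMod(userId, (long) tableCount);
        return "order_" + index;
    }
}
```

For example, TableRouter.tableFor(101, 100) returns "order_1", matching the query above.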

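In practice, MyBatis can inject a dynamic table name through its ${} string substitution. The images in the original article showed a typical interface definition and XML mapper configuration in which ${tableNum} injects the computed table number. Because ${} is inserted into the SQL text without escaping, the value should always be computed by the application and never taken directly from user input.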

If user IDs are UUIDs, they should first be hashed to an integer before applying the modulo operation.
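One possible way to turn a UUID into a table index is sketched below. Folding the two 64-bit halves with XOR is an assumption chosen for simplicity, not a method prescribed by the article, and UuidRouter and indexFor are hypothetical names. Any hash works as long as every application node computes it identically.

```java
import java.util.UUID;

/** Hypothetical helper: maps a UUID-style user ID onto one of bucketCount tables. */
final class UuidRouter {
    private UuidRouter() {}

    static int indexFor(String userId, int bucketCount) {
        if (bucketCount <= 0) {
            throw new IllegalArgumentException("bucketCount must be positive");
        }
        UUID uuid = UUID.fromString(userId);
        // Fold the 128 bits into one long. XOR is a simple choice, not the only one.
        long folded = uuid.getMostSignificantBits() ^ uuid.getLeastSignificantBits();
        // floorMod guarantees a result in [0, bucketCount) even when folded is negative.
        return (int) Math.floorMod(folded, (long) bucketCount);
    }
}
```

The same UUID always maps to the same index, which is the property routing depends on.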

3. Sharding (Database Splitting) Strategy

Sharding databases follows the same modulo principle as table partitioning. Using the user ID as the routing key, the system determines which database instance will store the data.

Keywords: user ID, database capacity

When the user ID is a UUID, hash it first and then apply the modulo.
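Choosing a database instance can be sketched the same way as choosing a table. In the sketch below, DatabaseRouter is a hypothetical class and the connection strings are placeholders. A real system would more likely hold a connection pool per instance than a bare URL.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: picks the database instance that stores a user's data. */
final class DatabaseRouter {
    private final List<String> jdbcUrls;

    DatabaseRouter(List<String> jdbcUrls) {
        if (jdbcUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one database is required");
        }
        this.jdbcUrls = new ArrayList<>(jdbcUrls); // defensive copy
    }

    /** Database index = userId mod number of databases. */
    int indexFor(long userId) {
        return (int) Math.floorMod(userId, (long) jdbcUrls.size());
    }

    /** Connection string (placeholder) of the database that owns this user. */
    String urlFor(long userId) {
        return jdbcUrls.get(indexFor(userId));
    }
}
```

With four databases, user 101 is routed to index 1, because 101 % 4 = 1.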

4. Combined Sharding and Partitioning Strategy

For large systems, both sharding and partitioning are needed. A common routing algorithm is:

1) intermediate = user_id % (number_of_databases * tables_per_database)
2) database_index = floor(intermediate / tables_per_database)
3) table_index = intermediate % tables_per_database

Example: 256 databases, each with 1024 tables, user_id = 262145.

1) intermediate = 262145 % (256 * 1024) = 1
2) database_index = floor(1 / 1024) = 0
3) table_index = 1 % 1024 = 1

The user data is routed to database 0, table 1.
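The three-step algorithm translates almost line for line into Java. ShardRouter and Target are hypothetical names used only for this sketch. Integer division on non-negative values plays the role of floor.

```java
/** Hypothetical helper implementing the combined database + table routing. */
final class ShardRouter {
    private ShardRouter() {}

    /** Routing result: which database, and which table inside that database. */
    static final class Target {
        final int databaseIndex;
        final int tableIndex;

        Target(int databaseIndex, int tableIndex) {
            this.databaseIndex = databaseIndex;
            this.tableIndex = tableIndex;
        }
    }

    static Target route(long userId, int databaseCount, int tablesPerDatabase) {
        if (databaseCount <= 0 || tablesPerDatabase <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        long totalTables = (long) databaseCount * tablesPerDatabase;
        // Step 1: position of the user among all tables across all databases.
        long intermediate = Math.floorMod(userId, totalTables);
        // Step 2: integer division acts as floor because intermediate >= 0.
        int databaseIndex = (int) (intermediate / tablesPerDatabase);
        // Step 3: offset of the table inside the chosen database.
        int tableIndex = (int) (intermediate % tablesPerDatabase);
        return new Target(databaseIndex, tableIndex);
    }
}
```

Calling ShardRouter.route(262145, 256, 1024) yields database 0 and table 1, the same result as the worked example.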

5. Summary of Sharding and Partitioning

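Various strategies exist: simple modulo by user ID, range partitioning, hash routing, etc. Hash routing offers uniform data distribution but complicates data migration, because changing the number of tables or databases changes the result of the modulo, so many existing rows no longer sit where the new routing rule expects them.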
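The migration cost of modulo routing can be made concrete with a small count. The sketch below is an illustration only: ReshardingCost and countMoved are hypothetical names, and user IDs are assumed to be the consecutive integers starting at 0.

```java
/** Hypothetical helper: counts how many users change table when the table count changes. */
final class ReshardingCost {
    private ReshardingCost() {}

    /** Counts IDs in [0, userCount) whose table index differs between the old and new layout. */
    static long countMoved(long userCount, int oldTables, int newTables) {
        long moved = 0;
        for (long id = 0; id < userCount; id++) {
            if (id % oldTables != id % newTables) {
                moved++;
            }
        }
        return moved;
    }
}
```

Under these assumptions, growing from 100 to 101 tables relocates 9,900 of the first 10,000 IDs, while doubling from 100 to 200 tables relocates 5,000. This is one reason capacity is often expanded by doubling rather than by adding a single table or database.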

After sharding and partitioning are applied, query performance and concurrency improve, but new challenges arise: distributed transactions, joins across tables or databases that become hard to perform, and the need for every query to carry the routing field explicitly.

Middleware such as Alibaba's Cobar can help manage sharding. Its GitHub repository is https://github.com/alibaba/cobar, and its documentation is at https://github.com/alibaba/cobar/wiki.

6. Conclusion

While sharding, replication, and read/write separation are fundamental techniques for building scalable, high‑performance websites, they are only part of the whole picture. Additional topics such as clustering, load balancing, disaster recovery, automatic failover, and transaction management also need to be mastered.

"The road ahead is long, but we will keep seeking knowledge."
