Practical MySQL Scaling and Sharding Strategies at 58.com under Big Data Loads

This article presents 58.com’s experience with MySQL under massive data volumes, covering core concepts such as single‑instance, sharding, replication and grouping, common availability and read‑write challenges, detailed sharding implementations for user, post, friend and order tables, and post‑sharding business practices including IN queries, non‑partition key queries, and cross‑database pagination.

Qunar Tech Salon

Overview – At the WOT 2015 conference, 58.com shared practical insights on operating MySQL at massive scale, focusing on concepts, problems, sharding techniques, and post‑sharding business scenarios.

1. Basic Concepts

Key terms include:

Single‑instance (single database)

Sharding (horizontal partitioning) for scalability

Replication and grouping for high availability

Combined sharding + grouping as the typical architecture for large‑scale MySQL deployments

2. Common Problems and Solution Ideas

Challenges:

Ensuring availability

Handling diverse read/write ratios

Seamless schema changes, data migration, and capacity expansion

Managing huge data volumes

Solutions include:

Replication (master‑slave, dual‑master) for availability

Read‑write separation, indexing, caching, or horizontal splitting based on workload patterns

Log‑based migration (write‑log → data copy → verification → cut‑over)

Sharding (splitting databases) for massive data sets
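As an illustration of the read‑write separation idea above, here is a minimal sketch of a router that sends writes to the master and spreads reads over the replicas. The class name, node names, and the round‑robin policy are assumptions made for this example, not details from the talk.

```python
import itertools


class ReplicaGroup:
    """One master plus read replicas; node names are hypothetical placeholders."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master when no replica is configured.
        self._readers = itertools.cycle(replicas or [master])

    def route(self, sql):
        """Return the name of the node that should run `sql`."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._readers)  # round-robin over the replicas
        return self.master  # INSERT/UPDATE/DELETE/DDL go to the master


group = ReplicaGroup("db-master", ["db-replica-1", "db-replica-2"])
```

With asynchronous replication a replica can lag behind the master, so a real router would typically also pin read‑after‑write and locking reads to the master; this sketch leaves that out.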

3. Sharding Practice at 58.com

Four typical scenarios covering 99% of sharding cases:

User table (single‑key) : split by uid

Post table (one‑to‑many) : split by uid, embed shard identifier in tid

Friend table (many‑to‑many) : use data redundancy with multiple sharding strategies

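Order table (multi‑key) : two approaches – combine the two preceding schemes (embedded shard identifier plus data redundancy), or let the roughly 1% of requests that cannot be routed by the shard key query multiple shards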
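The user and post schemes can be sketched as follows. The summary does not say how the shard identifier is embedded in tid, so the modulo routing, the shard count, and the bit layout below are assumptions chosen only to make the idea concrete.

```python
SHARD_COUNT = 4  # assumed number of shards
SHARD_BITS = 8   # assumed: the low 8 bits of tid carry the shard id


def shard_for_uid(uid):
    """User table: route by the single key uid."""
    return uid % SHARD_COUNT


def make_tid(uid, sequence):
    """Post table: build a tid whose low bits record the author's shard.

    `sequence` stands in for whatever unique counter generates post ids.
    """
    return (sequence << SHARD_BITS) | shard_for_uid(uid)


def shard_for_tid(tid):
    """Recover the shard from tid alone, without knowing the uid."""
    return tid & ((1 << SHARD_BITS) - 1)
```

Because a post lives on its author's shard and its tid names that shard, lookups by tid and lookups by uid both reach exactly one shard.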

Examples of SQL used:

SELECT * FROM tiezi WHERE tid=$tid;
SELECT * FROM tiezi WHERE uid=$uid;
SELECT friend_uid FROM friend WHERE uid=$my_uid;
SELECT uid FROM friend WHERE friend_uid=$my_uid;
SELECT * FROM `order` WHERE oid=$oid;
SELECT * FROM `order` WHERE buyer_id=$my_uid;
SELECT * FROM `order` WHERE seller_id=$my_uid;
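The two friend queries above filter on different columns, which is why the friend table relies on data redundancy. A minimal sketch, using in‑memory dictionaries as stand‑ins for two physical copies of the table (one sharded by uid, one by friend_uid); the shard count and all function names are hypothetical:

```python
from collections import defaultdict

SHARD_COUNT = 4  # assumed number of shards


def shard_of(user_id):
    return user_id % SHARD_COUNT


# In-memory stand-ins for the two redundant copies of the friend table.
by_uid = [defaultdict(set) for _ in range(SHARD_COUNT)]         # sharded by uid
by_friend_uid = [defaultdict(set) for _ in range(SHARD_COUNT)]  # sharded by friend_uid


def add_friend(uid, friend_uid):
    """Record that `uid` lists `friend_uid`, writing both copies."""
    by_uid[shard_of(uid)][uid].add(friend_uid)
    by_friend_uid[shard_of(friend_uid)][friend_uid].add(uid)


def friends_of(my_uid):
    """SELECT friend_uid FROM friend WHERE uid=$my_uid, served by one shard."""
    return sorted(by_uid[shard_of(my_uid)].get(my_uid, ()))


def listed_by(my_uid):
    """SELECT uid FROM friend WHERE friend_uid=$my_uid, served by one shard."""
    return sorted(by_friend_uid[shard_of(my_uid)].get(my_uid, ()))
```

The cost of this design is that every write touches two shards, so the two copies have to be kept consistent by the application or by an asynchronous repair process.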

4. Business Practices After Sharding

Issues arise because some MySQL features no longer work across shards. Topics covered:

Complex SQL (joins, sub‑queries, triggers, UDFs) is discouraged because of its performance impact

IN‑queries on shard keys: either dispatch to each shard (Map‑Reduce style) or rewrite into multiple SQL statements per shard

Non‑partition‑key queries: either route to a single shard when possible or perform distributed processing with result aggregation

Cross‑shard pagination: strategies include

Single‑shard pagination using max(id) as a cursor

Distribute the LIMIT query to all shards, merge and sort a small result set, then return the required page

Introduce auxiliary IDs to reduce query volume

Business‑level constraints (disable deep pagination, allow fuzzy results)

Ultimate solution: rewrite ORDER BY + OFFSET + LIMIT into two‑phase queries, to be detailed at a later conference
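Two of the techniques above lend themselves to a short sketch: splitting an IN query by shard, and the "send the LIMIT query to every shard, then merge" form of pagination. The example assumes uid % N routing and represents each shard as an already sorted list of ids; both are simplifications for illustration. Note that for a global page at (offset, limit), each shard is asked for its first offset + limit rows, not for its own slice at that offset.

```python
import heapq
from collections import defaultdict
from itertools import islice

SHARD_COUNT = 3  # assumed number of shards


def split_in_query(uids):
    """Group the values of `WHERE uid IN (...)` by the shard that owns them."""
    groups = defaultdict(list)
    for uid in uids:
        groups[uid % SHARD_COUNT].append(uid)
    return dict(groups)


def fetch_page(shards, offset, limit):
    """Global `ORDER BY id LIMIT offset, limit` across several shards.

    Each element of `shards` is a list of ids sorted ascending, standing in
    for the result of `ORDER BY id LIMIT 0, offset + limit` on that shard.
    """
    per_shard = [rows[: offset + limit] for rows in shards]
    merged = heapq.merge(*per_shard)  # lazy k-way merge of sorted inputs
    return list(islice(merged, offset, offset + limit))
```

The amount of data pulled from each shard grows with the offset, which is one reason the business‑level option of disabling deep pagination appears in the list.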

5. Summary

Key takeaways:

Fundamental concepts: single instance, sharding, replication, grouping

Availability solved by redundancy; read‑write imbalance addressed with read‑only replicas, caching, or sharding

Seamless migration relies on log‑based or dual‑write approaches

Large data volumes are best handled by sharding, with four typical patterns for user, post, friend, and order data

Post‑sharding operations should avoid complex cross‑shard SQL, use distributed query techniques, and consider pagination optimizations

