Mastering MySQL Sharding: Strategies for 1 Billion Orders
This article explores the pain points of a 700‑million‑row MySQL order table, presents vertical and horizontal sharding strategies, introduces gene‑based Snowflake IDs, details routing logic, migration steps, common pitfalls, and shows performance gains after applying the final architecture.
Introduction
When a MySQL order table reaches 700 million rows, the platform experiences fatal issues such as 12‑second simple queries and 278‑second full‑table counts.
Core Contradictions
B+Tree index depth reaches five levels, causing a surge in disk I/O.
Single table size exceeds 200 GB, making backup windows longer than six hours.
Write concurrency hits 8,000 QPS, with master‑slave replication lag up to 15 minutes.
Key Insight: When a single table exceeds 50 million rows, a sharding plan should be initiated.
Given a scenario with one billion orders, how should we shard the database?
1. Sharding Core Strategies
1.1 Vertical Split – Reduce Data First
Optimization Effects:
Core table size reduced by 60 %.
High‑frequency query fields become cache‑friendly, improving hit rate.
1.2 Horizontal Split – The Ultimate Solution
Three Principles for Sharding Key Selection :
Discreteness : Avoid hot spots (e.g., user_id is better than status).
Business Relevance : 80 % of queries must include the key.
Stability : The value should not change with business logic (avoid using phone numbers).
Sharding Strategy Comparison :
Strategy Type | Applicable Scenario | Expansion Complexity
----------------------------------------------------------
Range Sharding | Queries with time range | Simple
Hash Modulo | Uniform distribution | Difficult
Consistent Hash| Dynamic scaling | Medium
Gene Sharding | Avoid cross‑shard queries| Complex2. Gene Sharding
High‑frequency queries in the order system:
User queries historical orders ( user_id).
Merchant queries orders ( merchant_id).
Customer service queries by order number ( order_no).
Solution:
Snowflake Order ID Generation :
public class OrderIdGenerator {
// 64‑bit ID: sign(1) + timestamp(41) + gene(12) + sequence(10)
private static final int GENE_BITS = 12;
public static long generateId(long userId) {
long timestamp = System.currentTimeMillis() - 1288834974657L;
// Extract lower 12 bits of userId as gene
long gene = userId & ((1L << GENE_BITS) - 1);
long sequence = ...; // obtain sequence
return (timestamp << 22) | (gene << 10) | sequence;
}
public static int getShardKey(long orderId) {
return (int) ((orderId >> 10) & 0xFFF); // extract middle 12 bits
}
}Routing Logic :
public class OrderShardingRouter {
// 8 databases, each with 16 tables
private static final int DB_COUNT = 8;
private static final int TABLE_COUNT_PER_DB = 16;
public static String route(long orderId) {
int gene = OrderIdGenerator.getShardKey(orderId);
int dbIndex = gene % DB_COUNT;
int tableIndex = gene % TABLE_COUNT_PER_DB;
return "order_db_" + dbIndex + ".orders_" + tableIndex;
}
}Key Breakthrough: Embedding the gene in the ID ensures that orders from the same user always land on the same shard and allows direct shard location from the order ID.
3. Cross‑Shard Queries
3.1 Heterogeneous Index Table Solution
Elasticsearch Index Structure :
{
"order_index": {
"mappings": {
"properties": {
"order_no": { "type": "keyword" },
"shard_key": { "type": "integer" },
"create_time": { "type": "date" }
}
}
}
}4.2 Global Secondary Index (GSI)
-- Create a global index in ShardingSphere
CREATE SHARDING GLOBAL INDEX idx_merchant ON orders(merchant_id)
BY SHARDING_ALGORITHM(merchant_hash)
WITH STORAGE_UNIT(ds_0, ds_1);4. Data Migration
Dual‑Write Migration Plan :
Gray‑Scale Switch Steps :
Enable dual‑write (rollback to old DB if new DB write fails).
Migrate historical data in batches.
Real‑time incremental data verification and auto‑repair.
Gradually shift traffic by user ID from 1 % to 100 %.
5. Pitfall Guide
5.1 Hotspot Issues
During a major sales event, all orders of a popular store were routed to the same shard.
Solution: Use a composite sharding key, e.g., (merchant_id + user_id) % 1024.
5.2 Distributed Transactions
Final consistency is achieved with RocketMQ:
// Final consistency implementation
@Transactional
public void createOrder(Order order) {
orderDao.insert(order); // write primary DB
rocketMQTemplate.sendAsync("order_create_event", order); // send message
}
@RocketMQMessageListener(topic = "order_create_event")
public void handleEvent(OrderEvent event) {
bonusService.addPoints(event.getUserId()); // async points
inventoryService.deduct(event.getSkuId()); // async inventory
}5.3 Pagination Traps
Cross‑shard pagination can cause page number inconsistencies.
Solution: Use Elasticsearch aggregation or limit queries to recent three months.
6. Ultimate Architecture
Performance Metrics :
Scenario | Before Split | After Split
------------------------------------------------
User order query | 3200 ms | 68 ms
Merchant export | timeout | 8 s
Full‑table stats | unavailable | 1.2 s (approx.)Conclusion
Sharding key selection matters more than effort: Gene sharding is the best partner for order systems.
Plan for growth: Design to support at least two years of data increase.
Avoid over‑design: Small‑table joins are far cheaper than distributed joins.
Monitor‑driven optimization: Focus on shards with skew > 15 %.
True architectural art lies in balancing division and integration.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
