How to Scale Billion‑Row MySQL Order Tables with Sharding and Gene‑Based Partitioning
This article explains how to handle a MySQL order table that has grown to billions of rows: vertical and horizontal sharding, gene‑based partitioning, cross‑shard query strategies, data migration techniques, and the practical pitfalls encountered along the way.
Preface
Scenario pain point: An e‑commerce platform’s MySQL order table reached 700 million rows, causing fatal issues such as a simple query taking 12 seconds and a full‑table count taking 278 seconds.
Core contradictions:
B+Tree index depth reached five levels, causing massive disk I/O.
Single table size exceeded 200 GB, making backup windows longer than six hours.
Write throughput hit 8 000 writes per second, with master‑slave replication lag of up to 15 minutes.
Key insight: When a single table exceeds 50 million rows, a sharding plan should be initiated.
Given a billion‑row order dataset, how should we shard it?
1 Sharding Core Strategies
1.1 Vertical Splitting: Reduce Data First
Optimization effects:
Core table size reduced by 60 %.
High‑frequency query fields concentrated, improving cache hit rate.
1.2 Horizontal Splitting: Ultimate Solution
Shard key selection three principles:
Discreteness: Avoid hotspots (e.g., user_id is better than status).
Business relevance: 80 % of queries must include the field.
Stability: Values should not change with business logic (avoid using phone numbers).
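The discreteness principle can be made concrete with a quick distribution check. The sketch below (hypothetical data, 8 shards) shows why a high‑cardinality key such as user_id spreads load across every shard, while a low‑cardinality key such as status can never use more shards than it has distinct values:

```java
import java.util.HashMap;
import java.util.Map;

public class ShardKeyCheck {
    static final int SHARDS = 8;

    // Count how many of the given key values land on each shard.
    static Map<Integer, Integer> distribution(long[] keys) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (long k : keys) {
            int shard = (int) (k % SHARDS);
            counts.merge(shard, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // High-cardinality key (user_id): all 8 shards receive rows.
        long[] userIds = new long[10_000];
        for (int i = 0; i < userIds.length; i++) userIds[i] = 100_000L + i;
        System.out.println("user_id shards used: " + distribution(userIds).size()); // 8

        // Low-cardinality key (status 0..2): only 3 shards are ever touched.
        long[] statuses = new long[10_000];
        for (int i = 0; i < statuses.length; i++) statuses[i] = i % 3;
        System.out.println("status shards used: " + distribution(statuses).size()); // 3
    }
}
```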
Shard strategy comparison:
| Strategy Type | Applicable Scenario | Expansion Complexity | Example |
|---|---|---|---|
| Range Sharding | Time‑range queries | Simple | create_time per month |
| Hash Modulo | Uniform distribution | Difficult | user_id % 128 |
| Consistent Hash | Dynamic scaling | Medium | Ketama algorithm |
| Gene Sharding | Avoiding cross‑shard queries | Complex | Extract gene from user_id |

2 Gene Sharding
High‑frequency queries in the order system:
User historical orders (user_id)
Merchant order lookup (merchant_id)
Customer service order lookup (order_no)
Solution:
Snowflake Order ID Generator:
public class OrderIdGenerator {
    // 64-bit ID: sign(1) + timestamp(41) + gene(12) + sequence(10)
    private static final int GENE_BITS = 12;
    private static final int SEQUENCE_BITS = 10;
    // Simplified in-process counter; a production generator needs a
    // per-millisecond sequence reset and clock-rollback handling.
    private static final java.util.concurrent.atomic.AtomicLong SEQUENCE =
            new java.util.concurrent.atomic.AtomicLong();

    public static long generateId(long userId) {
        long timestamp = System.currentTimeMillis() - 1288834974657L;
        // Extract the lower 12 bits of userId as the gene
        long gene = userId & ((1L << GENE_BITS) - 1);
        // Obtain the sequence number, wrapped to 10 bits
        long sequence = SEQUENCE.incrementAndGet() & ((1L << SEQUENCE_BITS) - 1);
        return (timestamp << 22) | (gene << SEQUENCE_BITS) | sequence;
    }

    // Extract the shard key (gene) from an order ID
    public static int getShardKey(long orderId) {
        return (int) ((orderId >> SEQUENCE_BITS) & 0xFFF); // middle 12 bits
    }
}

Routing logic:
public class OrderShardingRouter {
    // 8 databases, each with 16 tables (8 * 16 = 128 physical tables)
    private static final int DB_COUNT = 8;
    private static final int TABLE_COUNT_PER_DB = 16;

    public static String route(long orderId) {
        int gene = OrderIdGenerator.getShardKey(orderId);
        // Map the 12-bit gene onto all 128 physical tables. Taking gene % 8 for
        // the database and gene % 16 for the table would leave most tables empty,
        // because the two remainders are correlated; divide first instead.
        int slot = gene % (DB_COUNT * TABLE_COUNT_PER_DB);
        int dbIndex = slot / TABLE_COUNT_PER_DB;
        int tableIndex = slot % TABLE_COUNT_PER_DB;
        return "order_db_" + dbIndex + ".orders_" + tableIndex;
    }
}

3 Cross‑Shard Queries
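Gene sharding keeps the user and order-number lookups off the cross-shard path because the routing gene is embedded in the order ID itself: every order a user creates lands on the shard determined by that user's gene. A minimal, self-contained sanity check of that invariant (a sketch that re-declares the bit layout from section 2 with a fixed timestamp and sequence, not the production generator):

```java
public class GeneConsistencyDemo {
    static final int GENE_BITS = 12;
    static final int SEQUENCE_BITS = 10;

    // Same layout as the generator above: timestamp | gene(12) | sequence(10)
    static long generateId(long userId, long timestamp, long sequence) {
        long gene = userId & ((1L << GENE_BITS) - 1);
        return (timestamp << (GENE_BITS + SEQUENCE_BITS)) | (gene << SEQUENCE_BITS) | sequence;
    }

    // Recover the gene from the middle 12 bits of the order ID
    static int getShardKey(long orderId) {
        return (int) ((orderId >> SEQUENCE_BITS) & ((1 << GENE_BITS) - 1));
    }

    public static void main(String[] args) {
        long userId = 987654321L;
        // Two orders by the same user at different times carry the same gene,
        // so a query by user_id and a query by order_no hit the same shard.
        long order1 = generateId(userId, 1_000L, 1);
        long order2 = generateId(userId, 2_000L, 7);
        System.out.println(getShardKey(order1) == (int) (userId & 0xFFF)); // true
        System.out.println(getShardKey(order1) == getShardKey(order2));    // true
    }
}
```

The merchant lookup is the one pattern the gene cannot serve, since merchant_id is unrelated to the user gene; that is what the next two subsections address.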
3.1 Heterogeneous Index Table Solution
Elasticsearch index mapping:
{
"order_index": {
"mappings": {
"properties": {
"order_no": { "type": "keyword" },
"shard_key": { "type": "integer" },
"create_time": { "type": "date" }
}
}
}
}

3.2 Global Secondary Index (GSI)
-- Create global index in ShardingSphere
CREATE SHARDING GLOBAL INDEX idx_merchant ON orders(merchant_id)
BY SHARDING_ALGORITHM(merchant_hash)
WITH STORAGE_UNIT(ds_0, ds_1);

4 Data Migration
Dual‑write migration plan:
Gray‑switch steps:
Enable dual‑write (rollback to old DB if new DB write fails).
Full migration of historical data using paginated batch processing.
Real‑time incremental data verification with automatic repair.
Gradual traffic shift by user ID from 1 % to 100 %.
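The dual-write step above can be sketched as follows. This is a sketch, not a drop-in implementation: OrderDao is a hypothetical interface standing in for the legacy and sharded data access layers, and the old database remains the source of truth, so a failed write to the new database is recorded for the repair job instead of failing the request:

```java
public class DualWriteOrderService {
    // Hypothetical DAO abstraction over the legacy table and the new sharded tables.
    public interface OrderDao {
        void insert(long orderId, String payload) throws Exception;
    }

    private final OrderDao oldDao;
    private final OrderDao newDao;

    public DualWriteOrderService(OrderDao oldDao, OrderDao newDao) {
        this.oldDao = oldDao;
        this.newDao = newDao;
    }

    public void createOrder(long orderId, String payload) throws Exception {
        // The old database must succeed: it stays authoritative during migration.
        oldDao.insert(orderId, payload);
        try {
            newDao.insert(orderId, payload);
        } catch (Exception e) {
            // Record for the verification/repair job; do not fail the user request.
            System.err.println("new-db write failed for order " + orderId + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) throws Exception {
        DualWriteOrderService svc = new DualWriteOrderService(
                (id, p) -> System.out.println("old db insert " + id),
                (id, p) -> { throw new Exception("new db unavailable"); });
        svc.createOrder(42L, "{\"sku\":1}"); // succeeds despite the new-db failure
    }
}
```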
5 Pitfall Guide
5.1 Hotspot Issue
During Double‑Eleven, a popular shop’s orders were all routed to the same shard.
Solution: introduce a composite shard key (merchant_id + user_id) % 1024.
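A minimal sketch of that composite key: because user_id varies from order to order, even a single hot merchant's writes fan out across all 1 024 shards instead of piling onto one (hypothetical IDs for illustration):

```java
import java.util.HashSet;
import java.util.Set;

public class CompositeShardKey {
    private static final int SHARD_COUNT = 1024;

    // Composite shard key as in the article: (merchant_id + user_id) % 1024.
    // Hashing the pair instead of summing would also work.
    static int shardOf(long merchantId, long userId) {
        long mixed = merchantId + userId;
        return (int) Math.floorMod(mixed, (long) SHARD_COUNT);
    }

    public static void main(String[] args) {
        long hotMerchant = 888L;
        Set<Integer> shards = new HashSet<>();
        for (long user = 0; user < 10_000; user++) {
            shards.add(shardOf(hotMerchant, user));
        }
        // One Double-Eleven hot merchant now spreads over every shard.
        System.out.println("shards used by one hot merchant: " + shards.size()); // 1024
    }
}
```

The trade-off: a lookup by merchant_id alone now fans out across shards, which is why this pairs with the heterogeneous index from section 3.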
5.2 Distributed Transaction
Using RocketMQ for eventual consistency:
// Eventual-consistency sketch based on rocketmq-spring; DAO and service beans are assumed
@Transactional
public void createOrder(Order order) {
    orderDao.insert(order); // write to the order shard
    // Note: for strict consistency, use a RocketMQ transactional message so the
    // send is tied to the database commit rather than fired inside the transaction.
    rocketMQTemplate.asyncSend("order_create_event", order, new SendCallback() {
        @Override public void onSuccess(SendResult result) { /* acked by broker */ }
        @Override public void onException(Throwable t) { /* log and compensate */ }
    });
}

@RocketMQMessageListener(topic = "order_create_event", consumerGroup = "order_consumer_group")
public class OrderEventListener implements RocketMQListener<OrderEvent> {
    @Override
    public void onMessage(OrderEvent event) {
        bonusService.addPoints(event.getUserId());  // asynchronously award points
        inventoryService.deduct(event.getSkuId());  // asynchronously deduct inventory
    }
}

5.3 Pagination Trap
Cross‑shard pagination is a trap: to serve LIMIT offset, size correctly, every shard must return its first offset + size rows for merging and re‑sorting, so deep pages get slower on every shard at once.
Solution: route paginated queries through the Elasticsearch index, or limit them to the most recent three months of orders.
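A sketch of why every shard must over-fetch: any shard's early rows may fall inside the global page, so the merger needs the first offset + size rows from each of them (hypothetical in-memory shards stand in for per-shard ORDER BY ... LIMIT results):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class CrossShardPagination {
    // Global "LIMIT offset, size" over N shards: each shard supplies its own
    // first (offset + size) rows, then the merger sorts and slices the page.
    static List<Long> page(List<List<Long>> shards, int offset, int size) {
        List<Long> candidates = new ArrayList<>();
        for (List<Long> shard : shards) {
            // Each shard's rows are assumed pre-sorted by its own ORDER BY.
            candidates.addAll(shard.subList(0, Math.min(shard.size(), offset + size)));
        }
        candidates.sort(Comparator.naturalOrder());
        int from = Math.min(offset, candidates.size());
        int to = Math.min(offset + size, candidates.size());
        return candidates.subList(from, to);
    }

    public static void main(String[] args) {
        List<List<Long>> shards = List.of(
                List.of(1L, 4L, 7L), List.of(2L, 5L, 8L), List.of(3L, 6L, 9L));
        // Global page "LIMIT 2, 3" over three sorted shards
        System.out.println(page(shards, 2, 3)); // [3, 4, 5]
    }
}
```

With 128 physical tables and LIMIT 100000, 10, that is 128 × 100 010 candidate rows pulled for ten results, which is why deep pagination belongs in Elasticsearch or behind a time window.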
6 Ultimate Architecture Solution
Performance metrics:
| Scenario | Before Sharding | After Sharding |
|---|---|---|
| User order query | 3 200 ms | 68 ms |
| Merchant export | timeout failure | completed in 8 s |
| Full‑table stats | unavailable | 1.2 s (approx.) |

Conclusion
Shard key selection matters more than effort: Gene sharding is the best partner for order systems.
Plan for growth: Design to support at least two years of data increase.
Avoid over‑design: Small‑table joins are far cheaper than distributed joins.
Monitor‑driven optimization: Focus on shards with skew rate > 15 %.
True architectural art lies in balancing division and integration.
IT Services Circle
Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.
