Vertical and Horizontal Database Sharding Strategies for Large-Scale Order Systems
Facing a 200 GB order table that strained performance during flash‑sale spikes, Meituan first vertically split the schema, then horizontally sharded orders by user and merchant IDs using a 32 × 32 hash scheme with embedded timestamp‑based IDs, enabling scalable clusters, flexible migration stages, and simplified queries while preserving transaction integrity.
Background
The original Meituan order table exceeded 200 GB and, despite adding two read replicas and optimizing indexes, many queries remained unsatisfactory. Massive flash‑sale events pushed the database to its limits, forcing the application to rely on rate limiting and asynchronous queues. Business requirements kept evolving, making the existing order model insufficient, while modifying the original DDL was extremely costly. Consequently, splitting the order database became urgent, with the goal of eliminating capacity concerns for the next ten years.
Vertical Partitioning
The order database was first vertically partitioned into a basic order database, an order‑process database, etc. (details omitted).
Horizontal Partitioning
Vertical partitioning relieved pressure on the single cluster but still struggled during flash sales. A new unified order model was designed to serve C‑end users, B‑end merchants, customer service, and operations. Data is sharded by user ID and merchant ID, and synchronized to an operational database via PUMA (an internal MySQL binlog parsing service).
Sharding Strategies
1. Query‑Based Sharding
The mapping between IDs and databases is stored in a separate database.
Advantages: the ID‑to‑database mapping algorithm can be changed arbitrarily. Disadvantages: introduces an additional single point of failure.
2. Range Sharding
Sharding by time intervals or ID ranges.
Advantages: table size is controllable and naturally supports horizontal scaling. Disadvantages: cannot solve concentrated write bottlenecks.
3. Hash Sharding
Typically uses Mod. The article recommends consistent hashing with mod 2^n.
For the unified order database, a 32 × 32 scheme is used: the lower four bits of UserId mod 32 determine the database, and the same bits divided by 32 mod 32 determine the table, resulting in 1 024 tables. The production deployment consists of 8 clusters (primary‑secondary), each with 4 databases.
Scalability Scenarios
Scenario 1: Database Performance Bottleneck
Method 1: Expand to 32 database clusters directly.
Method 2: If 32 clusters are still insufficient, adjust the sharding rule to (32 × 2^n) × (32 / 2^n), supporting up to 1 024 clusters.
Scenario 2: Single Table Capacity Bottleneck
Method: When a single table exceeds 200 GB (≈200 TB for 1 024 tables), further split tables within each database. Using the lower four bits of UserId limits the maximum to 8 192 tables. An alternative dimension uses ShopID with an 8 × 8 scheme (details omitted).
Unique ID Scheme
Several common schemes are listed:
Auto‑increment ID (simple but single‑point risk)
Clustered DB with step size (Flickr style, high availability)
Twitter Snowflake (high performance, requires ZK)
GUID/Random algorithms (simple but longer and possible collisions)
Our chosen scheme avoids independent clusters and combines business attributes:
Timestamp + User Identifier + Random Number
Benefits:
Low cost and easy to implement
Practically no duplication
Embedded sharding rule (user identifier = last four bits of UserId), allowing order number alone to locate the correct shard
Sortable because the timestamp is first
Drawbacks: slightly longer length and marginally slower than int/bigint.
Other Considerations
Transaction Support: The entire order domain aggregate is sharded consistently, so transactions across the aggregate are supported.
Complex Queries: After vertical sharding, joins are eliminated; after horizontal sharding, queries must include the sharding key (e.g., user ID). Queries that lack a sharding dimension should be avoided in the online DB.
Data Migration
Database splitting is an optimization performed after the business reaches a certain scale. The migration proceeds in three stages:
Stage 1
Dual‑write to old and new databases (transaction success follows old model); reads use the old model.
Daily job reconciles data via DW and patches differences.
Historical data is imported via jobs.
Stage 2
Historical data import completed and verified.
Dual‑write continues, but transaction success now follows the new model; online reads switch to the new model.
Daily reconciliation continues.
Stage 3
The old model stops receiving writes; it only receives asynchronous updates when an order reaches a final state.
Only offline processes still depend on the old model; once DW is refactored, the old model can be retired.
Conclusion
Not every table requires horizontal sharding; it should be applied only when growth type and speed justify the added development complexity. For high‑concurrency workloads, isolate online and offline queries, as well as transaction versus operational queries. Choose sharding dimensions that solve existing problems while remaining developer‑friendly. Databases are not as robust as imagined; protect them with simple, well‑indexed queries to ensure long‑term capacity planning and horizontal scalability.
Special thanks to the DBA team and middleware team for their strong support.
本文头图出自RethinkDB。Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
