Databases 12 min read

How Dazhong Dianping Scaled Its Order System with Vertical & Horizontal Sharding

This article details Dazhong Dianping's practical approach to sharding its massive order database, covering vertical and horizontal partitioning strategies, specific sharding algorithms, unique ID design, data migration phases, and lessons learned for long‑term scalability.

dbaplus Community
dbaplus Community
dbaplus Community
How Dazhong Dianping Scaled Its Order System with Vertical & Horizontal Sharding

Background

The order table exceeded 200 GB. Even with two read replicas and extensive index tuning, query latency remained high, especially during large flash‑sale events. Rate‑limiting and asynchronous queues were used to protect the service, but the monolithic order model could not accommodate evolving business requirements. Modifying the original DDL was deemed too risky, prompting a need for database partitioning.

Vertical Partitioning

The original order database was split vertically into several logical stores, such as a basic order database and an order‑process database. This reduced the load on any single schema and prepared the system for further horizontal scaling.

Horizontal Partitioning

A unified order model was introduced to serve end‑users, merchants, customer‑service agents, and operations teams. Data is sharded by userId and shopId . Changes are synchronized to an operational analytics database via the PUMA replication framework.

Sharding Strategies

1. Query‑Based Sharding

A dedicated mapping table records the relationship between an identifier and its target database.

Query‑Based Sharding Diagram
Query‑Based Sharding Diagram

Advantages: The mapping algorithm can be changed without altering the data layout.

Disadvantages: Introduces a single point of failure for the mapping service.

2. Range Sharding

Rows are divided by time intervals or contiguous ID ranges.

Range Sharding Diagram
Range Sharding Diagram

Advantages: Table size stays manageable; natural horizontal scaling.

Disadvantages: Does not alleviate write‑hotspot problems when many writes target the same range.

3. Hash Sharding (mod)

Rows are assigned using a modulo operation. The article recommends a consistent‑hash style using mod 2^n to simplify future expansion.

Hash Sharding Diagram
Hash Sharding Diagram

Example: a 32 × 32 scheme. The last four bits of userId ( userId mod 32) select one of 32 databases; the same bits divided by 32 and then mod 32 select one of 32 tables inside the chosen database, yielding 1 024 tables across eight clusters (each cluster contains four databases).

Scalability Scenarios

Scenario 1 – Database performance ceiling

Method 1: Keep the existing sharding rule and add up to 32 database clusters.

Add clusters diagram
Add clusters diagram

Method 2: Adjust the sharding rule to (32*2^n)*(32/2^n), allowing up to 1 024 clusters.

Adjusted sharding rule diagram
Adjusted sharding rule diagram

Scenario 2 – Single‑table size limit

If a table reaches 200 GB, the 32 × 32 layout yields roughly 200 TB (≈10 trillion orders). By keeping the same sharding rule and further splitting tables within each database, up to 8 192 tables can be created (limited by the four‑bit suffix). An alternative dimension uses shopId with an 8 × 8 scheme, storing only the order master table for shop‑centric queries.

Table splitting diagram
Table splitting diagram

Unique ID Scheme

Common approaches considered:

Auto‑increment IDs – simple but create a single‑point bottleneck.

Database cluster with step size (Flickr style) – high availability, concise IDs, but requires a dedicated cluster.

Twitter Snowflake – high performance and extensible, but needs its own cluster and ZooKeeper.

GUID / Random – easy to generate but longer and risk collisions.

The chosen scheme avoids any independent ID service. An order ID is composed of timestamp + user identifier + random number :

Low cost and easy to generate in application code.

Practically zero collision probability.

The user identifier is the last four bits of userId, embedding the sharding rule directly in the ID. This enables direct mapping from an order ID to its database and table without a separate lookup.

IDs are sortable because the timestamp is the leading component.

Drawbacks: the identifier is slightly longer than a plain INT / BIGINT and generation is marginally slower.

Other Considerations

Transaction support: The entire order aggregate is sharded consistently, so transactions that span the aggregate remain feasible.

Complex queries: After vertical partitioning, cross‑table joins are eliminated. After horizontal partitioning, every query must include the sharding key (userId or shopId). Queries that omit the sharding dimension are discouraged, even if middleware could assemble results in memory.

Data Migration Process

The migration was performed in three stages.

Stage 1

Dual‑write to both the legacy and the new schemas; transaction success is determined by the old model. Daily reconciliation jobs (via the data‑warehouse) patch any discrepancies. Historical data is imported through batch jobs.

Migration stage 1 diagram
Migration stage 1 diagram

Stage 2

Historical data import is completed and verified. Dual‑write continues, but transaction success now follows the new model; online queries are routed to the new schema. Daily reconciliation persists.

Migration stage 2 diagram
Migration stage 2 diagram

Stage 3

The legacy model stops receiving writes; only orders that reach a final state are back‑filled asynchronously. At this point, only offline processes depend on the old model. Once downstream data‑warehouse transformations are finished, the legacy model can be retired.

Migration stage 3 diagram
Migration stage 3 diagram

Reflections

Not every table requires horizontal splitting; assess growth rate and complexity before applying.

Separate online transactional workloads from offline analytical workloads to reduce contention.

Choosing the right sharding dimension is critical for performance and developer ergonomics.

Databases are not infinitely robust; use simple, well‑indexed queries to keep the system controllable and scalable over the long term.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

databaseshardinghorizontal scalingVertical Partitioningorder systemUnique ID
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.