Databases 12 min read

How Dazhong Dianping Scaled Its Order System with Vertical & Horizontal Sharding

This article details Dazhong Dianping's practical approach to sharding its massive order database, covering vertical and horizontal partitioning strategies, specific sharding algorithms, unique ID design, data migration phases, and lessons learned for long‑term scalability.

dbaplus Community

Feb 24, 2016

How Dazhong Dianping Scaled Its Order System with Vertical & Horizontal Sharding

Background

The order table exceeded 200 GB. Even with two read replicas and extensive index tuning, query latency remained high, especially during large flash‑sale events. Rate‑limiting and asynchronous queues were used to protect the service, but the monolithic order model could not accommodate evolving business requirements. Modifying the original DDL was deemed too risky, prompting a need for database partitioning.

Vertical Partitioning

The original order database was split vertically into several logical stores, such as a basic order database and an order‑process database. This reduced the load on any single schema and prepared the system for further horizontal scaling.

Horizontal Partitioning

A unified order model was introduced to serve end‑users, merchants, customer‑service agents, and operations teams. Data is sharded by userId and shopId . Changes are synchronized to an operational analytics database via the PUMA replication framework.

Sharding Strategies

1. Query‑Based Sharding

A dedicated mapping table records the relationship between an identifier and its target database.

Advantages: The mapping algorithm can be changed without altering the data layout.

Disadvantages: Introduces a single point of failure for the mapping service.

2. Range Sharding

Rows are divided by time intervals or contiguous ID ranges.

Advantages: Table size stays manageable; natural horizontal scaling.

Disadvantages: Does not alleviate write‑hotspot problems when many writes target the same range.

3. Hash Sharding (mod)

Rows are assigned using a modulo operation. The article recommends a consistent‑hash style using mod 2^n to simplify future expansion.

Example: a 32 × 32 scheme. The last four bits of userId ( userId mod 32) select one of 32 databases; the same bits divided by 32 and then mod 32 select one of 32 tables inside the chosen database, yielding 1 024 tables across eight clusters (each cluster contains four databases).

Scalability Scenarios

Scenario 1 – Database performance ceiling

Method 1: Keep the existing sharding rule and add up to 32 database clusters.

Method 2: Adjust the sharding rule to (32*2^n)*(32/2^n), allowing up to 1 024 clusters.

Scenario 2 – Single‑table size limit

If a table reaches 200 GB, the 32 × 32 layout yields roughly 200 TB (≈10 trillion orders). By keeping the same sharding rule and further splitting tables within each database, up to 8 192 tables can be created (limited by the four‑bit suffix). An alternative dimension uses shopId with an 8 × 8 scheme, storing only the order master table for shop‑centric queries.

Unique ID Scheme

Common approaches considered:

Auto‑increment IDs – simple but create a single‑point bottleneck.

Database cluster with step size (Flickr style) – high availability, concise IDs, but requires a dedicated cluster.

Twitter Snowflake – high performance and extensible, but needs its own cluster and ZooKeeper.

GUID / Random – easy to generate but longer and risk collisions.

The chosen scheme avoids any independent ID service. An order ID is composed of timestamp + user identifier + random number :

Low cost and easy to generate in application code.

Practically zero collision probability.

The user identifier is the last four bits of userId, embedding the sharding rule directly in the ID. This enables direct mapping from an order ID to its database and table without a separate lookup.

IDs are sortable because the timestamp is the leading component.

Drawbacks: the identifier is slightly longer than a plain INT / BIGINT and generation is marginally slower.

Other Considerations

Transaction support: The entire order aggregate is sharded consistently, so transactions that span the aggregate remain feasible.

Complex queries: After vertical partitioning, cross‑table joins are eliminated. After horizontal partitioning, every query must include the sharding key (userId or shopId). Queries that omit the sharding dimension are discouraged, even if middleware could assemble results in memory.

Data Migration Process

The migration was performed in three stages.

Stage 1

Dual‑write to both the legacy and the new schemas; transaction success is determined by the old model. Daily reconciliation jobs (via the data‑warehouse) patch any discrepancies. Historical data is imported through batch jobs.

Stage 2

Historical data import is completed and verified. Dual‑write continues, but transaction success now follows the new model; online queries are routed to the new schema. Daily reconciliation persists.

Stage 3

The legacy model stops receiving writes; only orders that reach a final state are back‑filled asynchronously. At this point, only offline processes depend on the old model. Once downstream data‑warehouse transformations are finished, the legacy model can be retired.

Reflections

Not every table requires horizontal splitting; assess growth rate and complexity before applying.

Separate online transactional workloads from offline analytical workloads to reduce contention.

Choosing the right sharding dimension is critical for performance and developer ergonomics.

Databases are not infinitely robust; use simple, well‑indexed queries to keep the system controllable and scalable over the long term.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Database Sharding horizontal scaling Vertical Partitioning order system Unique ID

Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.

Background

Vertical Partitioning

Horizontal Partitioning

Sharding Strategies

1. Query‑Based Sharding

2. Range Sharding

3. Hash Sharding (mod)

Scalability Scenarios

Scenario 1 – Database performance ceiling

Scenario 2 – Single‑table size limit

Unique ID Scheme

Other Considerations

Data Migration Process

Stage 1

Stage 2

Stage 3

Reflections

dbaplus Community

How this landed with the community

Was this worth your time?

0 Comments

Scenario 1 – Database performance ceiling

Scenario 2 – Single‑table size limit

Stage 1

Stage 2

Stage 3