
How to Scale an Order Database Beyond 200GB: Vertical & Horizontal Sharding Strategies

This article explains how a rapidly growing order database exceeding 200 GB can be scaled using vertical and horizontal partitioning, various sharding strategies, custom ID generation, migration phases, and practical considerations for transactions and complex queries.

21CTO

Apr 27, 2016


Background

The order table has grown past 200 GB. Even with two read replicas and several rounds of index optimization, many queries still perform poorly. Heavy flash-sale events have pushed the database to its limits, requiring rate limiting and asynchronous queues. The original order model can also no longer satisfy evolving business requirements, so sharding the database has become urgent.

Vertical Splitting

The original order database is first split vertically into a basic order database, an order‑process database, and others (details omitted).

Horizontal Splitting

Vertical splitting eases the pressure on a single cluster, but the database still struggles during flash sales. A new unified order model is therefore designed, partitioned horizontally by user ID and by merchant ID, and synchronized to an operations database via PUMA.

Splitting Strategies

1. Query Splitting

The mapping from ID to database is stored in a separate database, which is consulted to locate the shard for each query.

Advantages: The ID‑to‑database mapping algorithm can be changed freely. Disadvantages: Introduces an additional single point of failure.
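
As a minimal sketch of this lookup approach, the snippet below uses an in-memory dictionary as a stand-in for the separate mapping database. The class name `ShardDirectory`, its methods, and the database names are hypothetical and only illustrate the trade-off: the placement rule can change freely, but every query depends on the directory being available.

```python
from typing import Dict


class ShardDirectory:
    """In-memory stand-in for the separate mapping database (hypothetical)."""

    def __init__(self) -> None:
        self._mapping: Dict[int, str] = {}

    def assign(self, user_id: int, database: str) -> None:
        # Any placement rule can be applied here; callers never see it.
        self._mapping[user_id] = database

    def locate(self, user_id: int) -> str:
        # Every query pays this extra lookup and fails if the directory is down.
        try:
            return self._mapping[user_id]
        except KeyError:
            raise LookupError(f"no shard assigned for user {user_id}") from None


directory = ShardDirectory()
directory.assign(1001, "order_db_03")
directory.assign(1002, "order_db_17")
print(directory.locate(1001))  # order_db_03
```

Moving a user to another database only requires updating the directory entry, which is the flexibility the advantage above refers to.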

2. Range Splitting

Splitting by time range or ID range keeps each table size manageable and naturally supports horizontal scaling, but cannot solve concentrated write bottlenecks.
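
A small sketch of range splitting by ID, assuming hypothetical range boundaries and table names (`RANGE_UPPER_BOUNDS`, `RANGE_TABLES`). It uses a binary search over the upper bounds to find the table for a given order ID.

```python
import bisect

# Hypothetical exclusive upper bounds of each ID range, and the table for each.
RANGE_UPPER_BOUNDS = [10_000_000, 20_000_000, 30_000_000]
RANGE_TABLES = ["orders_0", "orders_1", "orders_2"]


def table_for_order_id(order_id: int) -> str:
    """Return the table whose ID range contains order_id."""
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, order_id)
    if order_id < 0 or index >= len(RANGE_TABLES):
        raise ValueError(f"order_id {order_id} is outside all configured ranges")
    return RANGE_TABLES[index]


print(table_for_order_id(9_999_999))   # orders_0
print(table_for_order_id(10_000_000))  # orders_1
```

Scaling out means appending a new bound and table. With monotonically increasing IDs, however, all new orders land in the last range, which is why this strategy does not relieve concentrated writes.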

3. Hash Splitting

Typically uses a modulo operation; this article focuses on the mod strategy.

For the unified order database, a 32 × 32 scheme is used: the last four digits of userId mod 32 select one of 32 databases, and the same four digits divided by 32, then mod 32, select one of 32 tables within that database. This yields 1,024 tables in total, deployed across 8 primary-secondary clusters.
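
The routing arithmetic can be sketched as follows. This assumes the "last four" of userId means its last four decimal digits (0 to 9999), since the divide-by-32 step only makes sense for values above 31. The function name `route` and the constants are illustrative, not taken from the original system.

```python
from typing import Tuple

DB_COUNT = 32
TABLES_PER_DB = 32


def route(user_id: int) -> Tuple[int, int]:
    """Map a user ID to (database index, table index) in a 32 x 32 layout."""
    suffix = user_id % 10_000                            # last four decimal digits
    db_index = suffix % DB_COUNT                         # which of the 32 databases
    table_index = (suffix // DB_COUNT) % TABLES_PER_DB   # which table in that database
    return db_index, table_index


print(route(12345678))  # (14, 17): suffix 5678, 5678 % 32 = 14, 5678 // 32 = 177, 177 % 32 = 17
```

Because only the four-digit suffix is used, any identifier that carries the same suffix routes to the same database and table.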

ID Generation

Common schemes include:

Database auto‑increment ID – simple but a single‑point risk and performance bottleneck.

Clustered step (Flickr scheme) – high availability, concise IDs, but requires a dedicated DB cluster.

Twitter Snowflake – high performance, highly available, scalable, but needs an independent cluster and ZooKeeper.

GUID/Random algorithms – simple but produce long IDs with collision risk.

Our solution avoids any scheme that requires a separate cluster. We use a business-aware ID composed of timestamp + user identifier + random number. It is cheap to generate, has a near-zero chance of duplication, carries the sharding key (the user identifier is the last four digits of userId), sorts by time, and performs acceptably.
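
A minimal sketch of such an ID, under stated assumptions: the article does not give field widths, so the layout here (Unix timestamp in seconds, then four user digits, then four random digits) is hypothetical, chosen so the result fits in a signed 64-bit integer. The function names are also illustrative.

```python
import secrets
import time
from typing import Optional


def generate_order_id(user_id: int, now: Optional[int] = None) -> int:
    """Build an ID as timestamp + user identifier + random number (assumed widths)."""
    timestamp = int(time.time()) if now is None else now  # seconds since epoch
    user_suffix = user_id % 10_000                         # last four decimal digits
    random_part = secrets.randbelow(10_000)                # 0..9999
    return (timestamp * 10_000 + user_suffix) * 10_000 + random_part


def user_suffix_of(order_id: int) -> int:
    """Recover the embedded user digits, so an order ID alone can locate its shard."""
    return (order_id // 10_000) % 10_000


order_id = generate_order_id(12345678, now=1461715200)
print(user_suffix_of(order_id))  # 5678
```

With this layout, two orders from the same user in the same second collide only if they draw the same random part, so duplication is unlikely but not impossible; a unique constraint with a retry on conflict is still a sensible safeguard.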

Transaction & Complex Query Support

Because the entire order domain is split along consistent dimensions, all data for an order aggregate lands on the same shard, so transactions across the aggregate are still supported. After vertical splitting, joins across the separated databases are no longer available; after horizontal splitting, queries must include the sharding key (e.g., userId) and cannot span dimensions without middleware assistance.
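
The sharding-key requirement can be enforced at the data-access layer. The sketch below is one hypothetical way to do it: it resolves a shard from `user_id`, or from `order_id` under the assumption that order IDs end with four user digits followed by four random digits, and rejects any query that carries neither. The filter names and the ID layout are assumptions, not details from the original system.

```python
from typing import Mapping, Tuple


def shard_for_query(filters: Mapping[str, int]) -> Tuple[int, int]:
    """Resolve (database index, table index) or reject an unroutable query."""
    if "user_id" in filters:
        suffix = filters["user_id"] % 10_000
    elif "order_id" in filters:
        # Assumed layout: ...UUUURRRR (four user digits, then four random digits).
        suffix = (filters["order_id"] // 10_000) % 10_000
    else:
        raise ValueError("query must include the sharding key (user_id or order_id)")
    return suffix % 32, (suffix // 32) % 32


print(shard_for_query({"user_id": 12345678}))  # (14, 17)
```

A query filtered only by, say, order status would raise here; serving it would need middleware that fans out to every shard, or a separate store partitioned along that dimension.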

Data Migration

Phase 1

Writes go to both models (double-write), with the old model deciding whether the transaction succeeds. Queries still read the old model. A daily reconciliation job runs via DW, and historical data is imported into the new model.

Phase 2

Historical data has been fully imported and verified. Writes are still double-written, but the new model now decides the transaction outcome. Online queries read the new model, and daily jobs reconcile differences.
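
The daily reconciliation used in Phases 1 and 2 can be pictured as a keyed diff between the two models. This is a simplified sketch: it assumes both sides have already been projected onto comparable dictionaries keyed by order ID, whereas a real job would first map fields between the old and new schemas. All names are hypothetical.

```python
from typing import Dict, List

Order = Dict[str, object]


def reconcile(old_rows: Dict[int, Order], new_rows: Dict[int, Order]) -> Dict[str, List[int]]:
    """Report order IDs that are missing on one side or differ between sides."""
    old_ids, new_ids = set(old_rows), set(new_rows)
    return {
        "missing_in_new": sorted(old_ids - new_ids),
        "missing_in_old": sorted(new_ids - old_ids),
        "mismatched": sorted(
            oid for oid in old_ids & new_ids if old_rows[oid] != new_rows[oid]
        ),
    }


old = {1: {"status": "paid"}, 2: {"status": "created"}, 3: {"status": "paid"}}
new = {1: {"status": "paid"}, 2: {"status": "paid"}, 4: {"status": "created"}}
print(reconcile(old, new))
# {'missing_in_new': [3], 'missing_in_old': [4], 'mismatched': [2]}
```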

Phase 3

The old model is no longer written synchronously; only orders that have reached a terminal state are back-filled into it asynchronously. Offline processes still rely on the old model until downstream dependencies are fully migrated.
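
The three phases can be summarized as a write path whose behavior is switched by phase. The sketch below is an interpretation, not the original implementation: it assumes that the model that "decides" the outcome is written first and may fail the request, while a failure on the other model is only recorded for later reconciliation. `MigrationWriter`, `Phase`, and `TERMINAL_STATUSES` are hypothetical names.

```python
from enum import Enum
from typing import Callable, Dict, List

Order = Dict[str, object]
Writer = Callable[[Order], None]

TERMINAL_STATUSES = {"completed", "cancelled"}  # assumed terminal states


class Phase(Enum):
    OLD_DECIDES = 1  # Phase 1: double-write, old model decides success
    NEW_DECIDES = 2  # Phase 2: double-write, new model decides success
    NEW_ONLY = 3     # Phase 3: new model only, terminal orders back-filled


class MigrationWriter:
    def __init__(self, phase: Phase, write_old: Writer, write_new: Writer) -> None:
        self.phase = phase
        self.write_old = write_old
        self.write_new = write_new
        self.secondary_failures: List[Order] = []  # left for reconciliation
        self.backfill_queue: List[Order] = []      # drained asynchronously elsewhere

    def _write_secondary(self, writer: Writer, order: Order) -> None:
        # A failure on the non-deciding model must not fail the request.
        try:
            writer(order)
        except Exception:
            self.secondary_failures.append(order)

    def save(self, order: Order) -> None:
        if self.phase is Phase.OLD_DECIDES:
            self.write_old(order)  # exceptions propagate to the caller
            self._write_secondary(self.write_new, order)
        elif self.phase is Phase.NEW_DECIDES:
            self.write_new(order)  # exceptions propagate to the caller
            self._write_secondary(self.write_old, order)
        else:
            self.write_new(order)
            if order.get("status") in TERMINAL_STATUSES:
                self.backfill_queue.append(order)


old_store: List[Order] = []
new_store: List[Order] = []
writer = MigrationWriter(Phase.NEW_ONLY, old_store.append, new_store.append)
writer.save({"id": 1, "status": "created"})
writer.save({"id": 2, "status": "completed"})
print(len(new_store), len(old_store), len(writer.backfill_queue))  # 2 0 1
```

Keeping the phase as a single switch makes it possible to roll back from Phase 2 to Phase 1 while both models are still being written.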

Key Takeaways

Not every table needs horizontal splitting; use it only when growth type and speed demand it.

Separate online and offline queries, and isolate transaction workloads from analytics.

Choose split dimensions that solve existing problems and simplify development.

Keep queries simple and well‑indexed to ensure long‑term scalability and capacity planning.


