Databases 14 min read

How to Scale Order Systems with Horizontal Database Sharding: A Real‑World Case Study

This article presents a comprehensive, practice‑driven analysis of horizontal database sharding for a high‑traffic e‑commerce order system, covering sharding dimensions, strategies, quantity planning, transparent routing, pagination challenges, lookup mapping, overall architecture, deployment steps, and the measurable performance and cost benefits achieved.

ITFLY8 Architecture Home
ITFLY8 Architecture Home
ITFLY8 Architecture Home
How to Scale Order Systems with Horizontal Database Sharding: A Real‑World Case Study

Horizontal Sharding Overview

With the rapid growth of large‑scale internet applications, massive data storage and access become bottlenecks, making distributed processing essential. Horizontal sharding (splitting a large table into multiple tables stored in different databases) is a high‑difficulty but effective solution.

Sharding Dimensions

Choosing the sharding key should minimize impact on application code and SQL performance. By collecting all SQL statements and counting the occurrence of filter fields (userId, orderId, merchantId), the analysis shows that userId appears most frequently as a single‑value filter, making it the optimal sharding dimension.

Statistical results (500 SQL statements): userId – 120 single, 40 multi; orderId – 60 single, 80 multi; merchantId – 15 single, 0 multi. Further weighting by execution frequency confirms that sharding by userId routes 85% of the top‑15 SQLs to a single shard, greatly reducing cross‑shard scans.

Sharding Strategy

Two common methods are range‑based and modulo‑based (mod) sharding. Range sharding allows gradual growth of shard count, while mod sharding provides uniform data distribution and avoids hotspots. In practice, mod sharding is preferred for its simplicity, and shard count is typically doubled during re‑sharding to minimize data movement.

Sharding Quantity

Shard count depends on single‑shard capacity (≈50 million rows for MySQL, ≈100 million rows for Oracle). Too few shards fail to relieve pressure; too many increase cross‑shard query cost and hardware investment. An initial recommendation is 4–8 shards, each on a dedicated physical machine.

Transparent Routing

Sharding changes the DB schema, so routing should be handled in the data‑access layer (DAL) to keep application code transparent. Single‑shard queries are routed automatically; multi‑shard queries are aggregated by the DAL, and aggregation‑heavy queries can be post‑processed at the application level.

Pagination Handling

Cross‑shard pagination requires fetching extra rows from each shard and merging results, which becomes increasingly expensive for later pages. Solutions include limiting visible pages, increasing page size for batch jobs, or delegating aggregation to a big‑data platform.

Lookup Mapping

A lookup table maps orderId to userId, allowing direct single‑shard access when only orderId is known. The table, equal in size to the order table but with only two columns, is cached in memory for fast retrieval.

Overall Architecture

The architecture consists of an order service/proxy, a distributed DAL, MySQL shards, a lookup table with cache, and optional read‑write splitting. The diagram below illustrates the flow.

Overall technical architecture of horizontal sharding
Overall technical architecture of horizontal sharding

Deployment Steps

The migration proceeds in two phases: first run Oracle and MySQL in parallel, synchronizing data incrementally; then gradually shift non‑real‑time workloads to MySQL, followed by a full cut‑over of real‑time reads/writes after performance verification.

Project Summary

After sharding, the order system migrated from Oracle to MySQL, reduced hardware costs, and achieved performance parity (average SQL latency unchanged) across six MySQL shards. The design also eliminated the need for the lookup mechanism as new order IDs embed the sharding suffix, and the overall approach proved reliable and repeatable.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Data MigrationmysqlOracledatabase scalingorder systemhorizontal sharding
ITFLY8 Architecture Home
Written by

ITFLY8 Architecture Home

ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.