Databases 8 min read

How to Scale Billion-Order Databases with Sharding, Partitioning, and Cold Data Strategies

This article explains how to handle billions of orders by classifying hot and cold data, storing them in MySQL, Elasticsearch, and Hive, and applying MySQL sharding, table splitting, and combined database‑table strategies to improve scalability and performance.

Java Backend Technology

Sep 21, 2018

How to Scale Billion-Order Databases with Sharding, Partitioning, and Cold Data Strategies

Background

As order volume grows to billions over three months, a single‑database single‑table design can no longer meet performance and cost requirements.

Order Data Segmentation

Orders are divided into hot data (last 3 months) and two tiers of cold data: A (3‑12 months) and B (older than 1 year). Hot data requires fast, real‑time queries; cold data can tolerate slower access.

Hot data: stored in MySQL with sharding and table partitioning.

Cold data A: stored in Elasticsearch for relatively fast search.

Cold data B: archived in Hive for occasional offline analysis.

MySQL Sharding and Partitioning

Business‑Level Splitting

Initially many services share one database, but as the system grows, separating business domains (user, product, order, etc.) into different databases reduces contention and improves throughput.

Separate databases for each business domain distribute load across multiple machines.

Table Splitting Strategy

Use order_id as the shard key; create many tables (e.g., 100) and route rows by order_id % 100. If queries need to be based on user_id, the shard key must be changed accordingly.

CREATE TABLE `order` (
  order_id BIGINT(11) NOT NULL,
  ...
);

Example of modulo routing for table selection:

Database Splitting Strategy

Similar modulo routing can be applied at the database level: order_id % number_of_databases. Non‑numeric IDs can be hashed before modulo.

Combined Sharding Strategy

When both database and table sharding are used, a composite formula distributes rows across databases and tables.

Example with 10 databases each containing 100 tables: order_id = 1001 maps to database 1, table 2.

Overall Architecture

Read/write requests are split: writes follow the sharding rules; reads first determine whether the order belongs to hot or cold data based on the timestamp embedded in order_id. Cold data older than a year is queried from Hive.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

mysql cold data

Written by

Java Backend Technology

Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.