Mastering Database Sharding: Horizontal & Vertical Splits, Key Selection & Generation

This article explains why single‑database resources hit limits, introduces horizontal and vertical sharding techniques, discusses how to choose effective split keys, and reviews common key‑generation methods such as auto‑increment, UUID, and Snowflake IDs for scalable database architectures.

JD Cloud Developers

Horizontal and Vertical Sharding

When a single database reaches its capacity for load, connections, or concurrency, it must be split into multiple databases or tables. In an e‑commerce scenario, a monolithic database containing users, products, and orders eventually becomes insufficient, prompting vertical and horizontal partitioning.

Vertical Sharding: Split the database by business domain into separate databases, e.g., a user database, a product database, and an order database. A single table can be split further by column groups: an order table can be divided into basic info, shipping address, and order items, each storing a portion of the order data.

Horizontal Sharding: Replicate the schema, not the data, across multiple databases or tables. Each shard holds complete rows for a disjoint subset of the data (e.g., orders in a specific ID range) and keeps the same structure as the original unsharded table.
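As a rough illustration of the ID-range example, the sketch below routes an order ID to one of several identically structured tables. The class name, the fixed range width, and the orders_N table naming are assumptions made for illustration, not details from the article.

```java
// Hypothetical router: every shard table has the same structure and owns
// one contiguous, fixed-width range of order IDs.
final class OrderRangeRouter {
    private final long idsPerShard;

    OrderRangeRouter(long idsPerShard) {
        if (idsPerShard <= 0) {
            throw new IllegalArgumentException("idsPerShard must be positive");
        }
        this.idsPerShard = idsPerShard;
    }

    // Shard 0 owns [0, idsPerShard), shard 1 owns the next range, and so on.
    long shardIndex(long orderId) {
        if (orderId < 0) {
            throw new IllegalArgumentException("orderId must be non-negative");
        }
        return orderId / idsPerShard;
    }

    // The orders_0, orders_1, ... naming is an assumed convention.
    String tableFor(long orderId) {
        return "orders_" + shardIndex(orderId);
    }
}
```

Range placement keeps neighbouring IDs together, which suits range scans but can concentrate new writes on the newest shard. Hash or modulo placement spreads writes more evenly at the cost of scattering ranges.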

Choosing a Split Key

The split key should align with the primary query dimension. For example, use order_id for the order table and sku_code for the product table. When queries on a non-split key are needed, three techniques can be applied:

1. Equality Method: Make the non-split-key value identical to the split key, such as using the same value for the order number and the waybill number, so a lookup by either value can be routed directly to the right table.

2. Index Method: Maintain an index that maps a non-split key (e.g., user code or waybill number) to the corresponding split key (order number). A query first resolves the split key through the index, then reads the shard by that split key.

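3. Gene Method: Use a deterministic rule that derives the shard from bits shared by both keys. For example, if the shard is chosen from the low 4 bits of a 64-bit Long ID, give the split key and the non-split key the same low 4 bits (the "gene") so that both map to the same shard.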
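The gene method can be sketched as follows, assuming 16 shards selected by the low 4 bits of a 64-bit ID. The class, the method names, and the idea of building the order ID from a separate unique part are hypothetical choices for illustration. The article only states the low-4-bit rule.

```java
// Hypothetical helper: copies the low 4 bits (the "gene") of a user ID into
// an order ID so that both IDs resolve to the same shard.
final class GeneIds {
    static final int GENE_BITS = 4;
    static final long GENE_MASK = (1L << GENE_BITS) - 1; // 0b1111, i.e. 16 shards

    // The gene is the low 4 bits of the ID.
    static long gene(long id) {
        return id & GENE_MASK;
    }

    // uniquePart is assumed to be a non-negative value that fits in 59 bits
    // and is unique on its own (for example, from an ID generator).
    static long orderIdFor(long userId, long uniquePart) {
        return (uniquePart << GENE_BITS) | gene(userId);
    }

    // Both user IDs and order IDs are routed by the same rule.
    static int shardOf(long id) {
        return (int) gene(id);
    }
}
```

With this layout, a query by user ID and a query by order ID both resolve to the same shard without an index lookup. The trade-off is that the shard count is fixed by the number of gene bits.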

Generating Split Keys

Common generation strategies include:

1. Auto-Increment Primary Key: Suitable for low-concurrency scenarios; MySQL's auto-increment column can serve as the split key.

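2. UUID: Can be generated via Java's UUID class. A time-based (version 1) UUID combines a timestamp, a clock sequence, and a node identifier (typically the machine's MAC address), whereas Java's UUID.randomUUID() returns a random (version 4) UUID. UUIDs are unique for practical purposes but not human-readable.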

3. Snowflake Algorithm: Produces a 64-bit integer composed of an unused sign bit, 41 bits for a millisecond timestamp, 10 bits for the machine ID, and 12 bits for a per-millisecond sequence, allowing up to 4096 IDs per millisecond per machine across at most 1024 machine IDs. Care must be taken with machine-ID limits, ID recycling, and clock rollback handling, since a clock that moves backwards can produce duplicate IDs.
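The bit layout described for Snowflake can be sketched as below. This is a minimal, single-process illustration rather than a production generator. The custom epoch (2024-01-01 UTC), the class name, and the choice to throw on clock rollback are assumptions. Real deployments also need a way to assign machine IDs, which is out of scope here.

```java
// Sketch of a Snowflake-style generator: 41-bit timestamp | 10-bit machine ID | 12-bit sequence.
final class SnowflakeSketch {
    static final long EPOCH_MS = 1704067200000L; // assumed custom epoch: 2024-01-01T00:00:00Z
    static final int MACHINE_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long MAX_MACHINE_ID = (1L << MACHINE_BITS) - 1; // 1023
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;  // 4095

    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long machineId) {
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException("machineId must be in [0, 1023]");
        }
        this.machineId = machineId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Clock rollback: this sketch refuses to issue IDs rather than risk duplicates.
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 sequence values used in this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return compose(now, machineId, sequence);
    }

    // Packs the three fields into one long; the sign bit stays 0 for about 69 years after the epoch.
    static long compose(long timestampMs, long machineId, long sequence) {
        return ((timestampMs - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    static long machineIdOf(long id) {
        return (id >> SEQUENCE_BITS) & MAX_MACHINE_ID;
    }

    static long sequenceOf(long id) {
        return id & MAX_SEQUENCE;
    }
}
```

Because the timestamp occupies the high bits, IDs from one generator increase over time, which tends to suit index inserts better than random UUIDs.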

Summary

When a single database cannot meet business demands, sharding (vertical, horizontal, or a combination of both) provides a scalable solution. Proper split-key selection and generation (auto-increment, UUID, Snowflake) are essential to evenly distribute load and maintain data consistency across shards.
