
Choosing and Generating Sharding Keys for Database Partitioning

This article explains how to select appropriate sharding keys based on business requirements and data access patterns, describes vertical and horizontal partitioning methods, and presents common techniques such as equal‑value, index, and gene methods along with key generation strategies like auto‑increment, UUID, Snowflake algorithm, and open‑source middleware.

JD Tech

Feb 19, 2024


When a single relational database reaches its limits in load, connections, or concurrency, it is necessary to split the database and its tables to improve performance and scalability. This article discusses vertical and horizontal splitting, the importance of choosing the right sharding key, and various methods for generating sharding keys.

Vertical and Horizontal Splitting

Vertical splitting separates a large database into multiple logical databases, each containing a subset of tables (e.g., user, product, order databases). Within a table, vertical splitting can further divide the table into multiple related tables that store different aspects of the same entity.

Horizontal splitting divides a single table into multiple tables that share the same schema but store different rows (e.g., orders are distributed across several order tables). Which table a row lands in is usually decided by a rule applied to a sharding key, such as a modulo or a range.
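As a minimal sketch of horizontal routing, the snippet below maps an order ID to one of several order tables by taking it modulo the shard count. The shard count of 4, the `orders_N` naming scheme, and the function names are assumptions made for illustration, not something the article prescribes.

```python
NUM_SHARDS = 4  # assumed shard count, chosen only for illustration


def shard_index(order_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Return the index of the shard that holds this sharding key."""
    return order_id % num_shards


def table_name(order_id: int, num_shards: int = NUM_SHARDS) -> str:
    """Map an order ID to a physical table name such as 'orders_1'."""
    return f"orders_{shard_index(order_id, num_shards)}"


if __name__ == "__main__":
    for oid in (1001, 1002, 1003, 1004):
        print(oid, "->", table_name(oid))
```

Every read or write that carries the order ID can be sent straight to one table; a query without it would have to visit all of them, which is why the choice of key matters.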

Choosing the Sharding Key

The sharding key determines how data is distributed across databases/tables. Generally, the dimension most frequently used in queries should be selected as the sharding key (e.g., order number for the order table, product code for the product table). When queries involve non‑sharding fields, three approaches can be used:

Equal‑value method: Align the non‑sharding field value with the sharding key (e.g., make the waybill number identical to the order number).

Index method: Create an index that maps the non‑sharding field to the sharding key, then query via the sharding key.

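Gene method: Embed the same shard‑determining bits (the "gene", e.g., the low bits of a 64‑bit ID) in both the sharding key and the non‑sharding identifier when they are generated. Either identifier then routes to the same shard, so a lookup by the non‑sharding field needs no extra index.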
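The index method can be sketched as follows, assuming waybills are sharded by waybill code and must also be found by order ID. The in-memory dictionaries stand in for the physical shards and the index table; all names here are hypothetical.

```python
from typing import Optional

NUM_SHARDS = 4  # assumed shard count

# Hypothetical stand-ins: one dict per waybill shard (waybill_code -> row)
# and one dict acting as the index table (order_id -> waybill_code).
waybill_shards = [{} for _ in range(NUM_SHARDS)]
order_to_waybill = {}


def shard_of(waybill_code: int) -> int:
    """Route by the sharding key (the waybill code)."""
    return waybill_code % NUM_SHARDS


def insert_waybill(waybill_code: int, order_id: int, weight: float) -> None:
    """Store the row in its shard and record the index entry."""
    row = {"waybill_code": waybill_code, "order_id": order_id, "weight": weight}
    waybill_shards[shard_of(waybill_code)][waybill_code] = row
    order_to_waybill[order_id] = waybill_code


def find_by_order_id(order_id: int) -> Optional[dict]:
    """Resolve the non-sharding field via the index, then query one shard."""
    waybill_code = order_to_waybill.get(order_id)
    if waybill_code is None:
        return None
    return waybill_shards[shard_of(waybill_code)].get(waybill_code)
```

The cost of this approach is a second lookup per query and an index that has to be kept consistent with the sharded data.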

Example Tables

Order table (sharded by Order_id):

Sharding Key | Product Code | Address
Order_id | Sku_code | address

Waybill table (sharded by Waybill_code):

Sharding Key | Order Number | Weight
Waybill_code | Order_id | weight
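For the order and waybill example, the gene method might look like the sketch below: the low bits of the order ID are copied into the low bits of each new waybill code, so both identifiers map to the same shard. The 2-bit gene (4 shards), the `sequence` argument, and the function names are assumptions for illustration.

```python
GENE_BITS = 2                  # assumed: 2 bits of "gene" -> 4 shards
NUM_SHARDS = 1 << GENE_BITS
GENE_MASK = NUM_SHARDS - 1


def gene_of(identifier: int) -> int:
    """Extract the shard-determining low bits of an identifier."""
    return identifier & GENE_MASK


def make_waybill_code(sequence: int, order_id: int) -> int:
    """Build a waybill code whose low bits copy the order's gene.

    `sequence` is a hypothetical unique number from any ID generator.
    """
    return (sequence << GENE_BITS) | gene_of(order_id)


def shard_of(identifier: int) -> int:
    """Route any gene-carrying identifier to its shard."""
    return gene_of(identifier)


if __name__ == "__main__":
    code = make_waybill_code(sequence=500, order_id=1001)
    print(code, shard_of(code), shard_of(1001))
```

A query on the waybill table that only knows the order ID can compute `shard_of(order_id)` and search that single shard. The gene width is tied to the shard count, so it is worth reserving enough bits for future growth.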

Generating the Sharding Key

Several common strategies are used to generate sharding keys, chosen according to business volume and concurrency:

Database auto‑increment primary key: Suitable for low‑concurrency scenarios.

UUID: A 128‑bit identifier that can be generated in Java with java.util.UUID; it is globally unique without any coordination, but it is not human‑readable.

Snowflake algorithm: Produces a 64‑bit integer composed of a sign bit, 41‑bit timestamp, 10‑bit machine identifier, and 12‑bit sequence number, allowing up to 4096 IDs per millisecond per machine.

Open‑source middleware: Examples include Meituan's Leaf and Baidu's Uid‑Generator.
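The Snowflake layout described above can be sketched in a few lines. This is a simplified illustration, not the reference implementation: the class name, the custom epoch (2024-01-01 UTC), the injectable `clock` parameter, and the choice to raise an error when the clock moves backwards are all assumptions.

```python
import threading
import time

EPOCH_MS = 1_704_067_200_000   # assumed custom epoch: 2024-01-01 00:00:00 UTC
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1   # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1    # 4095


class SnowflakeGenerator:
    """Simplified Snowflake-style generator: timestamp | machine | sequence."""

    def __init__(self, machine_id: int, clock=None):
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine_id must fit in 10 bits")
        self._machine_id = machine_id
        # The clock returns milliseconds since the Unix epoch; injectable for tests.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # All 4096 sequence values used: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self._machine_id << SEQUENCE_BITS)
                | self._sequence
            )


if __name__ == "__main__":
    gen = SnowflakeGenerator(machine_id=5)
    print(gen.next_id(), gen.next_id())
```

Because the timestamp occupies the high bits, IDs from one generator increase over time, which tends to suit B-tree primary keys better than random UUIDs. Each machine needs a distinct `machine_id`; how that is assigned is left out of this sketch.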

Conclusion

Effective sharding requires careful selection of the sharding key based on query patterns, and a suitable generation method that balances uniqueness, readability, and performance. Combining vertical and horizontal partitioning with appropriate key strategies enables databases to scale with growing business demands.
