
How to Scale B‑Token Systems with Horizontal Sharding and Consistent Hashing

This article examines the challenges of growing B‑token data volumes, including table size limits and data skew, and proposes a solution using horizontal sharding with a consistent‑hash ring, dynamic table allocation, water‑level thresholds, periodic archiving, and monitoring to support future growth without costly migrations.

JD Cloud Developers

Apr 4, 2023


1. Background

This article describes the B‑token system used in marketing, where many users are bound to a single token and each token is linked to a promotion. A token's lifecycle matches that of its promotion.

2. Current Structure

The token‑user relationship is one‑to‑many. Initially the system used two database shards, later expanded to eight, storing 120 million rows.

3. Data and Business Situation

Data is unevenly distributed because the token ID is the sharding key and tokens vary widely in size: some have a few thousand users, others up to 1.5 million. With only about 20,000 tokens, some shards hold over 30 million rows while others hold only a few million, which degrades read/write performance. Growth is projected at 30 million new rows per month, with the total expected to reach 600 million rows within a year, which the current design cannot handle.

4. Solution Considerations

4.1 How to solve the problem

Reduce single‑table row count.

Address severe data skew in the current sharding scheme.

Support future data growth.

4.2 Technical Options

Database Sharding

Vertical sharding (splitting columns) is unsuitable because the token data is simple and small. Horizontal sharding (splitting rows) is chosen.

Routing Algorithm

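Simple hash‑modulo routing maps a token ID to a fixed shard number. However, adding shards changes the modulus, so most existing keys must be rehashed and migrated, which is costly.

A consistent‑hash ring with virtual nodes reduces the rehashing impact, because adding a shard only moves the keys that fall on the new shard's ring segments. It also improves data distribution, although skew can remain if the virtual nodes are not evenly balanced across shards.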
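The ring can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used by the B‑token system: the class name `HashRing`, the shard names, the MD5‑based hash, and the figure of 160 virtual nodes per shard are all hypothetical choices made for the example.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring (assumed hash choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, shards, vnodes=160):
        self._vnodes = vnodes   # virtual nodes per shard (assumed value)
        self._points = []       # sorted ring positions
        self._owner = {}        # ring position -> shard name
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard):
        # Each shard is placed on the ring many times to smooth the distribution.
        for i in range(self._vnodes):
            point = ring_hash(f"{shard}#{i}")
            self._owner[point] = shard
            bisect.insort(self._points, point)

    def locate(self, token_id):
        # Walk clockwise to the first virtual node after the key's position,
        # wrapping around to the start of the ring if necessary.
        point = ring_hash(str(token_id))
        idx = bisect.bisect_right(self._points, point) % len(self._points)
        return self._owner[self._points[idx]]
```

With this structure, adding a shard only reassigns the keys whose nearest clockwise virtual node now belongs to the new shard; every other key keeps its previous location.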

5. Proposed Design

5.1 Dynamic Expansion

Store the hash‑derived shard number when a token is created; subsequent operations use this stored number, allowing new shards to be added without migrating existing data.
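A minimal sketch of this idea, assuming an in‑memory dictionary in place of the real token table and a plain CRC32 modulo in place of the ring lookup to keep the example short. The names `TokenDirectory`, `create_token`, and `shard_for` are hypothetical.

```python
import zlib


class TokenDirectory:
    """In-memory stand-in for the table that records each token's shard number."""

    def __init__(self, shard_count):
        self.shard_count = shard_count
        self._shard_of = {}  # token_id -> shard number fixed at creation time

    def create_token(self, token_id):
        # Hash only once, at creation; an existing token keeps its stored shard.
        if token_id not in self._shard_of:
            digest = zlib.crc32(str(token_id).encode("utf-8"))
            self._shard_of[token_id] = digest % self.shard_count
        return self._shard_of[token_id]

    def shard_for(self, token_id):
        # Later reads and writes use the stored number and never rehash.
        return self._shard_of[token_id]
```

Because lookups never rehash, raising `shard_count` affects only tokens created afterwards, which is why existing rows do not have to move.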

5.2 Mitigating Data Skew

Introduce a water‑level threshold for each shard. When a shard exceeds the high water mark, it is excluded from new token allocations, which limits further growth of that shard.

Example: with a 10 million row high water mark, shards below the threshold receive new tokens, while overloaded shards stop receiving new data.
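The allocation rule could look like the sketch below. It assumes that per‑shard row counts are available as a dictionary and treats a shard as eligible while it is strictly below the mark; the function names and the CRC32‑based pick among eligible shards are hypothetical and stand in for whatever routing the real system uses.

```python
import zlib

HIGH_WATER_MARK = 10_000_000  # example threshold from the text


def eligible_shards(row_counts, high_water_mark=HIGH_WATER_MARK):
    """Return the shards still below the high water mark, in a stable order."""
    return sorted(s for s, rows in row_counts.items() if rows < high_water_mark)


def allocate_shard(token_id, row_counts, high_water_mark=HIGH_WATER_MARK):
    """Pick a shard for a NEW token, skipping shards at or above the mark."""
    candidates = eligible_shards(row_counts, high_water_mark)
    if not candidates:
        raise RuntimeError("all shards are at or above the high water mark")
    # Deterministic pick among the eligible shards only (assumed strategy).
    idx = zlib.crc32(str(token_id).encode("utf-8")) % len(candidates)
    return candidates[idx]
```

For example, with row counts `{0: 3_000_000, 1: 12_000_000, 2: 9_900_000, 3: 30_000_000}`, only shards 0 and 2 can receive new tokens.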

5.3 Periodic Archiving

Expired tokens are archived regularly, reducing shard size and allowing previously overloaded shards to re‑enter the allocation pool.
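As a rough illustration of how archiving feeds back into allocation, the sketch below removes expired tokens and subtracts their rows from the per‑shard counts. The record layout (`shard`, `rows`, `expires_on`) and the function name `archive_expired` are assumptions for the example; a real job would also copy the rows to archive storage.

```python
from datetime import date

HIGH_WATER_MARK = 10_000_000  # example threshold from the text


def archive_expired(tokens, row_counts, today):
    """Remove expired tokens in place and lower the row count of their shards.

    Returns the archived token records so a caller could move them elsewhere.
    """
    archived = [t for t in tokens if t["expires_on"] < today]
    for t in archived:
        row_counts[t["shard"]] -= t["rows"]
    # Keep only tokens that are still active.
    tokens[:] = [t for t in tokens if t["expires_on"] >= today]
    return archived
```

Once a shard's count drops back below `HIGH_WATER_MARK`, any allocation rule that filters on the current counts will consider it again without further changes.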

5.4 Monitoring

When a majority of shards reach the high water mark, an alert is generated for manual intervention, such as adjusting thresholds or adding new shards.
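The alert condition can be expressed as a small check. This sketch interprets "a majority" as strictly more than half of the shards and counts a shard as having reached the mark when its row count is at or above it; both interpretations, and the function names, are assumptions.

```python
def shards_at_high_water(row_counts, high_water_mark):
    """Return the shards whose row count has reached the high water mark."""
    return sorted(s for s, rows in row_counts.items() if rows >= high_water_mark)


def should_alert(row_counts, high_water_mark):
    """True when strictly more than half of the shards have reached the mark."""
    full = shards_at_high_water(row_counts, high_water_mark)
    return len(full) * 2 > len(row_counts)
```

A scheduled job could run `should_alert` against the latest row counts and notify an operator, who then adjusts the threshold or adds shards.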

6. Limitations

The water‑level threshold is currently set manually; future work includes automating threshold adjustment based on read/write performance metrics.

7. Conclusion

There is no silver bullet; a combination of horizontal sharding, consistent‑hash routing, dynamic water‑level control, archiving, and monitoring provides a practical solution for scaling the B‑token system.
