How to Scale B‑Token Systems with Horizontal Sharding and Consistent Hashing
This article examines the challenges of growing B‑token data volumes, including table size limits and data skew, and proposes a solution using horizontal sharding with a consistent‑hash ring, dynamic table allocation, water‑level thresholds, periodic archiving, and monitoring to support future growth without costly migrations.
1. Background
This article describes the B‑token system used in marketing, where many users are bound to a token that is linked to a promotion. The token lifecycle matches the promotion.
2. Current Structure
The token‑user relationship is one‑to‑many. Initially the system used two database shards, later expanded to eight, storing 120 million rows.
3. Data and Business Situation
Data is unevenly distributed because token IDs are used as the sharding key; some tokens have a few thousand users, others have up to 1.5 million. With only about 20,000 tokens, some shards hold over 30 million rows while others hold only a few million, causing read/write performance degradation. Future growth expects 30 million new rows per month, reaching 600 million rows in a year, which the current design cannot handle.
4. Solution Considerations
4.1 How to solve the problem
Reduce single‑table row count.
Address severe data skew in the current sharding scheme.
Support future data growth.
4.2 Technical Options
Database Sharding
Vertical sharding (splitting columns) is unsuitable because the token data is simple and small. Horizontal sharding (splitting rows) is chosen.
Routing Algorithm
Consistent hashing can map a token ID to a fixed shard number. However, rehashing is required when adding shards, which can be costly and may cause skew if the number of virtual nodes is not balanced.
A consistent‑hash ring with virtual nodes reduces rehashing impact and improves data distribution.
5. Proposed Design
5.1 Dynamic Expansion
Store the hash‑derived shard number when a token is created; subsequent operations use this stored number, allowing new shards to be added without migrating existing data.
5.2 Mitigating Data Skew
Introduce a water‑level threshold for each shard. When a shard exceeds the high water mark, it is excluded from new token allocations, preventing further growth of that shard.
Example: with a 10 million row high water mark, shards below the threshold receive new tokens, while overloaded shards stop receiving new data.
5.3 Periodic Archiving
Expired tokens are archived regularly, reducing shard size and allowing previously overloaded shards to re‑enter the allocation pool.
5.4 Monitoring
When a majority of shards reach the high water mark, an alert is generated for manual intervention, such as adjusting thresholds or adding new shards.
6. Limitations
The water‑level threshold is currently set manually; future work includes automating threshold adjustment based on read/write performance metrics.
7. Conclusion
There is no silver bullet; a combination of horizontal sharding, consistent‑hash routing, dynamic water‑level control, archiving, and monitoring provides a practical solution for scaling the B‑token system.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
