Mastering Hot‑Key Challenges in High‑Concurrency Flash Sale Systems
This article surveys strategies for handling hot‑key bottlenecks in high‑traffic flash‑sale scenarios, covering direct database access, Redis, key sharding, local caching, failure handling, and scaling techniques across QPS tiers, with the aim of reliable inventory control and a smooth user experience.
The article discusses solutions for hot‑key problems in high‑concurrency flash‑sale (秒杀) scenarios, which also apply to prize‑allocation events, holiday cash‑splitting, and red‑packet grabs.
Why
Three modules are involved: shipping (checks inventory and ships items), settlement (credits money to user accounts), and usage (users withdraw or spend the credited balance). The focus here is the shipping flow, where every request contends on the same inventory record and turns it into a hot key.
How
QPS < 1k
Use the database directly for inventory and user results.
Low‑traffic services can rely on the DB for inventory control.
Read traffic can be offloaded to read‑replicas.
If stock remains, successful users can read their result; otherwise they receive a sold‑out response. Security can be strengthened by caching a blacklist of malicious actors at the entry layer.
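As a sketch of this path, here is a minimal example assuming a relational table like inventory(item_id, stock); the schema, driver, and function names are illustrative, not from the original article. The conditional UPDATE keeps the deduction atomic without explicit locks:

```go
package seckill

import (
	"database/sql"

	_ "github.com/go-sql-driver/mysql" // assumed driver; any SQL driver works
)

// deductOne atomically decrements stock for itemID and reports whether the
// caller won a unit. The `stock > 0` guard keeps stock from going negative
// even under heavy concurrency, with no application-side locking.
func deductOne(db *sql.DB, itemID int64) (bool, error) {
	res, err := db.Exec(
		"UPDATE inventory SET stock = stock - 1 WHERE item_id = ? AND stock > 0",
		itemID,
	)
	if err != nil {
		return false, err
	}
	n, err := res.RowsAffected()
	if err != nil {
		return false, err
	}
	return n == 1, nil // zero rows affected means sold out
}
```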
1k < QPS < 100k
At this traffic level the DB becomes the bottleneck, so move inventory deduction into Redis; a single Redis key can handle up to roughly 100k QPS.
Users first check if they have already succeeded; if so, return immediately.
After successful inventory deduction, ensure settlement succeeds; on failure, retry asynchronously for eventual consistency.
Settlement stores user results in a cache to block repeat requests.
If inventory is exhausted, return sold‑out.
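A minimal sketch of this flow, assuming the go-redis v9 client; the Lua script and key names (seckill:stock:…, seckill:won:…) are illustrative, not from the original article:

```go
package seckill

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// deductStock executes atomically inside Redis: check-then-decrement cannot
// interleave, so two requests can never both consume the last unit.
var deductStock = redis.NewScript(`
local stock = tonumber(redis.call("GET", KEYS[1]) or "0")
if stock <= 0 then
  return -1
end
return redis.call("DECR", KEYS[1])
`)

// tryBuy returns true if the user wins a unit (or had already won one).
func tryBuy(ctx context.Context, rdb *redis.Client, itemID, userID string) (bool, error) {
	// 1. Users who already succeeded return immediately.
	won, err := rdb.SIsMember(ctx, "seckill:won:"+itemID, userID).Result()
	if err != nil || won {
		return won, err
	}
	// 2. Atomic stock deduction; -1 signals sold out.
	left, err := deductStock.Run(ctx, rdb, []string{"seckill:stock:" + itemID}).Int()
	if err != nil {
		return false, err
	}
	if left < 0 {
		return false, nil // sold out
	}
	// 3. Record the winner so repeat requests short-circuit; settlement
	//    runs next, with asynchronous retries on failure.
	return true, rdb.SAdd(ctx, "seckill:won:"+itemID, userID).Err()
}
```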
Failure handling:
On timeout, the caller cannot tell whether the deduction went through; tell the user to retry, then reconcile offline: compensation owed = initial (planned) inventory − successfully settled − remaining stock. For example, with 10,000 planned units, 9,950 settled, and 20 still in stock, 30 users are owed compensation.
On non‑timeout errors, return failure and let user retry.
100k < QPS < 1M
A single Redis key also reaches its throughput ceiling here; shard the key to spread the hot‑key pressure.
Split the inventory across, say, 10 Redis keys selected by hash. Quota control stays precise, but stock becomes fragmented, and a request that lands on an empty shard can see a false out‑of‑stock response while other shards still hold units.
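A minimal sketch of this sharding, continuing the package above and reusing its deductStock script; the key layout is an assumption. Probing the remaining shards when the home shard is empty trades a few extra Redis calls for fewer false sold‑out responses:

```go
package seckill // deductStock is defined in the previous sketch

import (
	"context"
	"fmt"
	"hash/crc32"

	"github.com/redis/go-redis/v9"
)

// tryBuySharded spreads one hot key across `shards` Redis keys. A request
// starts at the user's home shard and probes the rest, so a single empty
// shard does not surface a false "sold out" while stock remains elsewhere.
func tryBuySharded(ctx context.Context, rdb *redis.Client, itemID, userID string, shards int) (bool, error) {
	start := int(crc32.ChecksumIEEE([]byte(userID)) % uint32(shards))
	for i := 0; i < shards; i++ {
		key := fmt.Sprintf("seckill:stock:%s:%d", itemID, (start+i)%shards)
		left, err := deductStock.Run(ctx, rdb, []string{key}).Int()
		if err != nil {
			return false, err
		}
		if left >= 0 {
			return true, nil // won a unit from this shard
		}
	}
	return false, nil // every shard empty: genuinely sold out
}
```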
Alternatively, keep a single inventory record in the DB and have each service node periodically pull a fixed quota into a local cache for pre‑deduction. This cuts DB load but can strand quota or cause inconsistencies when nodes are rescheduled or a local cache runs dry.
A multi‑shard local cache approach addresses these drawbacks: each node keeps several local stock shards, with the shard count computed as max(min(stock, QPS / 100k), 1). New cache nodes load their initial view from the DB into local memory.
A shipping request hashes to a local shard; if that shard shows stock, attempt the deduction against the DB.
If the DB deduction fails because the cached view was stale, correct the local shard (e.g., 1 → 0).
Then select another non‑zero shard and repeat.
When the DB reports less stock than the cache shows, lower the cached value to match (e.g., 9 → 6).
If every local shard reads zero, inventory is exhausted; like a Bloom filter, the cache gives a one‑sided guarantee: zero locally means no stock in the DB, while non‑zero only means stock may remain. A sketch of this loop follows.
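A compact sketch of that loop, reusing the deductOne conditional UPDATE from the first sketch as the DB source of truth; the structures are illustrative, and the 9 → 6 down‑adjustment would additionally need the DB call to report remaining stock:

```go
package seckill // deductOne (conditional UPDATE) comes from the first sketch

import (
	"database/sql"
	"hash/crc32"
	"sync"
)

type localShard struct {
	mu    sync.Mutex
	stock int64 // optimistic local view; may run ahead of real DB stock
}

type localCache struct {
	shards []*localShard
}

// shardCount implements the article's formula: max(min(stock, qps/100k), 1).
func shardCount(stock, qps int64) int64 {
	n := qps / 100_000
	if n > stock {
		n = stock
	}
	if n < 1 {
		n = 1
	}
	return n
}

// tryShip lets local shards filter traffic while the DB makes the final call.
func (c *localCache) tryShip(db *sql.DB, itemID int64, userID string) (bool, error) {
	start := int(crc32.ChecksumIEEE([]byte(userID)) % uint32(len(c.shards)))
	for i := 0; i < len(c.shards); i++ {
		s := c.shards[(start+i)%len(c.shards)]
		s.mu.Lock()
		empty := s.stock <= 0
		s.mu.Unlock()
		if empty {
			continue // probe the next non-zero shard
		}
		ok, err := deductOne(db, itemID) // the DB has the final say
		if err != nil {
			return false, err
		}
		s.mu.Lock()
		if ok {
			s.stock-- // local view follows the DB down
			s.mu.Unlock()
			return true, nil
		}
		s.stock = 0 // DB refused: the cached view was stale (1 → 0)
		s.mu.Unlock()
	}
	// Every local shard reads zero: like a Bloom filter's "definitely not
	// present", this guarantees the DB has no stock left either.
	return false, nil
}
```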
This design removes the intermittent "sold out, then available" experience: later users can still succeed after earlier attempts failed, while the DB sees only a trickle of confirmed deductions.
When the DB has run out of stock but local caches still show stock, up to 10 concurrent deduction attempts may reach the DB before the caches converge; testing showed a request amplification of only 1.003756× at 200k QPS.
After admin replenishment, cache nodes periodically sync inventory.
1M < QPS < 10M
Referencing WeChat’s 2015 Spring Festival red‑packet system:
Deploy the red‑packet service independently.
Place all logic at the entry layer to reduce downstream calls.
Pace the release so red‑packets are distributed evenly across the event window, and reclaim unused ones (a rate‑control sketch follows after this list).
Only users who actually grab a red‑packet proceed to settlement.
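As an illustration of the rate‑control point above, here is a minimal entry‑layer sketch using golang.org/x/time/rate; the 100k‑per‑second release rate is an assumed figure, not one from the article:

```go
package seckill

import "golang.org/x/time/rate"

// Release packets evenly over the event window: admit at most 100k grabs
// per second and turn everyone else away at the door, so downstream
// allocation and settlement only ever see admitted users.
var release = rate.NewLimiter(rate.Limit(100_000), 100_000)

func handleGrab() string {
	if !release.Allow() {
		return "better luck next time" // shed load at the entry layer
	}
	return "grabbed" // only these users proceed to settlement
}
```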
QPS > 10M
Beyond system capacity, employ loss‑tolerant services at the entry layer to throttle traffic. Various loss strategies exist and will be covered in a dedicated article.
Conclusion
Prefer DB if it can handle the load.
Prefer a single NoSQL key if it can handle the load.
When single‑key pressure exceeds capacity, use sharding.
For massive traffic, combine sharding with proactive local caching.
If traffic exceeds system limits, apply loss‑tolerant services.