
How to Build a High‑Performance Flash Sale System for Massive Concurrency

This guide explains how to design a robust flash‑sale (秒杀) system that handles instant high traffic by combining page staticization, CDN acceleration, caching strategies, distributed locks, optimistic DB updates, Redis atomic operations, message‑queue async processing, rate limiting, and other techniques to prevent overload, data loss, and unfair access.

dbaplus Community

Nov 11, 2021

1. Instant High Concurrency

During a flash sale, traffic climbs sharply in the few minutes before the start time and peaks at the exact second the sale begins, producing a burst of requests that lasts only a short period. A system sized for ordinary traffic struggles with this pattern, so the architecture has to be designed around the burst.

Key techniques to handle the burst include:

Page staticization

CDN acceleration

Caching

MQ asynchronous processing

Rate limiting

Distributed locking

2. Page Staticization

The activity page is the entry point and receives the highest request volume. Rendering it as a static page eliminates most server calls, leaving only the moment when a user clicks the flash‑sale button to hit the backend.

Static pages are served from a CDN, allowing users to fetch the content from the nearest edge node, reducing latency and network congestion.

3. Flash‑Sale Button Control

Before the sale starts the button is disabled (grey). At the exact start time a JavaScript file toggles the button to an active state. To avoid excessive clicks, a client‑side timer can limit requests (e.g., only one request per 10 seconds).

Static resources (CSS, JS, images) are cached on the CDN. The URL of the JS file carries a random version parameter that changes for each sale, so the CDN treats it as a new resource and fetches the latest script instead of serving a stale cached copy.

4. Read‑Many‑Write‑Few Pattern

Flash‑sale logic first checks stock; only successful purchases write to the database. Because most requests read the same product, a cache (Redis) is used to avoid overwhelming the DB.

5. Cache Problems and Solutions

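Cache breakdown: when the cached entry for a hot product is missing or has expired, many concurrent requests miss the cache at the same moment and all fall through to the DB. The fix is to guard the cache rebuild with a distributed lock, so that only one request loads from the DB while the others wait and then read the cache, and to pre-warm the cache with all product data before the sale starts.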
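The lock-then-recheck idea can be sketched in plain Java. This is a minimal, single-process sketch: a `ConcurrentHashMap` stands in for Redis and a `ReentrantLock` per key stands in for the distributed lock, and `HotKeyCache`, `preWarm`, and `dbHits` are illustrative names that do not come from the article.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** In-process stand-in for "Redis cache + distributed lock" (assumption, for illustration only). */
class HotKeyCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final Function<String, String> dbLoader;
    /** Counts how often the (simulated) database was queried. */
    final AtomicInteger dbHits = new AtomicInteger();

    HotKeyCache(Function<String, String> dbLoader) {
        this.dbLoader = dbLoader;
    }

    /** Pre-warm: load known products before the sale so the first burst hits the cache. */
    void preWarm(Iterable<String> productIds) {
        for (String id : productIds) {
            String value = load(id);
            if (value != null) cache.put(id, value);
        }
    }

    String get(String productId) {
        String value = cache.get(productId);
        if (value != null) return value;              // fast path: cache hit

        ReentrantLock lock = locks.computeIfAbsent(productId, k -> new ReentrantLock());
        lock.lock();                                   // only one caller per key rebuilds the entry
        try {
            value = cache.get(productId);              // re-check: another caller may have filled it
            if (value == null) {
                value = load(productId);
                if (value != null) cache.put(productId, value);
            }
            return value;
        } finally {
            lock.unlock();
        }
    }

    private String load(String productId) {
        dbHits.incrementAndGet();
        return dbLoader.apply(productId);
    }
}
```

The second `cache.get` inside the lock is what keeps the database at one query per missing key: callers that were queued on the lock find the value already cached and return without loading.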

Cache penetration: requests for nonexistent product IDs can never be served from the cache, so every one of them hits the DB. A Bloom filter can quickly reject such IDs, and a “negative cache” can store the fact that an ID does not exist for a short TTL.
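Both defences can be sketched together. The block below is a minimal illustration, assuming numeric product IDs and a caller-supplied clock; the hand-rolled Bloom filter, the hash mixing constants, and the names `ProductIdFilter`, `rememberMissing`, and `shouldReject` are choices made for this example, not something the article specifies. A production system would more likely use a library or Redis-side Bloom filter.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Bloom filter over product IDs plus a short-lived negative cache (illustrative sketch). */
class ProductIdFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;
    // Negative cache: product ID -> time (millis) at which the "does not exist" entry expires.
    private final Map<Long, Long> knownMissing = new ConcurrentHashMap<>();

    ProductIdFilter(int size, int hashes) {
        if (size <= 0 || hashes <= 0) throw new IllegalArgumentException("size and hashes must be positive");
        this.size = size;
        this.hashes = hashes;
        this.bits = new BitSet(size);
    }

    // Derives the i-th bit position for an ID; the mixing constants are arbitrary odd numbers.
    private int index(long id, int i) {
        long h = (id + 1) * 0x9E3779B97F4A7C15L + (long) i * 0xC2B2AE3D27D4EB4FL;
        h ^= (h >>> 31);
        return (int) Math.floorMod(h, (long) size);
    }

    /** Register an existing product ID, for example while pre-warming the cache. */
    synchronized void add(long id) {
        for (int i = 0; i < hashes; i++) bits.set(index(id, i));
    }

    /** False means "definitely never added"; true means "possibly added" (false positives happen). */
    synchronized boolean mightContain(long id) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(id, i))) return false;
        }
        return true;
    }

    /** Record that the database had no row for this ID, valid for ttlMillis. */
    void rememberMissing(long id, long nowMillis, long ttlMillis) {
        knownMissing.put(id, nowMillis + ttlMillis);
    }

    /** True when the request can be rejected without touching the database. */
    boolean shouldReject(long id, long nowMillis) {
        if (!mightContain(id)) return true;
        Long expiresAt = knownMissing.get(id);
        if (expiresAt == null) return false;
        if (expiresAt <= nowMillis) {
            knownMissing.remove(id);                   // entry expired: let the next lookup go through
            return false;
        }
        return true;
    }
}
```

The Bloom filter never produces a false negative, so a real product is never rejected by it; the negative cache covers the IDs that slip through as false positives.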

6. Stock Management

Two main approaches are shown.

Database optimistic lock – update only when stock > 0:

update product set stock = stock - 1 where id = 123 and stock > 0;
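The effect of the `stock > 0` condition can be illustrated without a database. The sketch below is an in-memory analogue only: a compare-and-set loop on an `AtomicInteger` plays the role of the conditional UPDATE, and `StockCounter` and `tryDeduct` are hypothetical names. With JDBC, the equivalent success signal would be the affected-row count returned by `executeUpdate()`.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory analogue of "update ... set stock = stock - 1 where stock > 0" (illustrative). */
class StockCounter {
    private final AtomicInteger stock;

    StockCounter(int initialStock) {
        this.stock = new AtomicInteger(initialStock);
    }

    /** Returns true if one unit was taken; false once stock has reached zero. */
    boolean tryDeduct() {
        while (true) {
            int current = stock.get();
            if (current <= 0) return false;                     // the "stock > 0" condition failed
            // Succeeds only if nobody changed the value since it was read; otherwise retry.
            if (stock.compareAndSet(current, current - 1)) return true;
        }
    }

    int remaining() {
        return stock.get();
    }
}
```

However many threads call `tryDeduct()` concurrently, the number of successes equals the initial stock and the counter never goes below zero, which is the property the conditional UPDATE gives at the database level.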

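Redis decrement using INCRBY. INCRBY itself is atomic, but the stock check and the decrement below are separate commands, so two concurrent requests can both pass the check on the last unit:

// -1: this user has already bought the product
boolean exist = redisClient.query(productId, userId);
if (exist) { return -1; }
// 0: sold out
int stock = redisClient.queryStock(productId);
if (stock <= 0) { return 0; }
// Not atomic with the check above: concurrent requests can drive stock negative.
redisClient.incrby(productId, -1);
// Record the purchase so the same user cannot buy again.
redisClient.add(productId, userId);
return 1;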

To avoid negative stock under high concurrency, a Lua script can perform the whole check‑and‑decrement atomically:

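-- KEYS[1] is the stock key for the product.
if (redis.call('exists', KEYS[1]) == 1) then
  local stock = tonumber(redis.call('get', KEYS[1]));
  -- A stock of -1 is treated as unlimited: succeed without decrementing.
  if (stock == -1) then return 1; end;
  if (stock > 0) then
    redis.call('incrby', KEYS[1], -1);
    return stock; -- success: returns the stock before the decrement
  end;
  return 0; -- sold out
end;
return -1; -- the product key does not exist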
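On the Java side, the caller has to turn the script's numeric result into a decision. A small sketch of that mapping follows; the enum `DeductOutcome` and its constant names are assumptions made for this example, and only the three result ranges (-1, 0, positive) come from the script above.

```java
/** Maps the Lua script's return value to an outcome (names are illustrative). */
enum DeductOutcome {
    PRODUCT_NOT_FOUND,   // script returned -1: the stock key does not exist
    SOLD_OUT,            // script returned 0: no stock left
    SUCCESS;             // script returned a positive number: one unit was reserved

    static DeductOutcome fromScriptResult(long result) {
        if (result == -1) return PRODUCT_NOT_FOUND;
        if (result == 0) return SOLD_OUT;
        if (result > 0) return SUCCESS;
        throw new IllegalArgumentException("unexpected script result: " + result);
    }
}
```

Keeping the mapping in one place avoids scattering the magic numbers -1, 0, and 1 through the order-handling code.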

7. Distributed Locking

Redis SETNX plus a separate EXPIRE is not atomic. The safer command is:

String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
if ("OK".equals(result)) { return true; }
return false;

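Lock release must verify the owner, so that a client never deletes a lock that has already expired and been re-acquired by another client:

// Compare from requestId so that a missing key (null) does not throw a NullPointerException.
if (requestId.equals(jedis.get(lockKey))) { jedis.del(lockKey); return true; }
return false;

The GET and DEL above are still two separate commands, so the lock can change hands between them. A Lua script makes the check‑and‑delete atomic: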

if redis.call('get', KEYS[1]) == ARGV[1] then
  return redis.call('del', KEYS[1])
else
  return 0
end
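The intended semantics of the acquire and release steps can be modelled in memory, which makes them easy to reason about and test without a Redis server. This is only a model under stated assumptions: `InMemoryLockStore` is a hypothetical class, time is passed in explicitly as `nowMillis`, and a `ConcurrentHashMap` replaces Redis.

```java
import java.util.concurrent.ConcurrentHashMap;

/** In-memory model of "SET key value NX PX ttl" plus owner-checked release (illustrative). */
class InMemoryLockStore {
    private static final class Entry {
        final String owner;
        final long expiresAt;
        Entry(String owner, long expiresAt) {
            this.owner = owner;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();

    /** Succeeds only if the key is absent or its previous holder's TTL has run out. */
    boolean tryLock(String key, String requestId, long ttlMillis, long nowMillis) {
        Entry fresh = new Entry(requestId, nowMillis + ttlMillis);
        Entry stored = entries.compute(key,
                (k, old) -> (old == null || old.expiresAt <= nowMillis) ? fresh : old);
        return stored == fresh;
    }

    /** Removes the key only if requestId still owns it; check and delete happen as one step. */
    boolean unlock(String key, String requestId, long nowMillis) {
        boolean[] released = {false};
        entries.computeIfPresent(key, (k, old) -> {
            if (old.expiresAt <= nowMillis) return null;        // already expired: nothing to release
            if (old.owner.equals(requestId)) {
                released[0] = true;
                return null;                                    // owner matches: delete the lock
            }
            return old;                                         // held by someone else: leave it
        });
        return released[0];
    }
}
```

`compute` and `computeIfPresent` run their function atomically per key, which mirrors why the Redis version needs a single command for acquisition and a script for release.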

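A spin lock retries SET with NX PX at short intervals until it either acquires the lock or a timeout elapses, at which point it gives up. The retry loop belongs inside a try block so that the finally clause always attempts the release:

try {
  long start = System.currentTimeMillis();
  while (true) {
    String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
    // Lock acquired: run the protected business logic here, then return.
    if ("OK".equals(result)) { return true; }
    // Give up once the total wait reaches the timeout.
    if (System.currentTimeMillis() - start >= timeout) { return false; }
    try { Thread.sleep(50); } catch (InterruptedException e) { Thread.currentThread().interrupt(); return false; }
  }
} finally {
  // unlock() deletes the key only if requestId still owns it, so this is safe on every path.
  unlock(lockKey, requestId);
}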

8. MQ Asynchronous Processing

The flash‑sale flow is split: the high‑traffic “buy” step is kept fast, while order creation and payment are handled asynchronously via a message queue.

To avoid message loss, a “message‑send table” records each message before it is sent; the consumer updates the status after processing. A retry job re‑sends messages that remain in the pending state.

Duplicate consumption is prevented by a “message‑process table” that records processed message IDs within the same transaction as the order write.
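A minimal sketch of the idempotent consumer follows. It assumes that each message carries a unique ID; the `synchronized` method stands in for the database transaction, and `OrderConsumer`, `handle`, and the in-memory collections are hypothetical stand-ins for the message-process table and the order table.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Idempotent order consumer backed by an in-memory "message-process table" (illustrative). */
class OrderConsumer {
    private final Set<String> processedMessageIds = new HashSet<>();
    private final List<String> orders = new ArrayList<>();

    /**
     * Handles one delivery. Recording the message ID and writing the order happen
     * together (synchronized here, one transaction in a real database).
     * Returns false when the message was already processed.
     */
    synchronized boolean handle(String messageId, String orderPayload) {
        if (!processedMessageIds.add(messageId)) {
            return false;                       // duplicate delivery: skip the order write
        }
        orders.add(orderPayload);
        return true;
    }

    synchronized int orderCount() {
        return orders.size();
    }
}
```

In a relational database the same effect is commonly achieved with a unique key on the message ID, so that the second insert fails and rolls back the duplicate order write.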

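A message that can never be processed successfully would otherwise be re-sent forever and pile up as garbage. Capping the number of retries stops the retry job from re-sending it once the limit is reached.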
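The message-send table, the retry job, and the retry cap fit together as in the sketch below. It is a simplified in-memory model: `MessageSendTable`, the `PENDING`/`DONE`/`FAILED` statuses, and `runRetryJob` are names invented for this example, and the `resend` callback stands in for the real MQ producer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory model of a message-send table with a capped retry job (illustrative). */
class MessageSendTable {
    enum Status { PENDING, DONE, FAILED }

    private static final class Row {
        final String body;
        Status status = Status.PENDING;
        int retries = 0;
        Row(String body) { this.body = body; }
    }

    private final Map<String, Row> rows = new LinkedHashMap<>();
    private final int maxRetries;

    MessageSendTable(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Called by the producer before the message is handed to the MQ. */
    synchronized void record(String messageId, String body) {
        rows.put(messageId, new Row(body));
    }

    /** Called once the consumer reports successful processing. */
    synchronized void markDone(String messageId) {
        Row row = rows.get(messageId);
        if (row != null) row.status = Status.DONE;
    }

    synchronized Status statusOf(String messageId) {
        Row row = rows.get(messageId);
        return row == null ? null : row.status;
    }

    /** Re-sends every pending message; rows that used up their retries are marked FAILED. */
    synchronized List<String> runRetryJob(Consumer<String> resend) {
        List<String> resentIds = new ArrayList<>();
        for (Map.Entry<String, Row> entry : rows.entrySet()) {
            Row row = entry.getValue();
            if (row.status != Status.PENDING) continue;
            if (row.retries >= maxRetries) {
                row.status = Status.FAILED;     // stop retrying; leave for manual inspection
                continue;
            }
            row.retries++;
            resend.accept(row.body);
            resentIds.add(entry.getKey());
        }
        return resentIds;
    }
}
```

Because a re-sent message may reach the consumer more than once, this producer-side retry only works safely together with an idempotent consumer.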

Unpaid orders are cancelled after a timeout using a delayed‑queue (e.g., RocketMQ delayed messages) that checks order status and rolls back stock if necessary.
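What the delayed-message handler does when the timeout fires can be sketched as follows. The delay mechanism itself is left out, since that is the MQ's job; `OrderTimeoutHandler`, its status names, and `onDelayedMessage` are hypothetical, and in-memory fields stand in for the order table and the stock counter.

```java
import java.util.HashMap;
import java.util.Map;

/** Handler invoked when an order's payment timeout message arrives (illustrative). */
class OrderTimeoutHandler {
    enum Status { UNPAID, PAID, CANCELLED }

    private final Map<String, Status> orders = new HashMap<>();
    private int stock;

    /** remainingStock is the stock left after the orders below were deducted. */
    OrderTimeoutHandler(int remainingStock) {
        this.stock = remainingStock;
    }

    synchronized void createOrder(String orderId) {
        orders.put(orderId, Status.UNPAID);
    }

    synchronized void markPaid(String orderId) {
        orders.replace(orderId, Status.UNPAID, Status.PAID);
    }

    /**
     * Cancels the order and returns one unit to stock only if it is still unpaid.
     * Returns false for paid, already cancelled, or unknown orders, so a
     * redelivered timeout message cannot roll back stock twice.
     */
    synchronized boolean onDelayedMessage(String orderId) {
        if (orders.get(orderId) != Status.UNPAID) return false;
        orders.put(orderId, Status.CANCELLED);
        stock++;
        return true;
    }

    synchronized int stock() {
        return stock;
    }
}
```

Checking the status before rolling back is the important step: the timeout message is sent for every order, including those that were paid in time.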

9. Rate Limiting Strategies

To prevent bots from overwhelming the sale, several limits can be applied:

Per‑user limit (e.g., 5 requests per minute)

Per‑IP limit (e.g., 5 requests per minute)

Per‑API limit

CAPTCHA verification before the request

Business‑level gating (only members or high‑level users may participate)
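The per-user and per-IP limits in the list above can share one limiter keyed by a string. The sketch below uses a fixed-window counter, which is one simple choice among several (sliding windows and token buckets are common alternatives); `FixedWindowLimiter` and the key format are assumptions for this example, and the state is in-process, whereas a multi-node deployment would keep the counters in a shared store such as Redis.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Fixed-window request limiter keyed by user ID or IP (single-process sketch). */
class FixedWindowLimiter {
    private final int limit;
    private final long windowMillis;
    // Per key: element 0 is the window start time, element 1 is the request count in that window.
    private final ConcurrentHashMap<String, long[]> windows = new ConcurrentHashMap<>();

    FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request is within the limit for this key's current window. */
    boolean allow(String key, long nowMillis) {
        boolean[] allowed = {false};
        windows.compute(key, (k, w) -> {
            if (w == null || nowMillis - w[0] >= windowMillis) {
                w = new long[] {nowMillis, 0};          // start a new window
            }
            if (w[1] < limit) {
                w[1]++;
                allowed[0] = true;
            }
            return w;
        });
        return allowed[0];
    }
}
```

For example, `new FixedWindowLimiter(5, 60_000)` with keys such as `"user:42"` or `"ip:10.0.0.1"` gives the "5 requests per minute" limits mentioned in the list. The decision is made inside `compute`, so concurrent requests for the same key cannot both take the last slot.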

Combining these techniques creates a flash‑sale system that can survive instant massive traffic, keep data consistent, and provide a fair experience for genuine users.
