
How to Build a High‑Performance Flash Sale System: 9 Essential Design Tips

A flash-sale (秒杀) system must absorb an instant traffic spike from a massive number of concurrent users while staying stable and never overselling. This article walks through nine design details, covering page staticization, CDN acceleration, caching strategies, reliable stock management, distributed locks, asynchronous processing, and rate limiting.

Su San Talks Tech

Jul 29, 2021


Preface

How to design a flash‑sale (秒杀) system under high concurrency? This common interview question looks simple but actually tests knowledge from front‑end to back‑end in high‑traffic scenarios.

Flash sales are e-commerce promotions in which a limited quantity of goods (e.g., 10 smartphones) is offered at an extremely low price (e.g., 0.1 CNY). Many users compete for the few items at the same moment and only a handful can purchase successfully, which is what makes the technical requirements demanding.

Below are nine key details to consider when designing a flash‑sale system.

1. Instant High Concurrency

In the minutes before the scheduled flash‑sale time (e.g., 12:00), user traffic surges dramatically and peaks at the exact moment.

Because the scenario is "many users, few items" (狼多肉少), most users will fail, and the system experiences a very short burst of peak traffic.

Traditional systems struggle with this pattern; a flash-sale system needs an architecture built for the spike, drawing on the following techniques:

Page staticization

CDN acceleration

Caching

MQ asynchronous processing

Rate limiting

Distributed lock

2. Page Staticization

The activity page is the first entry point and receives the highest traffic. Directly serving every request from the back‑end would overwhelm the server.

Most page content (product name, description, images) is static, so we should generate a static version of the page and only invoke the back‑end when the user clicks the flash‑sale button at the exact time.

Because users are geographically distributed, we need a CDN to deliver the static page quickly.

3. Flash‑Sale Button

Before the sale starts, the button is disabled (grey). It becomes clickable only at the exact sale moment, which forces users to wait for the activation.

We control the button state with a JavaScript file that is cached on the CDN for performance.

4. Read‑Heavy Write‑Light

During the sale, the system first checks inventory; if sufficient, it proceeds to create an order and write to the database. Otherwise, it returns "sold out".

This is a classic read‑heavy/write‑light scenario, best handled with a cache such as Redis rather than direct database queries.

5. Cache Issues

Product information (id, name, specs, stock) should be stored in Redis while the database holds the source of truth.

When a request arrives, the flow is: query Redis → if miss, query DB → populate Redis → proceed; if not found in DB, return failure.
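The lookup flow above can be sketched as follows. This is a minimal illustration, not the article's implementation: two in-memory maps stand in for Redis and the database, and the class name `ProductLookup` and the `dbReads` counter are hypothetical names introduced only for this example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Cache-aside lookup; two in-memory maps stand in for Redis and the database. */
final class ProductLookup {
    private final Map<Long, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Map<Long, String> database;                           // stand-in for the DB
    int dbReads = 0; // counts DB round trips (single-threaded demo only)

    ProductLookup(Map<Long, String> database) {
        this.database = database;
    }

    /** Returns the product, or empty if neither the cache nor the database has it. */
    Optional<String> find(long productId) {
        String cached = cache.get(productId);
        if (cached != null) {
            return Optional.of(cached);           // cache hit: no DB access
        }
        dbReads++;
        String fromDb = database.get(productId);  // cache miss: fall back to the DB
        if (fromDb == null) {
            return Optional.empty();              // not in the DB either: fail the request
        }
        cache.put(productId, fromDb);             // populate the cache for later requests
        return Optional.of(fromDb);
    }
}
```

Calling `find` twice for the same existing id reads the database only once; the second call is served from the cache.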

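5.1 Cache Breakdown

If many concurrent requests query a product whose cache entry is missing or has just expired, they all hit the database at once, potentially causing a crash.

The solution is to take a distributed lock (e.g., a Redis lock) so that only one request loads the product from the database and repopulates the cache, and to pre-warm the cache with all product data before the activity starts.

5.2 Cache Penetration

When requests carry a product id that exists in neither the cache nor the database, every one of them misses the cache and falls through to the DB, which can overwhelm it. A lock limits the number of concurrent DB queries, and a Bloom filter can reject ids that are definitely not in the product set before they reach the DB.

5.3 Storing Negative Results

Alternatively, cache the fact that a product does not exist, with a short TTL, so that repeated requests for the same missing id do not hit the DB again.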
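A minimal sketch of caching negative results, under stated assumptions: an in-memory set stands in for the product table, the current time is passed in as `nowMillis` so the behavior is deterministic, and the class name `NegativeCache` is hypothetical. In a real deployment the negative entry would live in Redis with an expiry rather than in a local map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Negative caching: remembers "no such product" for a short time. */
final class NegativeCache {
    private final Set<Long> database;                              // stand-in for the product table (ids only)
    private final Map<Long, Long> missingUntil = new HashMap<>();  // id -> expiry time of the negative entry (ms)
    private final long ttlMillis;
    int dbReads = 0;                                               // counts DB round trips (demo only)

    NegativeCache(Set<Long> database, long ttlMillis) {
        this.database = database;
        this.ttlMillis = ttlMillis;
    }

    /** Returns whether the product exists; nowMillis is supplied by the caller. */
    boolean exists(long productId, long nowMillis) {
        Long expiry = missingUntil.get(productId);
        if (expiry != null && nowMillis < expiry) {
            return false;                                   // cached negative result: DB not touched
        }
        dbReads++;
        boolean found = database.contains(productId);
        if (found) {
            missingUntil.remove(productId);                 // drop any stale negative entry
        } else {
            missingUntil.put(productId, nowMillis + ttlMillis); // remember the miss until the TTL passes
        }
        return found;
    }
}
```

The short TTL is a trade-off: it bounds how long a product added after the miss stays invisible, at the cost of one extra DB query per missing id per TTL period.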

6. Stock Management

Beyond simple decrement, we need a pre‑deduction (预扣库存) mechanism that can roll back stock if payment is not completed within a timeout.
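The pre-deduction idea can be illustrated with a small in-memory model. This is only a sketch under assumptions: the class name `StockReservations`, its method names, and the explicit `nowMillis` parameter are hypothetical, and a real system would keep this state in Redis or the database rather than in one JVM.

```java
import java.util.HashMap;
import java.util.Map;

/** Pre-deducted stock: a reservation holds one unit until it is paid or expires. */
final class StockReservations {
    private int available;
    private final long holdMillis;
    private final Map<String, Long> holds = new HashMap<>(); // orderId -> expiry time in ms

    StockReservations(int stock, long holdMillis) {
        this.available = stock;
        this.holdMillis = holdMillis;
    }

    /** Reserves one unit for the order; false when sold out or the order already holds a unit. */
    synchronized boolean reserve(String orderId, long nowMillis) {
        if (available <= 0 || holds.containsKey(orderId)) {
            return false;
        }
        available--;
        holds.put(orderId, nowMillis + holdMillis);
        return true;
    }

    /** Payment arrived before the hold expired: the unit is sold for good. */
    synchronized boolean confirmPaid(String orderId, long nowMillis) {
        Long expiry = holds.get(orderId);
        if (expiry == null || nowMillis >= expiry) {
            return false;
        }
        holds.remove(orderId);
        return true;
    }

    /** Rolls back expired, unpaid reservations and returns how many units were restored. */
    synchronized int releaseExpired(long nowMillis) {
        int before = holds.size();
        holds.values().removeIf(expiry -> nowMillis >= expiry);
        int released = before - holds.size();
        available += released;
        return released;
    }

    synchronized int available() {
        return available;
    }
}
```

A unit released by `releaseExpired` becomes purchasable again, which is the rollback described above.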

6.1 Database Stock Decrement

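The simplest SQL is:

update product set stock=stock-1 where id=123;

This decrements even when the stock is already 0, so it can oversell. To avoid overselling, the check and the update must be atomic, e.g., using an optimistic lock in the form of a conditional update:

update product set stock=stock-1 where id=123 and stock>0;

The statement affects one row when a unit was taken and zero rows when the product is sold out, so the caller can tell the two cases apart.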
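To show why folding the stock check into the decrement prevents overselling, here is an in-memory analogue of the guarded update. It is not database code: `AtomicInteger.compareAndSet` plays the role of the conditional update, the boolean return value mirrors the affected-row count, and the class name `GuardedStock` is hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory analogue of a decrement guarded by "stock > 0". */
final class GuardedStock {
    private final AtomicInteger stock;

    GuardedStock(int initial) {
        this.stock = new AtomicInteger(initial);
    }

    /** Returns true if one unit was taken; mirrors "1 row affected" in the SQL version. */
    boolean tryDecrement() {
        while (true) {
            int current = stock.get();
            if (current <= 0) {
                return false;                 // guard failed: like 0 rows affected
            }
            if (stock.compareAndSet(current, current - 1)) {
                return true;                  // check and decrement applied as one step
            }
            // Another thread changed the stock in between; re-read and retry.
        }
    }

    int remaining() {
        return stock.get();
    }
}
```

With 10 units and any number of concurrent callers, exactly 10 calls to `tryDecrement` return true and the stock never goes below zero.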

6.2 Redis Stock Decrement

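Redis incrby is atomic on its own. A first attempt in pseudocode:

// -1: this user has already bought the product
boolean exist = redisClient.query(productId, userId);
if (exist) { return -1; }
// 0: sold out
int stock = redisClient.queryStock(productId);
if (stock <= 0) { return 0; }
// 1: success; decrement the stock and record the purchase
redisClient.incrby(productId, -1);
redisClient.add(productId, userId);
return 1;

The stock query and the incrby are separate commands, however, so the sequence as a whole is not atomic: under high concurrency several requests can all read stock > 0 and then all decrement, overselling the product.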

6.3 Lua Script for Atomic Decrement

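Redis runs a Lua script as a single atomic operation, so the existence check, the stock check, and the decrement cannot interleave with other requests:

StringBuilder lua = new StringBuilder();
// Key missing: falls through to the final "return -1".
lua.append("if (redis.call('exists', KEYS[1]) == 1) then");
lua.append("    local stock = tonumber(redis.call('get', KEYS[1]));");
// A stock of -1 marks unlimited stock: succeed without decrementing.
lua.append("    if (stock == -1) then return 1; end;");
// Stock left: decrement and return the value read before the decrement.
lua.append("    if (stock > 0) then redis.call('incrby', KEYS[1], -1); return stock; end;");
// Sold out: return 0.
lua.append("    return 0; end; return -1;");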

7. Distributed Lock

When many requests miss the cache, they all hit the DB. A Redis distributed lock prevents this.

7.1 setNx Lock

if (jedis.setnx(lockKey, val) == 1) {
    jedis.expire(lockKey, timeout);
}

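Because setnx and expire are two separate commands, the pair is not atomic: if the process crashes after setnx succeeds but before expire runs, the lock never expires and is never released.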

7.2 set with NX PX

String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
if ("OK".equals(result)) { return true; }
return false;

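This single command sets the value and the expiry atomically. The signature shown is the Jedis 2.x form; Jedis 3 and later take a SetParams argument instead, e.g. SetParams.setParams().nx().px(expireTime).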

7.3 Unlock

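// Delete the lock only if it is still held by this request.
// requestId.equals(...) avoids a NullPointerException when the key has already expired.
if (requestId.equals(jedis.get(lockKey))) {
    jedis.del(lockKey);
    return true;
}
return false;

The get and the del above are two separate commands: the lock can expire and be acquired by another request between them, and the del would then remove that request's lock. Using a Lua script makes the check-and-delete atomic: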

if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
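The ownership rule behind requestId can be demonstrated without Redis. In this in-memory analogue, `ConcurrentHashMap.putIfAbsent` plays the role of set with NX and `remove(key, value)` plays the role of the atomic check-and-delete script. The class name `OwnedLocks` is hypothetical, and the sketch deliberately has no expiry, so it models only ownership, not the PX timeout.

```java
import java.util.concurrent.ConcurrentHashMap;

/** In-memory analogue of a lock whose value identifies the owner. */
final class OwnedLocks {
    private final ConcurrentHashMap<String, String> locks = new ConcurrentHashMap<>();

    /** Acquires the lock if nobody holds it (like SET ... NX). */
    boolean tryLock(String lockKey, String requestId) {
        return locks.putIfAbsent(lockKey, requestId) == null;
    }

    /** Releases the lock only if requestId is the current owner (atomic compare-and-delete). */
    boolean unlock(String lockKey, String requestId) {
        return locks.remove(lockKey, requestId);
    }
}
```

A request that does not own the lock gets `false` from `unlock` and leaves the owner's lock in place, which is the property the Lua script provides on the Redis side.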

7.4 Spin Lock

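Repeatedly try set with NX/PX until timeout, sleeping briefly between attempts. The lock must be released by the caller after its business logic finishes, and only if it was actually acquired; releasing it in a finally block around the acquisition loop would drop the lock the moment it is obtained.

// Try to acquire the lock, retrying every 50 ms until `timeout` ms have elapsed.
boolean tryLock(String lockKey, String requestId, long expireTime, long timeout) throws InterruptedException {
    long start = System.currentTimeMillis();
    while (true) {
        String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
        if ("OK".equals(result)) { return true; }
        if (System.currentTimeMillis() - start >= timeout) { return false; }
        Thread.sleep(50);
    }
}

// Caller: hold the lock for the duration of the business logic, then release it.
if (tryLock(lockKey, requestId, expireTime, timeout)) {
    try {
        // business logic, e.g. load the product from the DB and populate the cache
    } finally {
        unlock(lockKey, requestId);
    }
}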

7.5 Redisson

Redisson abstracts these details and solves lock re‑entrancy, renewal, and multi‑node issues.

8. MQ Asynchronous Processing

The flash-sale flow has three core steps: the flash-sale request, order creation, and payment. Only the request step faces the full concurrency; far fewer requests reach order creation and payment, so these steps can be split out of the main flow and handled asynchronously via MQ.

8.1 Message Loss

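Before sending a message to MQ, record it in a message-sending table with a pending status; after the consumer processes it successfully, update the status. A scheduled job can then resend the messages that are still pending, so a message lost in transit is eventually delivered.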

8.2 Duplicate Consumption

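Maintain a message-processing table; before processing a message, the consumer checks whether it has already been handled and skips it if so. Writing the order and the processing record in the same transaction keeps the two from diverging.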
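A minimal sketch of this idempotent consumer, assuming each message carries a unique id. An in-memory set stands in for the message-processing table, and the class name `IdempotentConsumer` and the `ordersCreated` counter are hypothetical; in a database-backed version a unique key on the message id would typically serve the same purpose.

```java
import java.util.HashSet;
import java.util.Set;

/** Consumer that handles each message id at most once. */
final class IdempotentConsumer {
    private final Set<String> processedIds = new HashSet<>(); // stand-in for the message-processing table
    int ordersCreated = 0;                                    // stand-in for rows in the order table

    /** Returns true if the message was handled now, false if it was a duplicate. */
    synchronized boolean handle(String messageId) {
        if (!processedIds.add(messageId)) {
            return false;       // already recorded: skip the duplicate delivery
        }
        ordersCreated++;        // stand-in for creating the order
        return true;
    }
}
```

Delivering the same message twice creates only one order; the second delivery is recognized and skipped.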

8.3 Garbage Messages

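Garbage messages are messages that can never be processed successfully and would otherwise be resent forever. Record the number of send attempts in the sending table and discard a message once it reaches a maximum count.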
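The sending table with a retry cap might be modeled as below. This is a sketch under assumptions, not the article's schema: a list stands in for the table, the `deliver` predicate is a hypothetical hook that returns true once the message has been acknowledged as processed, and the names `Outbox`, `Message`, and `Status` are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Message-sending table with a resend pass and a maximum attempt count. */
final class Outbox {
    enum Status { PENDING, DONE, DISCARDED }

    static final class Message {
        final String payload;
        Status status = Status.PENDING;
        int attempts = 0;

        Message(String payload) {
            this.payload = payload;
        }
    }

    private final List<Message> table = new ArrayList<>(); // stand-in for the message-sending table
    private final int maxAttempts;

    Outbox(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Records the message before any send is attempted. */
    Message save(String payload) {
        Message m = new Message(payload);
        table.add(m);
        return m;
    }

    /** One pass of the resend job over all pending messages. */
    void resendPending(Predicate<String> deliver) {
        for (Message m : table) {
            if (m.status != Status.PENDING) {
                continue;                          // finished or discarded: leave it alone
            }
            m.attempts++;
            if (deliver.test(m.payload)) {
                m.status = Status.DONE;            // acknowledged: stop resending
            } else if (m.attempts >= maxAttempts) {
                m.status = Status.DISCARDED;       // garbage message: give up after the cap
            }
        }
    }
}
```

A message that keeps failing is attempted `maxAttempts` times and then ignored by later passes, so it cannot occupy the resend job indefinitely.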

8.4 Delayed Consumption

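Use a delayed queue (e.g., RocketMQ delayed messages) to automatically cancel unpaid orders after a timeout: send a delayed message when the order is created, and when it is delivered after the timeout, cancel the order if it is still unpaid.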
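The timeout-cancellation idea can be tried out in a single process with `java.util.concurrent.DelayQueue` standing in for the broker's delayed queue. This is only an illustration: the class names `OrderTimeouts` and `CancelCheck` are hypothetical, and a manually advanced clock replaces wall-clock time so the behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Order-timeout sketch: an in-process DelayQueue stands in for the MQ delayed queue. */
final class OrderTimeouts {
    /** A "cancel if still unpaid" message that becomes visible at deadlineMillis. */
    static final class CancelCheck implements Delayed {
        final String orderId;
        final long deadlineMillis;
        private final AtomicLong clock;

        CancelCheck(String orderId, long deadlineMillis, AtomicLong clock) {
            this.orderId = orderId;
            this.deadlineMillis = deadlineMillis;
            this.clock = clock;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(deadlineMillis - clock.get(), TimeUnit.MILLISECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    final AtomicLong clock = new AtomicLong(0);               // manual clock in ms
    private final DelayQueue<CancelCheck> queue = new DelayQueue<>();
    private final Set<String> unpaid = new HashSet<>();

    /** Creates an unpaid order and schedules its cancellation check. */
    void placeOrder(String orderId, long payWindowMillis) {
        unpaid.add(orderId);
        queue.add(new CancelCheck(orderId, clock.get() + payWindowMillis, clock));
    }

    void pay(String orderId) {
        unpaid.remove(orderId);
    }

    /** Consumes every due message and returns the orders that were cancelled. */
    List<String> cancelOverdue() {
        List<String> cancelled = new ArrayList<>();
        CancelCheck due;
        while ((due = queue.poll()) != null) {      // poll() returns null until a delay has expired
            if (unpaid.remove(due.orderId)) {       // still unpaid: cancel; paid orders are skipped
                cancelled.add(due.orderId);
            }
        }
        return cancelled;
    }
}
```

The consumer checks the order's payment state when the delayed message arrives, so an order paid within the window is left untouched.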

9. Rate Limiting

To prevent bots from overwhelming the flash‑sale interface, apply rate‑limiting strategies:

Limit per user (e.g., 5 requests per minute).

Limit per IP.

Limit per API endpoint.

Introduce captchas (including sliding‑puzzle captchas).

Raise business thresholds (e.g., members‑only, higher user level).
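As an illustration of the per-user, per-IP, and per-endpoint limits listed above, here is a minimal fixed-window counter. It is a single-process sketch with hypothetical names (`FixedWindowLimiter`, `allow`); the key can be a user id, an IP address, or an endpoint name, and a multi-node deployment would need shared state (for example in Redis) instead of a local map.

```java
import java.util.HashMap;
import java.util.Map;

/** Fixed-window limiter: at most `limit` requests per key in each window. */
final class FixedWindowLimiter {
    private static final class Window {
        long start;
        int count;

        Window(long start) {
            this.start = start;
        }
    }

    private final int limit;
    private final long windowMillis;
    private final Map<String, Window> windows = new HashMap<>();

    FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request is allowed; nowMillis is supplied by the caller. */
    synchronized boolean allow(String key, long nowMillis) {
        Window w = windows.computeIfAbsent(key, k -> new Window(nowMillis));
        if (nowMillis - w.start >= windowMillis) {
            w.start = nowMillis;    // the old window has ended: start a new one
            w.count = 0;
        }
        if (w.count >= limit) {
            return false;           // over the limit for this window
        }
        w.count++;
        return true;
    }
}
```

For the "5 requests per minute" example, `new FixedWindowLimiter(5, 60_000)` allows a user's first five requests in a minute and rejects the rest until the window rolls over. A fixed window permits short bursts around window boundaries; a sliding window or token bucket smooths that out at the cost of more bookkeeping.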


