How to Build a High‑Performance Flash Sale System: 9 Essential Design Tips

This article explains how to design a flash‑sale (秒杀) system that can handle instant high concurrency by using page static‑generation, CDN acceleration, caching strategies, distributed locks, message‑queue async processing, stock pre‑deduction, rate limiting and other techniques to ensure reliability and prevent overselling.

Java Backend Technology

Preface

How to design a flash‑sale system under high concurrency? This is a frequent interview question that looks simple but actually tests knowledge across the front‑end and back‑end.

Flash sales usually appear in promotional activities where a limited quantity of items (e.g., 10 phones) is sold at a very low price (e.g., 0.1 CNY). Most of these activities are not profitable for merchants; they are merely a marketing gimmick.

Although flash sales are just promotional events, the technical requirements are high. Below are nine key details to consider when designing a flash‑sale system.

1. Instant High Concurrency

During the few minutes before the flash‑sale start time, user traffic spikes dramatically and reaches its peak at the exact moment the sale begins.

Because many users compete for a few items, most requests will fail – a classic "many wolves, few sheep" scenario.

After most users receive a "sold out" notice, they quickly leave the page, causing the traffic peak to be very short. This creates an instant high‑concurrency situation, illustrated by the following traffic curve:

Traditional systems struggle with such spikes; a new design is required, focusing on:

Page static‑generation

CDN acceleration

Caching

MQ asynchronous processing

Rate limiting

Distributed locks

2. Page Static‑Generation

The activity page is the entry point with the highest traffic. Directly serving it from the backend would overwhelm the server.

Most page content (product name, description, images) is static, so the page should be rendered to static HTML ahead of time. Browsing the page then never touches the backend; only a click on the flash‑sale button at the start time sends a request to it.

Because users are distributed across the country, we need a CDN to deliver the static page quickly.

3. Flash‑Sale Button

Before the sale starts, the button is greyed out and unclickable. It becomes active only at the exact start time.

Users often refresh the page repeatedly to catch the button as soon as it lights up.

Since the page is static, we control the button state with a JavaScript file that updates the button status at the sale moment.

Static resources (CSS, JS, images) are cached on the CDN for fast access.

When the sale begins, a new JS file is generated and synchronized to the CDN. The page requests it with a random query parameter so that neither the browser nor the CDN serves the stale cached copy.

A client‑side timer can also limit requests (e.g., only one request per 10 seconds).

4. Read‑Heavy, Write‑Light

During a flash sale, the system first checks inventory; if sufficient, it proceeds to write to the database. Most users will find the inventory insufficient, so the write path is rarely executed.

This is a classic "read‑many, write‑few" scenario.

Directly querying the database under massive load can cause it to crash; therefore, a cache such as Redis should be used, with multiple nodes for high availability.

5. Cache Issues

Product information (ID, name, specs, stock) should be stored in Redis, while the database holds the source of truth.

When a user clicks the flash‑sale button, the service validates the product ID and then checks the cache. If the cache misses, the database is queried and the result is cached.
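The lookup order described above (validate the ID, read the cache, fall back to the database on a miss, then fill the cache) can be sketched as follows. This is a minimal illustration, not the article's implementation: the `ConcurrentHashMap` stands in for Redis, the `loader` function stands in for the database query, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside lookup. The map is a stand-in for Redis (assumption);
// `loader` is a stand-in for the database query (assumption).
class ProductCache {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();
    private final Function<Long, Optional<String>> loader;

    ProductCache(Function<Long, Optional<String>> loader) {
        this.loader = loader;
    }

    Optional<String> find(long productId) {
        if (productId <= 0) {                    // validate the ID before touching any store
            return Optional.empty();
        }
        String cached = cache.get(productId);
        if (cached != null) {                    // cache hit: the database is not queried
            return Optional.of(cached);
        }
        Optional<String> loaded = loader.apply(productId);   // cache miss: query the database
        loaded.ifPresent(p -> cache.put(productId, p));      // fill the cache for later readers
        return loaded;
    }
}
```

Note that a product missing from the database is never cached here, so repeated lookups for it always reach the database. That gap is the cache penetration problem discussed next.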

5.1 Cache Penetration

If many requests query a product ID that exists in neither the cache nor the database, every one of them falls through to the database, which can be overwhelmed.

Using a distributed lock mitigates the impact, but a better solution is a Bloom filter to pre‑check existence.
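To make the Bloom filter idea concrete, here is a small sketch built on `java.util.BitSet`. The bit-array size, the number of hashes, and the double-hashing scheme are illustrative assumptions; a production system would more likely use an existing library or a Redis-side filter.

```java
import java.util.BitSet;

// Minimal Bloom filter over product IDs. Sizing and hash scheme are assumptions.
class ProductIdBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    ProductIdBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(long productId) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(productId, i));
        }
    }

    // false means "definitely absent"; true means "possibly present".
    boolean mightContain(long productId) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(productId, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: derive the i-th bit position from two base hashes.
    private int index(long productId, int i) {
        int h1 = Long.hashCode(productId);
        int h2 = Long.hashCode(productId * 0x9E3779B97F4A7C15L + 1);
        return Math.floorMod(h1 + i * h2, size);
    }
}
```

A request whose product ID fails `mightContain` can be rejected without touching the cache or the database. A `true` result can be a false positive, so the normal lookup still has to run.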

When the underlying data changes frequently, the Bloom filter must be kept in sync, which is difficult. In such cases, caching the negative result (i.e., "product does not exist") with a short TTL is advisable.
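Negative caching can be sketched in the same in-memory style. The TTL values, the class name, and the explicit `nowMillis` parameter (used so the behavior is easy to follow) are assumptions for illustration; with Redis, the equivalent would be storing a placeholder value with a short expiry.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Caches "not found" results with a short TTL. The map stands in for Redis (assumption).
class NegativeCachingLookup {
    private static final class Entry {
        final String value;       // null marks "product does not exist"
        final long expiresAt;
        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<Long, Entry> cache = new ConcurrentHashMap<>();
    private final Function<Long, Optional<String>> database;
    private final long hitTtlMillis;
    private final long missTtlMillis;

    NegativeCachingLookup(Function<Long, Optional<String>> database,
                          long hitTtlMillis, long missTtlMillis) {
        this.database = database;
        this.hitTtlMillis = hitTtlMillis;
        this.missTtlMillis = missTtlMillis;
    }

    Optional<String> find(long productId, long nowMillis) {
        Entry e = cache.get(productId);
        if (e != null && e.expiresAt > nowMillis) {
            return Optional.ofNullable(e.value);    // may be a cached "not found"
        }
        Optional<String> loaded = database.apply(productId);
        long ttl = loaded.isPresent() ? hitTtlMillis : missTtlMillis;
        cache.put(productId, new Entry(loaded.orElse(null), nowMillis + ttl));
        return loaded;
    }
}
```

Keeping the miss TTL short limits how long a newly created product would be reported as missing.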

6. Stock Management

In flash sales, stock must be pre‑deducted. If the order is not paid within a certain period, the reserved stock must be released.

6.1 Database Stock Deduction

update product set stock=stock-1 where id=123;

To avoid overselling, the stock check and the update must be atomic. Adding a stock>0 condition to the update, a lightweight form of optimistic locking, performs both in a single statement:

update product set stock=stock-1 where id=123 and stock>0;

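However, every request still reaches the database and contends for the lock on the same row. Under high load, connections can be exhausted and requests end up waiting on each other for the row lock.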
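The guarantee provided by the `stock>0` condition can be demonstrated without a database. The sketch below is an in-memory model, not JDBC code: an `AtomicInteger` plays the role of the `stock` column and a compare-and-set loop plays the role of the conditional update. All names are hypothetical. With real JDBC, the corresponding signal is `executeUpdate` returning 1 (row updated) or 0 (guard rejected the write).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory model of: update product set stock=stock-1 where id=? and stock>0
class StockRow {
    private final AtomicInteger stock;

    StockRow(int initial) {
        this.stock = new AtomicInteger(initial);
    }

    // true models "one row updated"; false models "the stock>0 guard rejected the write".
    boolean deductOne() {
        while (true) {
            int current = stock.get();
            if (current <= 0) {
                return false;
            }
            if (stock.compareAndSet(current, current - 1)) {
                return true;
            }
        }
    }

    int remaining() {
        return stock.get();
    }

    // Fires `buyers` concurrent deductions and returns how many succeeded.
    static int simulate(StockRow row, int buyers) {
        ExecutorService pool = Executors.newFixedThreadPool(16);
        AtomicInteger success = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(buyers);
        for (int i = 0; i < buyers; i++) {
            pool.execute(() -> {
                if (row.deductOne()) {
                    success.incrementAndGet();
                }
                done.countDown();
            });
        }
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return success.get();
    }
}
```

With 10 units of stock and 1,000 concurrent buyers, exactly 10 deductions succeed and the remaining stock never drops below zero.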

6.2 Redis Stock Deduction

boolean exist = redisClient.query(productId,userId);
if (exist) { return -1; }
int stock = redisClient.queryStock(productId);
if (stock <= 0) { return 0; }
redisClient.incrby(productId, -1);
redisClient.add(productId,userId);
return 1;

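The stock check and the decrement are separate commands, so the sequence is not atomic: concurrent requests can all pass the stock check and then all decrement, which drives the stock negative and oversells.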

6.3 Lua Script Stock Deduction

StringBuilder lua = new StringBuilder();
lua.append("if (redis.call('exists', KEYS[1]) == 1) then");
lua.append("    local stock = tonumber(redis.call('get', KEYS[1]));");
lua.append("    if (stock == -1) then");
lua.append("        return 1;");
lua.append("    end;");
lua.append("    if (stock > 0) then");
lua.append("        redis.call('incrby', KEYS[1], -1);");
lua.append("        return stock;");
lua.append("    end;");
lua.append("    return 0;");
lua.append("end;");
lua.append("return -1;");

The Lua script runs atomically in Redis, handling existence checks, unlimited stock (-1), normal deduction, and out‑of‑stock cases.
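The script's return values are easy to misread, so the following in-memory model walks through each branch. It is not a Redis client call: the `HashMap` stands in for Redis keys, and `synchronized` stands in for the fact that Redis does not interleave other commands with a running script. The class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of the Lua script's branches; `store` stands in for Redis (assumption).
class StockScriptModel {
    private final Map<String, Long> store = new HashMap<>();

    synchronized void setStock(String key, long stock) {
        store.put(key, stock);
    }

    synchronized long deduct(String key) {
        Long stock = store.get(key);
        if (stock == null) {
            return -1;                  // key missing: product is not in the cache
        }
        if (stock == -1) {
            return 1;                   // -1 marks unlimited stock: always succeed
        }
        if (stock > 0) {
            store.put(key, stock - 1);
            return stock;               // stock before the deduction, always > 0
        }
        return 0;                       // sold out
    }
}
```

A caller would treat a result greater than 0 as a successful purchase, 0 as sold out, and -1 as an unknown product.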

7. Distributed Locks

When many requests miss the cache and hit the database simultaneously, the database can crash. A Redis distributed lock prevents this.

7.1 setNx Lock

if (jedis.setnx(lockKey, val) == 1) {
    jedis.expire(lockKey, timeout);
}

setNx and expire are not atomic; a failure between them can leave a permanent lock.

7.2 set with NX PX

String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
if ("OK".equals(result)) { return true; }
return false;

This command is atomic.

7.3 Unlock

if (requestId.equals(jedis.get(lockKey))) {
    jedis.del(lockKey);
    return true;
}
return false;

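The get and del calls above are two separate commands, so the lock can expire and be acquired by another client between them, and that client's lock would then be deleted by mistake. Running the check‑and‑delete as a Lua script makes it atomic.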

if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
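The owner check in the unlock script is what stops one client from releasing a lock held by another. The model below shows that rule with a `ConcurrentHashMap` standing in for Redis: `putIfAbsent` plays the role of SET with NX, and the two-argument `remove` plays the role of the atomic compare-and-delete. Expiry is left out to keep the sketch short, and the class name is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory model of an owner-checked lock; the map stands in for Redis (assumption).
class OwnerCheckedLock {
    private final ConcurrentMap<String, String> locks = new ConcurrentHashMap<>();

    // Models SET lockKey requestId NX (expiry omitted in this sketch).
    boolean tryLock(String lockKey, String requestId) {
        return locks.putIfAbsent(lockKey, requestId) == null;
    }

    // Models the unlock script: delete only if the stored value equals requestId,
    // with the comparison and the delete performed as one atomic step.
    boolean unlock(String lockKey, String requestId) {
        return locks.remove(lockKey, requestId);
    }
}
```

If client A holds the lock, an unlock attempt carrying client B's request ID returns false and leaves A's lock in place.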

7.4 Spin Lock

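// Retry until the lock is acquired or `timeout` milliseconds have elapsed.
// The caller releases the lock with unlock(lockKey, requestId) in its own
// finally block after the protected work is done, not here.
long start = System.currentTimeMillis();
while (true) {
    String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
    if ("OK".equals(result)) { return true; }
    long time = System.currentTimeMillis() - start;
    if (time >= timeout) { return false; }
    try {
        Thread.sleep(50);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
    }
}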

7.5 Redisson

Redisson addresses lock competition, renewal, re‑entrancy, and multi‑node scenarios. (Detailed usage omitted for brevity.)

8. MQ Asynchronous Processing

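A flash sale has three core steps: the flash‑sale request, order creation, and payment. Only the flash‑sale request sees very high concurrency; order creation and payment involve just the few users who won. Order creation can therefore be decoupled from the request path and processed asynchronously via a message queue.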

8.1 Message Loss

If sending a message fails (network, broker crash, disk error), the order may be lost. A "message send table" records each message with a status of "pending" before sending. After successful consumption, the status is updated to "processed".

If sending fails after the record is inserted, a retry job periodically re‑sends pending messages.

8.2 Duplicate Consumption

To avoid processing the same message twice, a "message processing table" records processed message IDs. Consumers check this table before handling a message; the order creation and table insert must be in the same transaction.
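A sketch of that idempotency check, using in-memory collections in place of the two tables. The `synchronized` method stands in for the database transaction that wraps both writes; in a real system a unique key on the message ID would enforce the same rule. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Idempotent consumer. The set models the message processing table and the
// list models the order table (both assumptions for illustration).
class IdempotentOrderConsumer {
    private final Set<String> processedMessageIds = new HashSet<>();
    private final List<String> orders = new ArrayList<>();

    // synchronized stands in for the transaction around both writes.
    synchronized boolean handle(String messageId, String orderPayload) {
        if (!processedMessageIds.add(messageId)) {
            return false;               // already processed: skip the duplicate
        }
        orders.add(orderPayload);       // create the order exactly once
        return true;
    }

    synchronized int orderCount() {
        return orders.size();
    }
}
```

Delivering the same message twice results in one order; the second call returns false and changes nothing.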

8.3 Garbage Messages

If a message repeatedly fails, the retry job may generate many useless messages. Limit the number of resend attempts in the send table; once the limit is reached, stop retrying.
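The send table from 8.1 and the retry cap described here fit together roughly as follows. This is a single-threaded, in-memory sketch: the map models the send table, `sender` models the call that publishes to the broker, and the `FAILED` status for messages that hit the cap is an assumption added for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// In-memory model of a message send table with a retry cap (names are hypothetical).
class MessageSendTable {
    enum Status { PENDING, PROCESSED, FAILED }

    private static final class Row {
        final String payload;
        Status status = Status.PENDING;
        int attempts = 0;
        Row(String payload) {
            this.payload = payload;
        }
    }

    private final Map<String, Row> rows = new LinkedHashMap<>();
    private final int maxAttempts;

    MessageSendTable(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Record the message as pending before it is sent.
    void record(String messageId, String payload) {
        rows.put(messageId, new Row(payload));
    }

    // Called once the consumer confirms successful processing.
    void markProcessed(String messageId) {
        rows.get(messageId).status = Status.PROCESSED;
    }

    Status statusOf(String messageId) {
        return rows.get(messageId).status;
    }

    // One pass of the retry job: resend pending messages, give up at the cap.
    void retryPending(Consumer<String> sender) {
        for (Row row : rows.values()) {
            if (row.status != Status.PENDING) {
                continue;
            }
            if (row.attempts >= maxAttempts) {
                row.status = Status.FAILED;     // stop retrying; leave for manual review
                continue;
            }
            row.attempts++;
            sender.accept(row.payload);
        }
    }
}
```

With a cap of 3, a message that is never confirmed is resent three times and then marked as failed, so later passes of the job ignore it.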

8.4 Delayed Consumption

Orders not paid within 15 minutes should be cancelled and stock restored. Instead of a periodic job, use a delayed queue (e.g., RocketMQ's delay feature). When the delay expires, the consumer checks the order status and cancels if still pending.
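The consumer-side logic for the delayed message might look like the sketch below. It does not use a broker: a `PriorityQueue` ordered by due time models the delayed queue, and `deliverDue` models the broker handing over messages whose delay has expired. Time is passed in explicitly, and every name here is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// In-memory model of delayed order cancellation (not a RocketMQ client).
class OrderTimeoutChecker {
    enum OrderStatus { PENDING, PAID, CANCELLED }

    private static final class DelayedCheck {
        final String orderId;
        final long dueAtMillis;
        DelayedCheck(String orderId, long dueAtMillis) {
            this.orderId = orderId;
            this.dueAtMillis = dueAtMillis;
        }
    }

    private final Map<String, OrderStatus> orders = new HashMap<>();
    private final PriorityQueue<DelayedCheck> queue =
            new PriorityQueue<>((a, b) -> Long.compare(a.dueAtMillis, b.dueAtMillis));
    private int stock;

    OrderTimeoutChecker(int stock) {
        this.stock = stock;
    }

    // Pre-deduct stock, create the order, and schedule the delayed check.
    boolean placeOrder(String orderId, long nowMillis, long delayMillis) {
        if (stock <= 0) {
            return false;
        }
        stock--;
        orders.put(orderId, OrderStatus.PENDING);
        queue.add(new DelayedCheck(orderId, nowMillis + delayMillis));
        return true;
    }

    void pay(String orderId) {
        if (orders.get(orderId) == OrderStatus.PENDING) {
            orders.put(orderId, OrderStatus.PAID);
        }
    }

    // Handle every check that is due: cancel still-pending orders and restore stock.
    List<String> deliverDue(long nowMillis) {
        List<String> cancelled = new ArrayList<>();
        while (!queue.isEmpty() && queue.peek().dueAtMillis <= nowMillis) {
            String orderId = queue.poll().orderId;
            if (orders.get(orderId) == OrderStatus.PENDING) {
                orders.put(orderId, OrderStatus.CANCELLED);
                stock++;
                cancelled.add(orderId);
            }
        }
        return cancelled;
    }

    int stock() {
        return stock;
    }
}
```

Checking the order status at delivery time is what makes the approach safe: an order paid before the delay expires is left untouched.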

9. Rate Limiting

To prevent bots from overwhelming the flash‑sale API, several rate‑limiting strategies are used:

Per‑user limit (e.g., 5 requests per minute)

Per‑IP limit (e.g., 5 requests per minute)

Per‑endpoint limit (global request cap)

CAPTCHA verification (including sliding‑puzzle CAPTCHAs)

Business‑level restrictions (e.g., only members or high‑level users can participate)

Each method balances fairness, user experience, and system stability.
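The per‑user and per‑IP limits above share one mechanism and differ only in the key. Below is a minimal fixed‑window limiter as an illustration; the class name and the explicit `nowMillis` parameter are assumptions, and a multi‑node deployment would keep the counters in a shared store such as Redis rather than in local memory.

```java
import java.util.HashMap;
import java.util.Map;

// Fixed-window rate limiter keyed by user ID or IP (single-process sketch).
class FixedWindowLimiter {
    private static final class Window {
        long start;
        int count;
    }

    private final Map<String, Window> windows = new HashMap<>();
    private final int limit;
    private final long windowMillis;

    FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean allow(String key, long nowMillis) {
        Window w = windows.computeIfAbsent(key, k -> {
            Window fresh = new Window();
            fresh.start = nowMillis;
            return fresh;
        });
        if (nowMillis - w.start >= windowMillis) {   // window elapsed: start a new one
            w.start = nowMillis;
            w.count = 0;
        }
        if (w.count >= limit) {
            return false;                            // over the limit in this window
        }
        w.count++;
        return true;
    }
}
```

For the "5 requests per minute" example, `new FixedWindowLimiter(5, 60_000)` allows a key's first five calls within a minute and rejects further calls until the window rolls over.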
