How to Build a High‑Performance Flash Sale System: 9 Essential Design Tips
This article explains how to design a flash‑sale (秒杀) system that can handle instant high concurrency by using page static‑generation, CDN acceleration, caching strategies, distributed locks, message‑queue async processing, stock pre‑deduction, rate limiting and other techniques to ensure reliability and prevent overselling.
Preface
How to design a flash‑sale system under high concurrency? This is a frequent interview question that looks simple but actually tests knowledge across the front‑end and back‑end.
Flash sales usually appear in promotional activities where a limited quantity of items (e.g., 10 phones) is sold at a very low price (e.g., 0.1 CNY). Most of these activities are not profitable for merchants; they are merely a marketing gimmick.
Although flash sales are just promotional events, the technical requirements are high. Below are nine key details to consider when designing a flash‑sale system.
1. Instant High Concurrency
During the few minutes before the flash‑sale start time, user traffic spikes dramatically and reaches its peak at the exact moment the sale begins.
Because many users compete for a few items, most requests will fail – a classic "many wolves, few sheep" scenario.
After most users receive a "sold out" notice, they quickly leave the page, so the traffic peak is very short. The resulting traffic curve climbs steeply just before the start time, peaks at the start, and falls off almost immediately: an instant high‑concurrency situation.
Traditional systems struggle with such spikes; a new design is required, focusing on:
Page static‑generation
CDN acceleration
Caching
MQ asynchronous processing
Rate limiting
Distributed locks
2. Page Static‑Generation
The activity page is the entry point with the highest traffic. Directly serving it from the backend would overwhelm the server.
Most page content (product name, description, images) is static, so the page should be pre‑rendered as static HTML. A request reaches the backend only when the user clicks the flash‑sale button once the sale has started.
Because users are distributed across the country, we need a CDN to deliver the static page quickly.
3. Flash‑Sale Button
Before the sale starts, the button is greyed out and unclickable. It becomes active only at the exact start time.
Users often refresh the page repeatedly to catch the button as soon as it lights up.
Since the page is static, we control the button state with a JavaScript file that updates the button status at the sale moment.
Static resources (CSS, JS, images) are cached on the CDN for fast access.
When the sale begins, a new JS file is generated and synchronized to the CDN. It is referenced with a fresh random URL parameter so that browsers and the CDN do not serve a stale cached copy.
A client‑side timer can also limit requests (e.g., only one request per 10 seconds).
4. Read‑Heavy, Write‑Light
During a flash sale, the system first checks inventory; if sufficient, it proceeds to write to the database. Most users will find the inventory insufficient, so the write path is rarely executed.
This is a classic "read‑many, write‑few" scenario.
Directly querying the database under massive load can cause it to crash; therefore, a cache such as Redis should be used, with multiple nodes for high availability.
5. Cache Issues
Product information (ID, name, specs, stock) should be stored in Redis, while the database holds the source of truth.
When a user clicks the flash‑sale button, the service validates the product ID and then checks the cache. If the cache misses, the database is queried and the result is cached.
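The read path above can be sketched as follows. This is a minimal illustration, not the article's implementation: an in-memory map stands in for Redis, a caller-supplied function stands in for the database, and the names `ProductCacheReader`, `findProduct`, and `databaseReads` are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Cache-aside read path. The map stands in for Redis (assumption). */
final class ProductCacheReader {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the database lookup (assumption).
    private final Function<Long, Optional<String>> database;
    private final AtomicInteger dbReadCount = new AtomicInteger();

    ProductCacheReader(Function<Long, Optional<String>> database) {
        this.database = database;
    }

    Optional<String> findProduct(long productId) {
        // 1. Validate the product ID before touching any store.
        if (productId <= 0) {
            return Optional.empty();
        }
        // 2. Serve from the cache when possible.
        String cached = cache.get(productId);
        if (cached != null) {
            return Optional.of(cached);
        }
        // 3. Cache miss: query the database and cache a positive result.
        dbReadCount.incrementAndGet();
        Optional<String> loaded = database.apply(productId);
        loaded.ifPresent(product -> cache.put(productId, product));
        return loaded;
    }

    /** Number of lookups that fell through to the database. */
    int databaseReads() {
        return dbReadCount.get();
    }
}
```

In this sketch a lookup for an ID that exists nowhere reaches the database on every call, because only positive results are cached.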
5.1 Cache Penetration
If many requests query a product ID that exists in neither the cache nor the database, every one of them falls through to the database, which can be overwhelmed.
Using a distributed lock mitigates the impact, but a better solution is a Bloom filter to pre‑check existence.
When the underlying data changes frequently, the Bloom filter must be kept in sync, which is difficult. In such cases, caching the negative result (i.e., "product does not exist") with a short TTL is advisable.
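A minimal Bloom filter sketch shows the pre‑check idea. It assumes product IDs are passed as strings and uses a simple double‑hashing scheme chosen for illustration; `SimpleBloomFilter` and its parameters are hypothetical, and a production system would more likely use an existing library or a Redis‑side implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Tiny Bloom filter: no false negatives, occasional false positives. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Registers a key, e.g. every product ID that exists in the database. */
    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    /** false means "definitely absent"; true means "possibly present". */
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = fnv1a(key);
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a over the UTF-8 bytes of the key.
    private static int fnv1a(String key) {
        int hash = 0x811C9DC5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 16777619;
        }
        return hash;
    }
}
```

A request whose ID fails `mightContain` can be rejected without touching the cache or the database. A `true` result still needs the normal lookup, since false positives are possible.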
6. Stock Management
In flash sales, stock must be pre‑deducted. If the order is not paid within a certain period, the reserved stock must be released.
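The reserve‑then‑release idea can be sketched in memory as below. This is only an illustration under stated assumptions: an `AtomicInteger` stands in for the real stock store, and `StockReservation`, `reserve`, and `release` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for pre-deducted stock (assumption: one process, no persistence). */
final class StockReservation {
    private final AtomicInteger available;

    StockReservation(int initialStock) {
        this.available = new AtomicInteger(initialStock);
    }

    /** Pre-deducts one unit; returns false when nothing is left. */
    boolean reserve() {
        while (true) {
            int current = available.get();
            if (current <= 0) {
                return false; // sold out, never go negative
            }
            // Check and decrement in one atomic step; retry if another thread won.
            if (available.compareAndSet(current, current - 1)) {
                return true;
            }
        }
    }

    /** Returns one reserved unit, e.g. when the payment window expires. */
    void release() {
        available.incrementAndGet();
    }

    int available() {
        return available.get();
    }
}
```

The compare‑and‑set loop is the in‑memory counterpart of making the stock check and the deduction a single atomic operation.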
6.1 Database Stock Deduction
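The simplest deduction is a single update statement:
update product set stock=stock-1 where id=123;
This statement never checks the remaining stock, so it can drive stock below zero. To avoid overselling, the stock check and the update must be atomic. Putting the check into the where clause, an optimistic‑lock‑style conditional update, achieves this:
update product set stock=stock-1 where id=123 and stock>0;
However, hitting the database on every request can exhaust connections under high load, and requests competing for the same row lock can deadlock.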
6.2 Redis Stock Deduction
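// Reject a user who has already bought this product.
boolean exist = redisClient.query(productId, userId);
if (exist) { return -1; }
// Check the remaining stock.
int stock = redisClient.queryStock(productId);
if (stock <= 0) { return 0; }
// Deduct one unit and record the purchase.
redisClient.incrby(productId, -1);
redisClient.add(productId, userId);
return 1;
This approach suffers from race conditions: the stock query and the decrement are separate commands, so concurrent requests can all pass the check before any of them decrements, which oversells and may produce negative stock.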
6.3 Lua Script Stock Deduction
StringBuilder lua = new StringBuilder();
// KEYS[1] is the stock key of the product.
lua.append("if (redis.call('exists', KEYS[1]) == 1) then");
lua.append(" local stock = tonumber(redis.call('get', KEYS[1]));");
// A stored value of -1 means unlimited stock.
lua.append(" if (stock == -1) then");
lua.append(" return 1;");
lua.append(" end;");
// Normal case: deduct one unit and return the stock before deduction.
lua.append(" if (stock > 0) then");
lua.append(" redis.call('incrby', KEYS[1], -1);");
lua.append(" return stock;");
lua.append(" end;");
lua.append(" return 0;");
lua.append("end;");
lua.append("return -1;");
Redis executes the Lua script atomically. The script returns -1 when the stock key does not exist, 1 when stock is unlimited (stored as -1), the stock value before deduction when the deduction succeeds, and 0 when the product is sold out.
7. Distributed Locks
When many requests miss the cache and hit the database simultaneously, the database can crash. A Redis distributed lock prevents this.
7.1 setNx Lock
if (jedis.setnx(lockKey, val) == 1) {
    jedis.expire(lockKey, timeout);
}
setNx and expire are not atomic; a failure between them can leave a permanent lock.
7.2 set with NX PX
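String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
if ("OK".equals(result)) { return true; }
return false;
This single command sets the key only if it does not exist (NX) and attaches the expiry in milliseconds (PX) in one atomic step.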
7.3 Unlock
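// Null-safe comparison: get returns null once the lock has expired.
if (requestId.equals(jedis.get(lockKey))) {
    jedis.del(lockKey);
    return true;
}
return false;
Only the holder may release the lock, so the stored value is compared with requestId before deleting. Here get and del are separate commands, however: the lock can expire and be acquired by another client between them. Using a Lua script makes the check‑and‑delete atomic.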
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
7.4 Spin Lock
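When the lock is contended, a request can retry for a bounded time (spin) instead of failing on the first attempt:
long start = System.currentTimeMillis();
while (true) {
    String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
    if ("OK".equals(result)) {
        try {
            // ... run the business logic that needs the lock ...
            return true;
        } finally {
            // Release only after the work is done, and only a lock we hold.
            unlock(lockKey, requestId);
        }
    }
    // Give up once the total wait exceeds the timeout.
    long time = System.currentTimeMillis() - start;
    if (time >= timeout) { return false; }
    // Back off briefly before retrying; the enclosing method must handle InterruptedException.
    Thread.sleep(50);
}
7.5 Redisson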
Redisson addresses lock competition, renewal, re‑entrancy, and multi‑node scenarios. (Detailed usage omitted for brevity.)
8. MQ Asynchronous Processing
In a flash sale, the three core steps are: flash‑sale request, order creation, and payment. Only the flash‑sale request itself sees high concurrency. Order creation has low concurrency, so it should be split out of the main flow and processed asynchronously via a message queue.
8.1 Message Loss
If sending a message fails (network, broker crash, disk error), the order may be lost. A "message send table" records each message with a status of "pending" before sending. After successful consumption, the status is updated to "processed".
If sending fails after the record is inserted, a retry job periodically re‑sends pending messages.
8.2 Duplicate Consumption
To avoid processing the same message twice, a "message processing table" records processed message IDs. Consumers check this table before handling a message; the order creation and table insert must be in the same transaction.
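The duplicate check can be sketched as below. This is an in‑memory illustration only: a set stands in for the message processing table, and `synchronized` stands in for the database transaction that keeps the table insert and the order creation together. `IdempotentOrderConsumer` and `handle` are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

/** Consumer that creates at most one order per message ID. */
final class IdempotentOrderConsumer {
    // Stand-in for the message processing table (assumption).
    private final Set<String> processedMessageIds = new HashSet<>();
    private int ordersCreated = 0;

    /**
     * Handles one delivery. Returns true if an order was created and
     * false if this message ID was already processed.
     * synchronized plays the role of the single database transaction.
     */
    synchronized boolean handle(String messageId) {
        // add returns false when the ID is already recorded.
        if (!processedMessageIds.add(messageId)) {
            return false;
        }
        ordersCreated++; // placeholder for the real order creation
        return true;
    }

    synchronized int ordersCreated() {
        return ordersCreated;
    }
}
```

With a real database, a unique constraint on the message ID column would typically back up this check.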
8.3 Garbage Messages
If a message repeatedly fails, the retry job may generate many useless messages. Limit the number of resend attempts in the send table; once the limit is reached, stop retrying.
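The send table with a capped retry job might look like the following sketch. It keeps rows in memory instead of a database table, and the names `MessageSendTable`, `record`, `markProcessed`, and `resendPending` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory stand-in for the message send table (assumption). */
final class MessageSendTable {
    enum Status { PENDING, PROCESSED }

    private static final class Row {
        final String payload;
        Status status = Status.PENDING;
        int resendAttempts = 0;

        Row(String payload) {
            this.payload = payload;
        }
    }

    private final Map<String, Row> rows = new LinkedHashMap<>();
    private final int maxResendAttempts;

    MessageSendTable(int maxResendAttempts) {
        this.maxResendAttempts = maxResendAttempts;
    }

    /** Called before the first send: the row starts as PENDING. */
    synchronized void record(String messageId, String payload) {
        rows.put(messageId, new Row(payload));
    }

    /** Called once the consumer reports success. */
    synchronized void markProcessed(String messageId) {
        Row row = rows.get(messageId);
        if (row != null) {
            row.status = Status.PROCESSED;
        }
    }

    /**
     * One pass of the retry job: re-sends every PENDING row that has not
     * reached the resend limit, and returns how many messages were re-sent.
     */
    synchronized int resendPending(Consumer<String> sender) {
        int resent = 0;
        for (Row row : rows.values()) {
            if (row.status != Status.PENDING || row.resendAttempts >= maxResendAttempts) {
                continue; // already processed, or limit reached: stop retrying
            }
            row.resendAttempts++;
            sender.accept(row.payload);
            resent++;
        }
        return resent;
    }
}
```

Rows that hit the limit stay PENDING in this sketch, so they remain visible for manual inspection.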
8.4 Delayed Consumption
Orders not paid within 15 minutes should be cancelled and stock restored. Instead of a periodic job, use a delayed queue (e.g., RocketMQ's delay feature). When the delay expires, the consumer checks the order status and cancels if still pending.
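The check‑on‑expiry logic can be sketched with the JDK's `java.util.concurrent.DelayQueue`. This is an in‑process stand‑in for a broker's delayed messages, not RocketMQ itself, and `PaymentTimeout`, `OrderTimeoutHandler`, and the status strings are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** A delayed "check this order" message. */
final class PaymentTimeout implements Delayed {
    final String orderId;
    private final long deadlineNanos;

    PaymentTimeout(String orderId, long delayMillis) {
        this.orderId = orderId;
        this.deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(deadlineNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}

final class OrderTimeoutHandler {
    private final DelayQueue<PaymentTimeout> queue = new DelayQueue<>();
    private final Map<String, String> orderStatus = new ConcurrentHashMap<>();
    private final AtomicInteger stock; // stock left after pre-deduction

    OrderTimeoutHandler(int remainingStock) {
        this.stock = new AtomicInteger(remainingStock);
    }

    /** Registers a pending order and schedules its payment check. */
    void placeOrder(String orderId, long payWithinMillis) {
        orderStatus.put(orderId, "PENDING");
        queue.add(new PaymentTimeout(orderId, payWithinMillis));
    }

    void pay(String orderId) {
        orderStatus.replace(orderId, "PENDING", "PAID");
    }

    /** Blocks until the next delay expires, then cancels the order if it is still unpaid. */
    String handleNextTimeout() {
        try {
            PaymentTimeout timeout = queue.take();
            // Only a still-pending order is cancelled and has its stock restored.
            if (orderStatus.replace(timeout.orderId, "PENDING", "CANCELLED")) {
                stock.incrementAndGet();
            }
            return timeout.orderId;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    String status(String orderId) {
        return orderStatus.get(orderId);
    }

    int stock() {
        return stock.get();
    }
}
```

In a real deployment the payment window would be 15 minutes, and the delayed message would live in the broker so that it survives process restarts.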
9. Rate Limiting
To prevent bots from overwhelming the flash‑sale API, several rate‑limiting strategies are used:
Per‑user limit (e.g., 5 requests per minute)
Per‑IP limit (e.g., 5 requests per minute)
Per‑endpoint limit (global request cap)
CAPTCHA verification (including sliding‑puzzle CAPTCHAs)
Business‑level restrictions (e.g., only members or high‑level users can participate)
Each method balances fairness, user experience, and system stability.
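A per‑user or per‑IP limit such as "5 requests per minute" can be sketched with a fixed‑window counter. This is an in‑memory, single‑process illustration; a clustered deployment would usually keep the counters in a shared store such as Redis. `FixedWindowRateLimiter` and `tryAcquire` are hypothetical names, and the current time is passed in explicitly to keep the sketch easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Fixed-window counter keyed by user ID or IP address. */
final class FixedWindowRateLimiter {
    private static final class Window {
        long startMillis;
        int count;

        Window(long startMillis) {
            this.startMillis = startMillis;
        }
    }

    private final int limit;
    private final long windowMillis;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    FixedWindowRateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request is allowed, false if the key is over its limit. */
    boolean tryAcquire(String key, long nowMillis) {
        Window window = windows.computeIfAbsent(key, k -> new Window(nowMillis));
        synchronized (window) {
            // Start a fresh window once the current one has elapsed.
            if (nowMillis - window.startMillis >= windowMillis) {
                window.startMillis = nowMillis;
                window.count = 0;
            }
            if (window.count >= limit) {
                return false;
            }
            window.count++;
            return true;
        }
    }
}
```

For example, `new FixedWindowRateLimiter(5, 60_000L)` allows five calls per key per minute. A fixed window can admit a short burst at the boundary between two windows, which a sliding window or token bucket would smooth out.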
Java Backend Technology
Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!
