How to Build a High‑Performance Flash Sale System: 9 Essential Design Tips
A flash-sale (秒杀) system serving massive concurrent traffic has to handle instant traffic spikes, page staticization, CDN acceleration, caching, distributed locks, rate limiting, asynchronous processing, and reliable stock management. This article walks through nine techniques that keep the system stable and prevent overselling.
Preface
How to design a flash‑sale (秒杀) system under high concurrency? This common interview question looks simple but actually tests knowledge from front‑end to back‑end in high‑traffic scenarios.
Flash sales appear in e‑commerce promotional activities where a limited quantity of goods (e.g., 10 smartphones) is offered at an extremely low price (e.g., 0.1 CNY). Only a few users can purchase successfully, making the technical requirements demanding.
Below are nine key details to consider when designing a flash‑sale system.
1. Instant High Concurrency
In the minutes before the scheduled flash‑sale time (e.g., 12:00), user traffic surges dramatically and peaks at the exact moment.
Because the scenario is "many users, few items" (狼多肉少), most users will fail, and the system experiences a very short burst of peak traffic.
Traditional systems struggle with this pattern, so the design has to be built around the spike. The main techniques are:
Page staticization
CDN acceleration
Caching
MQ asynchronous processing
Rate limiting
Distributed lock
2. Page Staticization
The activity page is the first entry point and receives the highest traffic. Directly serving every request from the back‑end would overwhelm the server.
Most page content (product name, description, images) is static, so we should generate a static version of the page and only invoke the back‑end when the user clicks the flash‑sale button at the exact time.
Because users are geographically distributed, we need a CDN to deliver the static page quickly.
3. Flash‑Sale Button
Before the sale starts, the button is disabled (grey). It becomes clickable only at the exact sale moment, which forces users to wait for the activation.
We control the button state with a JavaScript file that is cached on the CDN for performance.
4. Read‑Heavy Write‑Light
During the sale, the system first checks inventory; if sufficient, it proceeds to create an order and write to the database. Otherwise, it returns "sold out".
This is a classic read‑heavy/write‑light scenario, best handled with a cache such as Redis rather than direct database queries.
5. Cache Issues
Product information (id, name, specs, stock) should be stored in Redis while the database holds the source of truth.
When a request arrives, the flow is: query Redis → if miss, query DB → populate Redis → proceed; if not found in DB, return failure.
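The read flow above can be sketched as follows. This is a minimal illustration, not the article's implementation: two in-memory maps stand in for Redis and the database, and the class and method names (`ProductCacheReader`, `find`, `saveToDatabase`) are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Cache-aside read path. Both maps are illustrative stand-ins,
// not real Redis or database clients.
final class ProductCacheReader {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();    // stand-in for Redis
    private final Map<Long, String> database = new ConcurrentHashMap<>(); // stand-in for the DB
    private final AtomicInteger dbReads = new AtomicInteger();            // counts DB queries

    void saveToDatabase(long productId, String product) {
        database.put(productId, product);
    }

    // Query cache -> on miss query DB -> populate cache -> return.
    // An empty Optional means the product is in neither store.
    Optional<String> find(long productId) {
        String cached = cache.get(productId);
        if (cached != null) {
            return Optional.of(cached);
        }
        dbReads.incrementAndGet();
        String fromDb = database.get(productId);
        if (fromDb == null) {
            return Optional.empty(); // not in the DB either: report failure
        }
        cache.put(productId, fromDb);
        return Optional.of(fromDb);
    }

    int dbReads() {
        return dbReads.get();
    }
}
```

The second lookup of the same product is served from the cache, while a lookup of an unknown id reaches the database every time. The subsections that follow deal with exactly that weakness.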
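5.1 Cache Breakdown
If a hot product's cache entry is missing or has just expired, many concurrent requests miss the cache at once and all hit the database, potentially crashing it.
The solution is to guard the database load with a distributed lock (e.g., a Redis lock) so that only one request rebuilds the entry, and to pre-warm the cache with all product data before the sale starts.
5.2 Cache Penetration
When requests carry a product id that exists in neither the cache nor the database, every one of them falls through to the DB. A lock limits the concurrency, and a Bloom filter can reject ids that were never stored before they reach the DB.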
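To make the Bloom filter idea concrete, here is a small self-contained sketch. It is an assumption-laden toy: the sizing, the double-hashing scheme, and the name `SimpleBloomFilter` are all illustrative, and a production system would more likely use an existing library or a Redis-side filter.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Toy Bloom filter: answers "definitely absent" or "possibly present".
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Register a product id that really exists.
    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    // false: the id was never added, so the DB query can be skipped.
    // true: the id may exist (false positives are possible).
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: derive the i-th bit index from two base hashes.
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = fnv1a(key);
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a hash over the UTF-8 bytes of the key.
    private static int fnv1a(String key) {
        int hash = 0x811c9dc5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 16777619;
        }
        return hash;
    }
}
```

A filter never reports an added id as absent, so a `false` answer is safe to act on. Because ids cannot be removed from this structure, it suits data that changes rarely.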
5.3 Storing Negative Results
Cache the fact that a product does not exist with a short TTL to avoid repeated DB hits.
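A rough sketch of negative caching with a short TTL might look like this. The map stands in for Redis keys with an expiry, and `NegativeCache`, `markMissing`, and the injectable clock are hypothetical names chosen so the expiry can be exercised without sleeping.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Remembers "this product does not exist" for a limited time.
final class NegativeCache {
    private final Map<Long, Long> missingUntil = new ConcurrentHashMap<>(); // productId -> expiry (ms)
    private final long ttlMillis;
    private final LongSupplier clock; // current time in ms, injectable for tests

    NegativeCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Call after the DB confirmed that the product does not exist.
    void markMissing(long productId) {
        missingUntil.put(productId, clock.getAsLong() + ttlMillis);
    }

    // true while the negative entry is still fresh; expired entries are dropped.
    boolean isKnownMissing(long productId) {
        Long expiry = missingUntil.get(productId);
        if (expiry == null) {
            return false;
        }
        if (clock.getAsLong() >= expiry) {
            missingUntil.remove(productId, expiry);
            return false;
        }
        return true;
    }
}
```

The TTL is kept short so that a product created later becomes visible soon after, instead of being masked by a stale "not found" entry.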
6. Stock Management
Beyond simple decrement, we need a pre‑deduction (预扣库存) mechanism that can roll back stock if payment is not completed within a timeout.
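One way to picture pre-deduction with rollback is the in-memory sketch below. It is only an illustration under stated assumptions: a single JVM, one unit per order, and hypothetical names (`StockReservations`, `reserve`, `confirmPayment`, `releaseExpired`). A real system would keep this state in Redis or the database.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Pre-deducts stock when an order is placed and restores it
// if the order is not paid before its hold expires.
final class StockReservations {
    private int available;
    private final long holdMillis;
    private final Map<String, Long> holds = new HashMap<>(); // orderId -> hold expiry (ms)

    StockReservations(int initialStock, long holdMillis) {
        this.available = initialStock;
        this.holdMillis = holdMillis;
    }

    // Pre-deduct one unit for the order; false when sold out or already held.
    synchronized boolean reserve(String orderId, long nowMillis) {
        if (available <= 0 || holds.containsKey(orderId)) {
            return false;
        }
        available--;
        holds.put(orderId, nowMillis + holdMillis);
        return true;
    }

    // Payment arrived in time: the deduction becomes permanent.
    synchronized boolean confirmPayment(String orderId, long nowMillis) {
        Long expiry = holds.get(orderId);
        if (expiry == null || nowMillis >= expiry) {
            return false;
        }
        holds.remove(orderId);
        return true;
    }

    // Roll back every hold whose payment window has passed.
    // Returns the number of units put back on sale.
    synchronized int releaseExpired(long nowMillis) {
        int released = 0;
        Iterator<Map.Entry<String, Long>> it = holds.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis >= it.next().getValue()) {
                it.remove();
                available++;
                released++;
            }
        }
        return released;
    }

    synchronized int available() {
        return available;
    }
}
```

Here `releaseExpired` would be driven by a timer or by a delayed message. The check and the decrement happen under one lock, which mirrors the atomicity requirement discussed in the following subsections.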
6.1 Database Stock Decrement
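The simplest SQL is:
update product set stock=stock-1 where id=123;
Run after a separate stock query, this can oversell, because the check and the update are not atomic. Folding the check into the WHERE clause turns them into one atomic statement, which acts like an optimistic lock. If the statement affects zero rows, the product is sold out:
update product set stock=stock-1 where id=123 and stock>0;
6.2 Redis Stock Decrement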
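Redis incrby is atomic, but a separate stock query followed by incrby is not: under concurrency, two requests can both read a stock of 1 and both decrement it. Decrementing first and checking the returned value closes that gap. Pseudocode (redisClient is a hypothetical wrapper):
// Returns -1 if this user already bought, 0 if sold out, 1 on success.
boolean exist = redisClient.query(productId, userId);
if (exist) { return -1; }
// incrby returns the value after the decrement, so the check uses the result of the atomic operation.
if (redisClient.incrby(productId, -1) < 0) { return 0; }
redisClient.add(productId, userId);
return 1;
With this version the stock counter can drop below zero once the product is sold out, and the purchase check is still separate from the decrement. Moving the whole sequence into a Lua script removes both problems.
6.3 Lua Script for Atomic Decrement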
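Redis executes a Lua script as a single atomic operation, so the existence check, the stock check, and the decrement cannot interleave with other commands:
// Returns -1 if the product key does not exist and 0 if the product is sold out.
// Returns 1 if the stored stock is -1 (a special value, presumably meaning unlimited stock).
// Otherwise decrements and returns the stock as it was before the decrement.
StringBuilder lua = new StringBuilder();
lua.append("if (redis.call('exists', KEYS[1]) == 1) then");
lua.append(" local stock = tonumber(redis.call('get', KEYS[1]));");
lua.append(" if (stock == -1) then return 1; end;");
lua.append(" if (stock > 0) then redis.call('incrby', KEYS[1], -1); return stock; end;");
lua.append(" return 0; end; return -1;");
7. Distributed Lock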
When many requests miss the cache, they all hit the DB. A Redis distributed lock prevents this.
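The pattern is: take the lock, check the cache again, and only then load from the DB. Below is a minimal in-process sketch of it. A per-key `ReentrantLock` stands in for the Redis lock (so it only protects a single JVM), and `LockedCacheLoader` and `dbLoader` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Lets only one caller per product rebuild a missing cache entry.
final class LockedCacheLoader {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();        // stand-in for Redis
    private final Map<Long, ReentrantLock> locks = new ConcurrentHashMap<>(); // stand-in for Redis locks
    private final Function<Long, String> dbLoader;                            // hypothetical DB query
    private final AtomicInteger dbReads = new AtomicInteger();

    LockedCacheLoader(Function<Long, String> dbLoader) {
        this.dbLoader = dbLoader;
    }

    String get(long productId) {
        String value = cache.get(productId);
        if (value != null) {
            return value; // fast path: cache hit, no lock needed
        }
        ReentrantLock lock = locks.computeIfAbsent(productId, id -> new ReentrantLock());
        lock.lock();
        try {
            // Double-check: another caller may have filled the cache
            // while this one was waiting for the lock.
            value = cache.get(productId);
            if (value == null) {
                dbReads.incrementAndGet();
                value = dbLoader.apply(productId);
                if (value != null) {
                    cache.put(productId, value);
                }
            }
            return value;
        } finally {
            lock.unlock();
        }
    }

    int dbReads() {
        return dbReads.get();
    }
}
```

However many callers ask for the same uncached product at once, the loader runs once and the rest read the freshly cached value. With a real Redis lock the structure stays the same, but acquisition can fail or time out, which is what the following subsections address.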
7.1 setNx Lock
if (jedis.setnx(lockKey, val) == 1) {
    jedis.expire(lockKey, timeout);
}
Because setting the expiration is a separate command, the two steps are not atomic: if the process fails between them, the lock never expires.
7.2 set with NX PX
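String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
if ("OK".equals(result)) { return true; }
return false;
This single command sets the key only if it is absent (NX) and attaches an expiry in milliseconds (PX), so acquiring the lock and setting its timeout happen atomically. The string-flag overload shown here is the Jedis 2.x signature; later Jedis versions express the same options through SetParams.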
7.3 Unlock
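// requestId.equals(...) avoids a NullPointerException when the key has already expired.
if (requestId.equals(jedis.get(lockKey))) {
    jedis.del(lockKey);
    return true;
}
return false;
Only the client whose requestId matches may delete the lock. The get and del above are still two commands, so the lock could expire and be acquired by another client in between. Using a Lua script makes the check-and-delete atomic: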
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
7.4 Spin Lock
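Repeatedly try set with NX/PX until a timeout elapses, sleeping briefly between attempts. The lock must be released only after the protected work has run, and only if it was actually acquired:
long start = System.currentTimeMillis();
while (true) {
    String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
    if ("OK".equals(result)) {
        try {
            // The protected work goes here, while the lock is held.
            return true;
        } finally {
            unlock(lockKey, requestId);
        }
    }
    if (System.currentTimeMillis() - start >= timeout) { return false; }
    try {
        Thread.sleep(50);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop spinning
        return false;
    }
}
7.5 Redisson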
Redisson abstracts these details and solves lock re‑entrancy, renewal, and multi‑node issues.
8. MQ Asynchronous Processing
The flash‑sale flow has three core steps: request, order creation, and payment. Only the request step needs ultra‑high concurrency; order creation and payment can be handled asynchronously via MQ.
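The split between a fast request step and a slower order step can be sketched with a bounded in-memory queue. This is purely illustrative: `ArrayBlockingQueue` stands in for an MQ topic, and `AsyncOrderPipeline`, `submit`, and `consumeAll` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// The request step only enqueues a message; orders are created later by a consumer.
final class AsyncOrderPipeline {
    private final BlockingQueue<String> queue; // stand-in for an MQ topic
    private final List<String> createdOrders = new ArrayList<>();

    AsyncOrderPipeline(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Request step: enqueue and return immediately.
    // false means the queue is full and the request is rejected.
    boolean submit(String userId, long productId) {
        return queue.offer(userId + ":" + productId);
    }

    // Consumer step: drain pending messages and create orders.
    // A real consumer would run in its own thread pool or service.
    int consumeAll() {
        int consumed = 0;
        String message;
        while ((message = queue.poll()) != null) {
            createdOrders.add("order-for-" + message);
            consumed++;
        }
        return consumed;
    }

    List<String> createdOrders() {
        return createdOrders;
    }
}
```

The request path does no database work, so its latency stays flat during the spike, and the consumer can process orders at whatever rate the database tolerates.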
8.1 Message Loss
Use a message-sending table: record each message as pending before sending it to MQ, and update its status after successful consumption. A scheduled job periodically resends messages that are still pending.
8.2 Duplicate Consumption
Maintain a message‑processing table; before processing, check if the message was already handled.
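As a hedged sketch of that check, the consumer below records each message id before doing the work. A concurrent set stands in for the message-processing table, and `IdempotentConsumer` and `handle` are hypothetical names. In a database-backed version, a unique key on the message id, written in the same transaction as the order, would play the same role.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Processes each message id at most once, even if MQ delivers it repeatedly.
final class IdempotentConsumer {
    private final Set<String> processed = ConcurrentHashMap.newKeySet(); // stand-in for the processing table
    private final AtomicInteger ordersCreated = new AtomicInteger();

    // Returns true if the message was handled now, false if it was a duplicate.
    boolean handle(String messageId) {
        // add() is atomic: only the first caller for a given id gets true.
        if (!processed.add(messageId)) {
            return false;
        }
        ordersCreated.incrementAndGet(); // placeholder for the real order creation
        return true;
    }

    int ordersCreated() {
        return ordersCreated.get();
    }
}
```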
8.3 Garbage Messages
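If consumption keeps failing, resending the same message indefinitely produces a flood of useless messages. Track the number of send attempts in the sending table and stop once it reaches a maximum.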
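The sending table from 8.1 and the retry cap described here could be combined roughly as follows. Everything in this sketch is an assumption: a map stands in for the table, `sender` is a hypothetical MQ send that returns true on success, and a successful send is treated as completion, whereas the flow described above waits for the consumer to confirm.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// In-memory stand-in for the message-sending table with a retry limit.
final class MessageOutbox {
    enum Status { PENDING, SENT, DISCARDED }

    private static final class Row {
        final String payload;
        Status status = Status.PENDING;
        int attempts = 0;

        Row(String payload) {
            this.payload = payload;
        }
    }

    private final Map<String, Row> rows = new LinkedHashMap<>();
    private final int maxAttempts;

    MessageOutbox(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Record the message as pending before it is sent to MQ.
    void record(String messageId, String payload) {
        rows.put(messageId, new Row(payload));
    }

    // One pass of the retry job: resend every pending message,
    // and discard those that have used up their attempts.
    void retryPending(Predicate<String> sender) {
        for (Row row : rows.values()) {
            if (row.status != Status.PENDING) {
                continue;
            }
            row.attempts++;
            if (sender.test(row.payload)) {
                row.status = Status.SENT;
            } else if (row.attempts >= maxAttempts) {
                row.status = Status.DISCARDED;
            }
        }
    }

    Status statusOf(String messageId) {
        return rows.get(messageId).status;
    }
}
```

Discarded rows stay in the table, so they can still be inspected or replayed manually.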
8.4 Delayed Consumption
Use a delayed queue (e.g., RocketMQ) to automatically cancel unpaid orders after a timeout.
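The timeout-cancel flow can be tried out in-process with the JDK's `DelayQueue`, used here only as a stand-in for MQ delayed messages. The names `OrderTimeout` and `UnpaidOrderCanceller` and the injectable clock are assumptions made so the sketch runs without real waiting.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// A queue element that becomes available once the payment window has elapsed.
final class OrderTimeout implements Delayed {
    final String orderId;
    private final long dueAtMillis;
    private final LongSupplier clock;

    OrderTimeout(String orderId, long dueAtMillis, LongSupplier clock) {
        this.orderId = orderId;
        this.dueAtMillis = dueAtMillis;
        this.clock = clock;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueAtMillis - clock.getAsLong(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }
}

final class UnpaidOrderCanceller {
    private final DelayQueue<OrderTimeout> queue = new DelayQueue<>();
    private final Set<String> paid = new HashSet<>();
    private final long payWindowMillis;
    private final LongSupplier clock; // current time in ms, injectable for tests

    UnpaidOrderCanceller(long payWindowMillis, LongSupplier clock) {
        this.payWindowMillis = payWindowMillis;
        this.clock = clock;
    }

    // Schedule a timeout check when the order is created.
    void orderCreated(String orderId) {
        queue.add(new OrderTimeout(orderId, clock.getAsLong() + payWindowMillis, clock));
    }

    void orderPaid(String orderId) {
        paid.add(orderId);
    }

    // Take every timeout that is due and return the orders that are still unpaid.
    List<String> cancelExpired() {
        List<String> cancelled = new ArrayList<>();
        OrderTimeout due;
        while ((due = queue.poll()) != null) { // poll() returns only elements whose delay has elapsed
            if (!paid.contains(due.orderId)) {
                cancelled.add(due.orderId);
            }
        }
        return cancelled;
    }
}
```

Cancelling an order here would also be the point at which pre-deducted stock is returned.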
9. Rate Limiting
To prevent bots from overwhelming the flash‑sale interface, apply rate‑limiting strategies:
Limit per user (e.g., 5 requests per minute).
Limit per IP.
Limit per API endpoint.
Introduce captchas (including sliding‑puzzle captchas).
Raise business thresholds (e.g., members‑only, higher user level).
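As an illustration of the per-user limit (e.g., 5 requests per minute), here is a fixed-window counter. It is a single-JVM sketch with hypothetical names (`PerUserRateLimiter`, `allow`). A distributed deployment would typically keep the counters in Redis, and the same structure works for per-IP limits by keying on the address instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Fixed-window rate limiter: at most `limit` requests per user per window.
final class PerUserRateLimiter {
    private static final class Window {
        long start; // when the current window began (ms)
        int count;  // requests admitted in the current window
    }

    private final Map<String, Window> windows = new ConcurrentHashMap<>();
    private final int limit;
    private final long windowMillis;

    PerUserRateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is admitted, false if the user is over the limit.
    boolean allow(String userId, long nowMillis) {
        Window w = windows.computeIfAbsent(userId, id -> new Window());
        synchronized (w) {
            // Start a new window on the first request or once the old one has elapsed.
            if (w.count == 0 || nowMillis - w.start >= windowMillis) {
                w.start = nowMillis;
                w.count = 0;
            }
            if (w.count >= limit) {
                return false;
            }
            w.count++;
            return true;
        }
    }
}
```

A fixed window lets a short burst through at the boundary between two windows. Sliding-window or token-bucket variants smooth that out at the cost of slightly more bookkeeping.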
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
