Designing a High‑Concurrency Flash Sale System: Architecture, Challenges & Solutions
This article dissects the architecture of a flash‑sale (秒杀) system, outlining the typical e‑commerce flow, the unique characteristics of flash sales, the technical challenges of massive concurrent requests, and detailed solutions spanning isolation, static page delivery, CDN caching, dynamic URLs, request throttling, queue design, database sharding, caching strategies, overload protection, anti‑cheat mechanisms, and data safety techniques.
Flash Sale (秒杀) Overview
Typical e‑commerce order flow: query product → create order → deduct inventory → update order → payment → shipment.
Flash‑sale characteristics: low price, heavy promotion, instant sell‑out, scheduled launch, short duration, and extremely high concurrent requests (e.g., 10,000 users for a single product).
Technical Challenges
Sudden load on existing services, application servers and databases.
Network and bandwidth spikes (e.g., a 200 KB product page × 10,000 concurrent users ≈ 2 GB of traffic in a single burst).
Risk of exposing the order URL before the sale starts.
Need to ensure that only the earliest successful requests, up to the available inventory, reach the order subsystem.
Potential overload of the order‑submission endpoint.
Solution Architecture
Isolation
Deploy the flash‑sale service in a separate domain or cluster to avoid impacting the main site.
Static Front‑end
Render the flash‑sale page as a static HTML file (≈200 KB) and serve it from a CDN or reverse‑proxy cache. The purchase button is initially disabled.
At sale start, a tiny JavaScript file (a few hundred bytes) is updated to set the flag saleStarted = true and to carry a server‑generated token, so the page can enable the purchase button. The client polls a lightweight JSON endpoint for the server time to keep the countdown accurate.
Dynamic Order URL
The order URL contains a random token generated by the server at the exact start moment. Requests without a valid token are rejected, preventing premature purchases.
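A minimal sketch of the token check, assuming the token is 16 random bytes rendered as hex. The class OrderTokenGate and its method names are hypothetical and only illustrate the idea; the original system's token format is not specified.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;

/** Hypothetical gate: issues one random token at sale start and validates order requests against it. */
final class OrderTokenGate {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Stays null until the sale starts, so every earlier request is rejected.
    private volatile String token;

    /** Called once at the start moment; returns the token to publish to clients. */
    String openSale() {
        byte[] raw = new byte[16];
        RANDOM.nextBytes(raw);
        StringBuilder hex = new StringBuilder(32);
        for (byte b : raw) {
            hex.append(String.format("%02x", b));
        }
        token = hex.toString();
        return token;
    }

    /** Returns false before the sale starts or when the presented token does not match. */
    boolean isValid(String presented) {
        String expected = token;
        if (expected == null || presented == null) {
            return false;
        }
        // Constant-time comparison avoids leaking how many leading characters matched.
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                presented.getBytes(StandardCharsets.UTF_8));
    }
}
```

Because the token does not exist until openSale() runs, a leaked or guessed order URL is useless before the start moment.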
Request Throttling
Each front‑end server accepts only a limited number of order requests (e.g., 10 per instance).
Use least‑connection load balancing or cookie‑based routing to spread traffic evenly.
Reject requests once the global sold count reaches the inventory.
Pre‑check Logic
Before enqueuing a request, the server checks:
If the local instance has already processed the maximum allowed requests.
If the total number of successful orders has reached the inventory limit.
If either condition is true, the user receives a “sale ended” page.
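The two checks can be sketched as follows. This is an illustration under assumptions: PreCheck is a hypothetical class, and the shared AtomicInteger stands in for whatever global counter the deployment uses (for example a value held in a cache).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical pre-check: admits a request only if the local quota and the global stock both allow it. */
final class PreCheck {
    private final int localLimit;                 // max requests this instance forwards
    private final int inventory;                  // total units on sale
    private final AtomicInteger localAccepted = new AtomicInteger();
    private final AtomicInteger globalSold;       // assumption: stand-in for a shared counter

    PreCheck(int localLimit, int inventory, AtomicInteger globalSold) {
        this.localLimit = localLimit;
        this.inventory = inventory;
        this.globalSold = globalSold;
    }

    /** Returns true if the request may be enqueued, false if the user should see "sale ended". */
    boolean admit() {
        if (globalSold.get() >= inventory) {
            return false;
        }
        // Claim a local slot with compare-and-set so concurrent threads never exceed the limit.
        int current;
        do {
            current = localAccepted.get();
            if (current >= localLimit) {
                return false;
            }
        } while (!localAccepted.compareAndSet(current, current + 1));
        return true;
    }
}
```

Both checks are cheap and local, so rejected requests never reach the queue or the database.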
Layered Design
Frontend Layer
Static page with countdown timer; static resources (HTML, CSS, JS, images) are cached on CDN. Time synchronization is performed via a fast JSON endpoint that returns the current server timestamp.
Site Layer
Intercepts requests at the edge: disables the purchase button, limits each UID or IP to one request per configurable interval, and caches identical requests for a short window.
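A minimal sketch of the per‑UID interval limit, assuming an in‑memory map on a single site‑layer node. UidRateLimiter is a hypothetical name; a real deployment would also need to expire old entries and might keep the state in a shared store.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-user limiter: at most one accepted request per UID within each interval. */
final class UidRateLimiter {
    private final long intervalMillis;
    private final ConcurrentHashMap<String, Long> lastAccepted = new ConcurrentHashMap<>();

    UidRateLimiter(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /** The current time is passed in so the logic can be exercised without sleeping. */
    boolean tryAcquire(String uid, long nowMillis) {
        boolean[] accepted = {false};
        // compute() runs atomically per key, so two concurrent requests from one UID cannot both pass.
        lastAccepted.compute(uid, (key, last) -> {
            if (last == null || nowMillis - last >= intervalMillis) {
                accepted[0] = true;
                return nowMillis;
            }
            return last;
        });
        return accepted[0];
    }
}
```

The same structure works for IP addresses by using the address as the key.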
Service Layer
Uses Nginx/Apache for request distribution. A pre‑processing module checks inventory via RPC and pushes valid requests into a ConcurrentLinkedQueue. A fixed‑size worker thread pool consumes the queue and writes successful bids to a persistent store.
Database Layer
Successful bids are placed into an ArrayBlockingQueue before being persisted. The database module exposes a single RPC interface for inventory checks and order insertion, ensuring a clear separation of concerns.
Concurrency Control & Anti‑Cheat
Limit one order per account using Redis WATCH for optimistic locking.
Detect high‑frequency IPs and present CAPTCHAs or block them.
Apply participation thresholds (e.g., account level, activity) to filter out “zombie” accounts.
Use FIFO queues or pessimistic locks to prevent overselling.
Code Samples
package seckill;

import org.apache.http.HttpRequest;

/**
 * Pre‑processing stage: reject unnecessary requests, queue valid ones.
 */
public class PreProcessor {
    // True while stock remains; volatile so every request thread sees the switch to false.
    private static volatile boolean reminds = true;

    private static void forbidden() {
        // Return sale‑ended response.
    }

    public static boolean checkReminds() {
        // Remote RPC to DB to verify remaining stock (RPC is a stub class not shown here).
        // Once stock is known to be gone, skip the remote call entirely.
        if (reminds && !RPC.checkReminds()) {
            reminds = false;
        }
        return reminds;
    }

    public static void preProcess(HttpRequest request) {
        if (checkReminds()) {
            RequestQueue.queue.add(request);
        } else {
            forbidden();
        }
    }
}

package seckill;
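import java.util.concurrent.ConcurrentLinkedQueue;

import org.apache.http.HttpRequest;

public class RequestQueue {
    // Unbounded, lock‑free queue shared by the pre‑processor and the workers.
    public static final ConcurrentLinkedQueue<HttpRequest> queue = new ConcurrentLinkedQueue<>();
}

package seckill;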
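import org.apache.http.HttpRequest;

public class Processor {
    /** Send a flash‑sale transaction to the DB queue. */
    public static void kill(BidInfo info) {
        // offer() returns false instead of throwing when the bounded queue is already full.
        DB.bids.offer(info);
    }

    public static void process() {
        // poll() returns null on an empty queue, so test the request rather than the wrapper.
        HttpRequest request = RequestQueue.queue.poll();
        if (request != null) {
            // BidInfo (not shown here) wraps the request data needed to record a bid.
            kill(new BidInfo(request));
        }
    }
}

package seckill;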
import java.util.concurrent.ArrayBlockingQueue;

public class DB {
    public static int count = 10; // total inventory
    public static ArrayBlockingQueue<BidInfo> bids = new ArrayBlockingQueue<>(10);

    public static synchronized boolean checkReminds() {
        // Real implementation should query the inventory table.
        return count > 0;
    }

    public static synchronized void bid() {
        // Check stock before polling so count never goes negative
        // and no bid is taken off the queue without being recorded.
        BidInfo info;
        while (count > 0 && (info = bids.poll()) != null) {
            // INSERT INTO bids (item_id, user_id, ...) VALUES (...);
            count--;
        }
    }
}
Database Design & Scaling
Modern high‑traffic systems combine:
A single database instance for low‑latency reads.
Horizontal sharding (range or hash) to split data across multiple databases.
Replication groups (master‑slave) for high availability.
Routing can be performed by a dedicated router service that maps a key (e.g., user ID) to the appropriate shard.
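A minimal sketch of hash‑based routing, assuming a fixed list of shard names. ShardRouter and the shard names in the test are hypothetical; range‑based routing or a lookup table would replace the modulo step.

```java
/** Hypothetical router: maps a user ID to one of N shards by hashing the ID. */
final class ShardRouter {
    private final String[] shards;

    ShardRouter(String... shards) {
        if (shards.length == 0) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shards = shards.clone();
    }

    /** The same user ID always maps to the same shard as long as the shard list is unchanged. */
    String route(long userId) {
        // floorMod keeps the index non-negative even when the hash is negative.
        int index = Math.floorMod(Long.hashCode(userId), shards.length);
        return shards[index];
    }
}
```

Plain modulo hashing remaps most keys when the shard count changes, so resharding needs a migration plan or a different mapping scheme.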
High‑Concurrency Considerations
Estimate QPS: with 20 web servers each handling 500 concurrent connections and a 100 ms response time, theoretical peak ≈ 100 000 QPS. In practice response time grows under load, reducing effective QPS.
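The QPS estimate is simple arithmetic: servers × connections per server ÷ response time in seconds. The hypothetical helper below reproduces the 100 000 figure and shows how a slower response time lowers it; the 250 ms value is an illustrative assumption, not a measurement.

```java
/** Hypothetical helper for the back-of-the-envelope QPS estimate. */
final class QpsEstimate {
    /** Theoretical peak QPS = servers * connections per server / response time in seconds. */
    static double peakQps(int servers, int connectionsPerServer, double responseTimeMillis) {
        // Multiply before dividing so the common cases stay exact in floating point.
        return servers * (double) connectionsPerServer * 1000.0 / responseTimeMillis;
    }
}
```

With 20 servers and 500 connections each, peakQps(20, 500, 100) gives 100 000, while peakQps(20, 500, 250) gives 40 000: a response time 2.5 times longer cuts throughput to 40% of the theoretical peak.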
Use in‑memory stores (Redis, Memcached) for fast inventory checks and token storage.
Apply overload protection at the entry point (e.g., return 503 when CPU or connection count exceeds thresholds).
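A minimal sketch of entry‑point overload protection based on the number of in‑flight requests. OverloadGuard is a hypothetical class, and the threshold is an assumption to be tuned per deployment; a CPU‑based check would follow the same shape.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical entry-point guard: sheds load with 503 once too many requests are in flight. */
final class OverloadGuard {
    static final int SC_OK = 200;
    static final int SC_SERVICE_UNAVAILABLE = 503;

    private final int maxInFlight;
    private final AtomicInteger inFlight = new AtomicInteger();

    OverloadGuard(int maxInFlight) {
        this.maxInFlight = maxInFlight;
    }

    /** Runs the handler if capacity allows; otherwise returns 503 without touching the backend. */
    int handle(Runnable handler) {
        if (inFlight.incrementAndGet() > maxInFlight) {
            inFlight.decrementAndGet();
            return SC_SERVICE_UNAVAILABLE;
        }
        try {
            handler.run();
            return SC_OK;
        } finally {
            // Always release the slot, even if the handler throws.
            inFlight.decrementAndGet();
        }
    }
}
```

Rejecting early keeps the servers responsive for the requests they do accept, instead of letting every request slow down together.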
Data Safety & Overselling Prevention
Three common strategies:
Pessimistic lock: lock the inventory row before decrementing (high contention, may degrade QPS).
FIFO queue: serialize order processing, but the queue must be sized to avoid memory exhaustion.
Optimistic lock (preferred): use a version number or Redis WATCH to ensure the inventory has not changed between read and write. If the check fails, the request is rejected.
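The version‑number variant of optimistic locking can be sketched with an in‑memory stand‑in for the inventory row. VersionedInventory and OptimisticBuyer are hypothetical classes; in a real system the conditional update would be a single SQL statement or a Redis WATCH/MULTI/EXEC transaction.

```java
/** Hypothetical in-memory stand-in for an inventory row with a version column. */
final class VersionedInventory {
    private int stock;
    private int version;

    VersionedInventory(int stock) {
        this.stock = stock;
    }

    /** Snapshot of {stock, version}, as a SELECT would return. */
    synchronized int[] read() {
        return new int[] {stock, version};
    }

    /**
     * Mirrors: UPDATE inventory SET stock = stock - 1, version = version + 1
     *          WHERE version = ? AND stock > 0
     */
    synchronized boolean decrementIfVersion(int expectedVersion) {
        if (version != expectedVersion || stock <= 0) {
            return false;
        }
        stock--;
        version++;
        return true;
    }
}

final class OptimisticBuyer {
    /** One attempt, no retry: a failed version check means the request is rejected. */
    static boolean tryBuy(VersionedInventory inventory) {
        int[] snapshot = inventory.read();
        if (snapshot[0] <= 0) {
            return false;
        }
        return inventory.decrementIfVersion(snapshot[1]);
    }
}
```

No lock is held between the read and the write, so contention costs only a rejected request rather than a blocked thread, and stock can never drop below zero.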
Conclusion
Flash‑sale systems require isolation from the main site, aggressive caching of static assets, a lightweight request pipeline, and strict concurrency control. By combining CDN‑served static pages, token‑based dynamic URLs, request throttling, Redis‑backed optimistic locking, and a small worker pool that persists successful bids, a robust and scalable flash‑sale service can handle tens of thousands of concurrent users while preventing overselling and mitigating cheating attempts.
ITPUB
