How to Build a Robust High‑Concurrency Flash Sale System
This article examines the challenges of implementing a flash‑sale (秒杀) system—such as overselling, massive concurrency, request flooding, URL exposure, and database strain—and presents a comprehensive backend design that includes dedicated databases, dynamic URLs, static page rendering, Redis clustering, Nginx load balancing, optimized SQL, token‑bucket rate limiting, asynchronous order processing, and service degradation strategies.
Flash‑sale (秒杀) systems, like those used by major e‑commerce platforms, must handle extremely short sale windows and massive request bursts. The article first outlines the key problems to address:
Overselling: Without proper concurrency control, more units can be sold than the limited inventory (e.g., 100 items) allows.
High concurrency: Thousands of users attempt to purchase within minutes, risking cache breakdown and database overload.
Interface abuse: Automated scripts can hit the backend repeatedly, so anti-scraping measures are required.
URL exposure: A fixed purchase URL lets users call the endpoint directly and bypass front-end controls.
Database coupling: Mixing flash-sale traffic with normal business traffic can cause cascading failures.
Massive request volume: Even with caching, a single flash sale may generate hundreds of thousands of requests per second, overwhelming a single Redis instance.
Design and Technical Solutions
Separate flash-sale database: Create isolated order and product tables so that the high-load activity cannot affect other services. Additional tables for product details and user information are recommended.
Dynamic URL generation: Use an MD5 hash of a random string to create unpredictable URLs. The front end first requests the generated URL from the backend, and the backend validates it before allowing the purchase.
Static page rendering: Render product details, images, and reviews into a static HTML page (e.g., via FreeMarker) so that user requests do not hit the application server or database.
Redis cluster: Deploy Redis in Sentinel or Cluster mode for high availability and read capacity, which lowers the risk that a cache breakdown pushes the load onto the database. Use Lua scripts to make the stock decrement atomic.
Nginx front end: Place Nginx in front of Tomcat to offload connection handling; Nginx can manage tens of thousands of concurrent connections.
SQL optimization: Combine the stock check and decrement into a single UPDATE statement with optimistic locking (a version field), which avoids a separate query before the update.
Pre-decrement stock in Redis: Initialize stock in Redis (e.g., redis.set(goodsId, 100)) and decrement it atomically on each order, falling back to the database only when necessary.
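The dynamic URL scheme described in this section can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `SeckillPathService`, the in-memory map (standing in for a shared store such as Redis), and the single-use rule are all hypothetical and not taken from the original design.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper that issues and validates per-user flash-sale paths.
class SeckillPathService {
    // Assumption: an in-memory map stands in for a shared store such as Redis.
    private final Map<String, String> issuedPaths = new ConcurrentHashMap<>();

    // Returns the 32-character lowercase hex MD5 digest of the input.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Step 1: the front end asks for a path; the backend hashes a random string.
    String createPath(long userId, long goodsId) {
        String path = md5Hex(UUID.randomUUID().toString());
        issuedPaths.put(userId + ":" + goodsId, path);
        return path;
    }

    // Step 2: the purchase request is accepted only if the path matches the
    // one issued to this user for this product. Assumption: paths are single-use.
    boolean validatePath(long userId, long goodsId, String path) {
        return path != null && issuedPaths.remove(userId + ":" + goodsId, path);
    }
}
```

Because the path is derived from a random value rather than from the product ID, it cannot be guessed ahead of time; MD5 is used here only to produce a fixed-length token, not for cryptographic strength.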
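To make the overselling guard concrete, here is a minimal sketch of the "check and decrement in one atomic step" idea. It uses an `AtomicInteger` as an in-memory stand-in so the example runs without Redis or a database. The SQL and Lua strings show what the equivalent statements might look like; the table name `seckill_goods`, its columns, and the class name `StockGuard` are assumptions for illustration only.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class StockGuard {
    // Hypothetical table and columns: check and decrement in a single UPDATE.
    // Zero affected rows means the version changed or the stock ran out.
    static final String DECREMENT_SQL =
        "UPDATE seckill_goods SET stock = stock - 1, version = version + 1 "
      + "WHERE id = ? AND version = ? AND stock > 0";

    // Possible Lua script for Redis: returns 1 on success, 0 if sold out.
    static final String DECREMENT_LUA =
        "local stock = tonumber(redis.call('GET', KEYS[1])) "
      + "if stock and stock > 0 then redis.call('DECR', KEYS[1]) return 1 end "
      + "return 0";

    // In-memory stand-in for the stock counter (assumption, not real Redis).
    private final AtomicInteger stock;

    StockGuard(int initialStock) {
        this.stock = new AtomicInteger(initialStock);
    }

    // Compare-and-set loop: decrements only while stock is positive.
    boolean tryDecrement() {
        while (true) {
            int current = stock.get();
            if (current <= 0) {
                return false; // sold out
            }
            if (stock.compareAndSet(current, current - 1)) {
                return true;
            }
            // Another thread won the race; re-read and retry.
        }
    }

    int remaining() {
        return stock.get();
    }

    // Fires `buyers` concurrent attempts and returns how many succeeded.
    static int simulate(int initialStock, int buyers) {
        StockGuard guard = new StockGuard(initialStock);
        AtomicInteger sold = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(16);
        for (int i = 0; i < buyers; i++) {
            pool.submit(() -> {
                if (guard.tryDecrement()) {
                    sold.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sold.get();
    }
}
```

Calling `StockGuard.simulate(100, 1000)` should report exactly 100 successful purchases, since the check and the decrement can never be separated by another thread.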
Rate Limiting and Request Filtering
Two layers of rate limiting are described:
Front-end throttling: Disable the purchase button for a few seconds after a click.
Per-user repeat request blocking: Use Redis keys with a short TTL (e.g., 10 s) to reject repeated submissions from the same user.
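The per-user blocking rule could look something like the sketch below. A `ConcurrentHashMap` of expiry timestamps stands in for a Redis key written with a TTL; the class name `RepeatRequestFilter`, the key layout, and the injectable clock are assumptions made so the example is self-contained and testable.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

class RepeatRequestFilter {
    // Assumption: in-memory map instead of Redis. Unlike Redis, this sketch
    // never evicts expired entries for keys that are not requested again.
    private final ConcurrentHashMap<String, Long> expiresAt = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // returns the current time in milliseconds

    RepeatRequestFilter(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns true if the request may proceed, false if the same user already
    // submitted a request for the same product within the TTL window.
    boolean tryAccept(long userId, long goodsId) {
        String key = "seckill:submit:" + userId + ":" + goodsId; // hypothetical key layout
        long now = clock.getAsLong();
        boolean[] accepted = {false};
        // compute() runs atomically per key, so two racing requests cannot both pass.
        expiresAt.compute(key, (k, expiry) -> {
            if (expiry == null || expiry <= now) {
                accepted[0] = true;
                return now + ttlMillis; // start a new blocking window
            }
            return expiry; // still inside the window: keep the old expiry
        });
        return accepted[0];
    }
}
```

In production, `new RepeatRequestFilter(10_000, System::currentTimeMillis)` would give the 10 s window mentioned in the text; with real Redis the same effect is usually achieved by setting the key only if it does not exist and attaching an expiry.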
For backend throttling, the token‑bucket algorithm is implemented with Guava’s RateLimiter. Example code:
import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Issue 1 token per second
        RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and returns the seconds spent waiting
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time " + waitTime);
        }
        System.out.println("Finished");
    }
}

A second example shows tryAcquire with a timeout, allowing the system to discard requests that cannot obtain a token within 0.5 s:
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        // Issue 1 token per second
        RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // tryAcquire takes a long timeout, so 0.5 s is written as 500 ms
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            if (!isValid) continue; // no token within the timeout: drop the request
            System.out.println("Task " + i + " executed");
        }
        System.out.println("End");
    }
}

According to the article's performance analysis, with 4 million concurrent requests, a token generation rate of 20 tokens/s, and a 0.05 s acquisition timeout, only a few requests get through, which effectively protects downstream services.
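To show why the token rate, not the incoming volume, bounds what reaches downstream services, here is a minimal token-bucket sketch using only the standard library. It is not Guava's implementation (Guava's RateLimiter smooths bursts and can wait for a token); `SimpleTokenBucket` and its injectable clock are hypothetical names introduced for this illustration.

```java
import java.util.function.LongSupplier;

class SimpleTokenBucket {
    private final double tokensPerSecond;
    private final double capacity;
    private final LongSupplier nanoClock; // returns the current time in nanoseconds
    private double tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(double tokensPerSecond, double capacity, LongSupplier nanoClock) {
        this.tokensPerSecond = tokensPerSecond;
        this.capacity = capacity;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // assumption: the bucket starts full
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // Non-blocking: takes one token if available, otherwise rejects immediately.
    synchronized boolean tryAcquire() {
        refill();
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    // Adds tokens for the time elapsed since the last call, capped at capacity.
    private void refill() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * tokensPerSecond);
        lastRefillNanos = now;
    }

    // Sends `requests` back-to-back attempts and returns how many were admitted.
    int admit(int requests) {
        int admitted = 0;
        for (int i = 0; i < requests; i++) {
            if (tryAcquire()) {
                admitted++;
            }
        }
        return admitted;
    }
}
```

With a rate of 20 tokens/s, a burst of any size arriving within one second can consume at most the tokens already in the bucket plus roughly 20 new ones; every other request is rejected before it touches Redis or the database. A real deployment would pass `System::nanoTime` as the clock.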
Asynchronous Order Processing and Service Degradation
After rate limiting and stock validation, orders are placed onto a message queue (e.g., RabbitMQ) for asynchronous processing, which smooths spikes and decouples the order service. Successful orders can trigger SMS notifications; failures can be retried via compensation logic.
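The queue-based flow can be sketched as follows. A bounded in-memory `BlockingQueue` stands in for the message broker (the article suggests RabbitMQ), and a list stands in for the order table; `AsyncOrderPipeline`, `OrderMessage`, and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncOrderPipeline {
    // Hypothetical message carrying the minimum needed to create an order.
    static final class OrderMessage {
        final long userId;
        final long goodsId;

        OrderMessage(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    // Assumption: bounded in-memory queue instead of a real broker.
    private final BlockingQueue<OrderMessage> queue;
    // Assumption: a list stands in for the order table.
    private final List<OrderMessage> persisted = new ArrayList<>();

    AsyncOrderPipeline(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // Request thread: enqueue and return at once. False means the queue is
    // full, so the caller can tell the user to retry instead of blocking.
    boolean submit(long userId, long goodsId) {
        return queue.offer(new OrderMessage(userId, goodsId));
    }

    // Drains whatever is queued right now; returns the number of orders written.
    int processPending() {
        int processed = 0;
        OrderMessage message;
        while ((message = queue.poll()) != null) {
            persist(message);
            processed++;
        }
        return processed;
    }

    // Starts a daemon worker that blocks on the queue and persists each order.
    Thread startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    persist(queue.take()); // blocks until a message arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit on interrupt
            }
        }, "order-worker");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    private synchronized void persist(OrderMessage message) {
        persisted.add(message); // stand-in for the database insert
    }

    synchronized int persistedCount() {
        return persisted.size();
    }
}
```

The request path only pays for an enqueue, so a spike fills the queue rather than the database connection pool; the worker then writes orders at whatever pace the database can sustain.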
If a server crashes during the flash‑sale, a fallback service (e.g., using Hystrix) should provide a graceful error message instead of a hard failure.
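As a rough illustration of the degradation idea, the sketch below wraps a call with a fallback and stops calling the failing service for a cool-down period after repeated errors. It is a simplified circuit breaker written with the standard library only, not Hystrix's API; `SimpleCircuitBreaker`, its thresholds, and the injectable clock are assumptions.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // returns the current time in milliseconds
    private int consecutiveFailures;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // Runs `primary`; on failure, or while the circuit is open, returns `fallback`.
    <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // short-circuit: do not touch the failing service
        }
        try {
            T result = primary.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            return fallback.get(); // graceful message instead of a hard failure
        }
    }

    private synchronized boolean isOpen() {
        return openedAt >= 0 && clock.getAsLong() - openedAt < openMillis;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = -1;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = clock.getAsLong(); // open (or re-open) the circuit
        }
    }
}
```

A caller might write `breaker.call(() -> orderService.create(req), () -> "The sale is busy, please try again")`, where `orderService` is a placeholder for the real dependency. Once the cool-down has elapsed, the next call is allowed through to probe whether the service has recovered.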
Summary
The proposed architecture (isolated database, dynamic URLs, static page rendering, Redis cluster, Nginx front end, optimized SQL, token-bucket rate limiting, asynchronous queuing, and graceful degradation) is designed to sustain hundreds of thousands of requests per second. For larger scales (tens of millions), further measures such as database sharding, Kafka queues, and larger Redis clusters would be required.
IT Architects Alliance
