How to Build a Robust Flash‑Sale System that Handles Millions of Requests
This article examines the core challenges of flash‑sale (秒杀) systems—such as overselling, extreme concurrency, bot traffic, and database strain—and presents a comprehensive backend design that includes dedicated databases, dynamic URLs, static pages, Redis clustering, Nginx load balancing, SQL optimization, rate‑limiting, asynchronous order queues, and service degradation strategies.
Key Challenges in Flash‑Sale Systems
Overselling : With limited inventory (e.g., 100 items), uncontrolled concurrent stock updates can accept more orders than there are items, even double the stock, causing financial loss.
High Concurrency : Flash sales last only minutes but attract massive simultaneous requests, risking cache breakdown and database overload.
Bot / Script Attacks : Automated tools can fire hundreds of requests per second, requiring request validation.
URL Exposure : Skilled users can discover the backend API URL via browser dev tools and bypass front‑end controls.
Database Coupling : Running flash‑sale traffic on the same DB as normal services can cause cascading failures.
Massive Request Volume : Even a powerful cache like a single Redis instance (~40k QPS) may be insufficient for tens or hundreds of thousands of QPS.
Design and Technical Solutions
Dedicated Flash‑Sale Database
A separate database isolates flash‑sale traffic from the main business DB. Two essential tables are required: miaosha_order (order records) and miaosha_goods (product inventory). Additional tables for product details and user information can be added as needed.
Dynamic URL Generation
To prevent pre‑knowledge of the flash‑sale endpoint, the URL is generated dynamically using an MD5 hash of a random string. The front‑end first requests the generated URL; the backend validates the request before allowing the sale.
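A minimal sketch of this idea follows, assuming the token is issued per user and per product and remembered server-side. The class name DynamicSalePath, its method names, and the in-memory map (a stand-in for a shared cache such as Redis) are illustrative assumptions, not part of the original design.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of per-user dynamic flash-sale paths (hypothetical names throughout). */
final class DynamicSalePath {
    // Assumption: an in-memory map stands in for a shared cache such as Redis.
    // Key = userId + "_" + goodsId, value = the path token issued to that user.
    private final Map<String, String> issued = new ConcurrentHashMap<>();

    /** Issues a fresh token: the MD5 hex digest of a random string. */
    String createPath(long userId, long goodsId) {
        String token = md5Hex(UUID.randomUUID().toString());
        issued.put(userId + "_" + goodsId, token);
        return token;
    }

    /** Accepts a request only if the path matches the token issued for this user and item. */
    boolean checkPath(long userId, long goodsId, String path) {
        return path != null && path.equals(issued.get(userId + "_" + goodsId));
    }

    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            // %032x keeps leading zeros so the token is always 32 hex characters.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }
}
```

The front-end would call something like createPath shortly before the sale opens and then post the order to a URL containing the returned token; the order endpoint calls checkPath before doing any stock work. MD5 serves here only to produce an unguessable-looking path segment, not as a security primitive.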
Page Staticization
Static HTML pages render product descriptions, parameters, transaction history, images, and reviews, eliminating database calls for read‑only content. Technologies such as FreeMarker can generate these pages from templates.
Redis Cluster (Sentinel Mode)
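Flash sales are read‑heavy and benefit from Redis caching. To avoid cache breakdown and a single point of failure, run Redis as a multi-node deployment monitored by Sentinel: Sentinel promotes a replica automatically when the master fails, which raises availability, and spreading reads across nodes raises throughput.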
Using Nginx as Front‑End Proxy
Nginx handles tens of thousands of concurrent connections, far exceeding Tomcat's capacity. It forwards requests to a Tomcat cluster, dramatically improving concurrency.
SQL Optimization with Optimistic Lock
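Instead of reading the stock and then writing back a computed value, decrement it in a single conditional UPDATE statement:

update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;

This uses a version field for optimistic locking: the update succeeds only if the row still carries the version the caller read earlier, and each successful update increments the version. An affected-row count of 0 means either another request changed the row first or the stock is exhausted. Because no row lock is held between the read and the write, this performs better than pessimistic locks.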
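The retry logic around such a versioned update can be sketched as follows. This is an in-memory illustration only: GoodsRow is a hypothetical stand-in for one miaosha_goods row, updateStock mimics the conditional UPDATE (assuming it also increments version), and its return value plays the role of the affected-row count a database driver would report.

```java
/** In-memory stand-in for one miaosha_goods row (hypothetical, for illustration). */
final class GoodsRow {
    private int stock;
    private int version;

    GoodsRow(int stock) { this.stock = stock; }

    /** Mirrors "SELECT stock, version": returns {stock, version} as one consistent snapshot. */
    synchronized int[] read() { return new int[] {stock, version}; }

    /** Mirrors the conditional UPDATE; returns the affected-row count (0 or 1). */
    synchronized int updateStock(int expectedVersion) {
        if (version != expectedVersion || stock <= 0) return 0;
        stock--;
        version++;
        return 1;
    }
}

final class OptimisticStock {
    /** Tries up to maxRetries times; returns true if one unit was reserved. */
    static boolean reserve(GoodsRow row, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            int[] snapshot = row.read();
            if (snapshot[0] <= 0) return false;                 // sold out: stop retrying
            if (row.updateStock(snapshot[1]) == 1) return true; // our version was still current
            // 0 rows affected: another buyer changed the row first, so re-read and retry
        }
        return false;
    }
}
```

A small retry bound is a design choice: under flash-sale contention most conflicts mean someone else bought the item, so failing fast after a few attempts is usually preferable to spinning.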
Redis Pre‑Decrement Stock
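Before the sale starts, set the initial stock in Redis (e.g., redis.set(goodsId, 100)). Each order attempts to decrement the Redis key atomically (often via a Lua script that checks the value is still positive before decrementing), so requests that find no stock are rejected without touching the database. If an order is cancelled, the stock is incremented again, ensuring the number of units sold never exceeds the original amount.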
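The check-then-decrement semantics can be sketched in plain Java. This is not Redis code: StockCache is a hypothetical in-memory stand-in in which a compare-and-set loop plays the role the atomic Lua script plays on the Redis server.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for the Redis stock keys (hypothetical, for illustration). */
final class StockCache {
    private final Map<Long, AtomicInteger> stock = new ConcurrentHashMap<>();

    /** Loads the initial stock before the sale starts, like redis.set(goodsId, 100). */
    void preload(long goodsId, int initial) {
        stock.put(goodsId, new AtomicInteger(initial));
    }

    /** Atomically takes one unit if any is left; never drives the counter below zero. */
    boolean tryReserve(long goodsId) {
        AtomicInteger remaining = stock.get(goodsId);
        if (remaining == null) return false;    // unknown product
        while (true) {
            int current = remaining.get();
            if (current <= 0) return false;     // sold out: reject before any database work
            if (remaining.compareAndSet(current, current - 1)) return true;
        }
    }

    /** Returns one unit, e.g. after an order is cancelled. */
    void release(long goodsId) {
        AtomicInteger remaining = stock.get(goodsId);
        if (remaining != null) remaining.incrementAndGet();
    }

    int remaining(long goodsId) {
        AtomicInteger remaining = stock.get(goodsId);
        return remaining == null ? 0 : remaining.get();
    }
}
```

With 100 units preloaded, any number of concurrent tryReserve calls yields exactly 100 successes. Checking before decrementing, rather than decrementing and then repairing a negative value, keeps the counter meaningful for monitoring.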
Rate Limiting Strategies
Front‑End Limiting : Disable the purchase button for a few seconds after a click.
Duplicate Request Blocking : Reject requests from the same user within a configurable interval (e.g., 10 seconds) using Redis key expiration.
Token‑Bucket Algorithm : Implement a token bucket with Guava's RateLimiter. Example:

import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Issue 1 token (permit) per second
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and returns the time spent waiting, in seconds
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " waited " + waitTime + " seconds");
        }
        System.out.println("Done");
    }
}

Another variant with timeout:
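
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // Wait at most 0.5 s for a token; tryAcquire takes a long timeout, so it is written as 500 ms
            boolean ok = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            if (!ok) continue; // no token within the timeout: drop this request
            System.out.println("Task " + i + " executed");
        }
        System.out.println("Finished");
    }
}

With a token generation rate of 20 tokens/second and a request burst of 4 million, only a few thousand requests pass and the rest are rejected before reaching the order logic, demonstrating the algorithm's effectiveness.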
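The duplicate-request rule from the list above can be sketched in a similar way. In Redis this is commonly a single SET with the NX and EX options on a per-user key. The class below is a hypothetical in-memory stand-in for that behaviour, and the key format and the explicit clock parameter are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for "set key if absent, with expiry" (hypothetical, for illustration). */
final class DuplicateRequestGuard {
    private final Map<String, Long> expiresAt = new ConcurrentHashMap<>();
    private final long windowMillis;

    DuplicateRequestGuard(long windowMillis) { this.windowMillis = windowMillis; }

    /** Returns true if the request may proceed; nowMillis is passed in so the logic is easy to test. */
    boolean allow(long userId, long goodsId, long nowMillis) {
        String key = "miaosha:req:" + userId + ":" + goodsId; // assumed key format
        boolean[] allowed = {false};
        // compute() runs atomically per key, so two racing requests cannot both pass.
        expiresAt.compute(key, (k, expiry) -> {
            if (expiry == null || expiry <= nowMillis) {
                allowed[0] = true;
                return nowMillis + windowMillis;  // start a new blocking window
            }
            return expiry;                        // still inside the window: keep it, reject
        });
        return allowed[0];
    }
}
```

A caller would pass System.currentTimeMillis() as nowMillis. Unlike Redis key expiration, this sketch never evicts old entries, so it illustrates the logic rather than serving as a production cache.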
Asynchronous Order Processing
To improve order throughput and avoid failures, place validated orders into a message queue (e.g., RabbitMQ). Consumers process the queue asynchronously, send SMS notifications on success, and apply compensation/retry mechanisms on failure.
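The producer/consumer split can be sketched with a bounded in-memory queue. This is a stand-in for a broker such as RabbitMQ, not RabbitMQ client code: OrderQueueSketch, OrderRequest, and the retry-then-compensate policy are assumptions chosen to illustrate the flow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Predicate;

/** Bounded in-memory queue standing in for a message broker (hypothetical, for illustration). */
final class OrderQueueSketch {
    static final class OrderRequest {
        final long userId;
        final long goodsId;
        OrderRequest(long userId, long goodsId) { this.userId = userId; this.goodsId = goodsId; }
    }

    private final BlockingQueue<OrderRequest> queue;
    private final int maxAttempts;
    // Written only by drain(), which is meant to run on a single consumer thread.
    final List<OrderRequest> completed = new ArrayList<>();
    final List<OrderRequest> needsCompensation = new ArrayList<>();

    OrderQueueSketch(int capacity, int maxAttempts) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.maxAttempts = maxAttempts;
    }

    /** Producer side: enqueue a validated order without blocking; false means the queue is full. */
    boolean submit(OrderRequest request) {
        return queue.offer(request);
    }

    /** Consumer side: process everything queued, retrying each order up to maxAttempts times. */
    void drain(Predicate<OrderRequest> createOrder) {
        OrderRequest request;
        while ((request = queue.poll()) != null) {
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                ok = createOrder.test(request);   // e.g. write the order row, then notify the user
            }
            (ok ? completed : needsCompensation).add(request);
        }
    }
}
```

The request thread returns to the user as soon as submit succeeds, typically with a "queued" status, while the consumer does the slow database work. Orders that end up in needsCompensation are where a real system would release the reserved stock or schedule a later retry.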
Service Degradation (Circuit Breaker)
If a server crashes during the flash sale, a fallback service (e.g., using Hystrix) can return a friendly message instead of a hard error, preserving user experience.
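The fallback idea can be sketched without any library. The class below is a deliberately simplified circuit breaker in plain Java, not the Hystrix API: SimpleBreaker, its threshold and cooldown parameters, and the injected clock are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Simplified circuit breaker with a fallback (hypothetical; not the Hystrix API). */
final class SimpleBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;   // injected so the cooldown can be tested
    private int consecutiveFailures;
    private long openedAt = -1;         // -1 means the breaker is closed

    SimpleBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Runs primary unless the breaker is open; on failure or open state returns the fallback. */
    <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (isOpen()) return fallback.get();   // degrade immediately, sparing the failing service
        try {
            T result = primary.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            return fallback.get();             // friendly message instead of a hard error
        }
    }

    private synchronized boolean isOpen() {
        if (openedAt < 0) return false;
        if (clock.getAsLong() - openedAt >= openMillis) {
            openedAt = -1;                     // cooldown over: let traffic through again
            consecutiveFailures = 0;
            return false;
        }
        return true;
    }

    private synchronized void recordSuccess() { consecutiveFailures = 0; }

    private synchronized void recordFailure() {
        if (++consecutiveFailures >= failureThreshold) openedAt = clock.getAsLong();
    }
}
```

A caller might wrap the order service as breaker.call(() -> orderService.create(request), () -> "The sale is busy, please try again"), where orderService is whatever component performs the real work. Production libraries add features this sketch omits, such as timeouts, half-open trial requests, and metrics.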
Conclusion
The following flowchart illustrates the end‑to‑end flash‑sale process, capable of handling hundreds of thousands of concurrent requests. For traffic in the tens of millions, further scaling—such as database sharding, Kafka queues, and larger Redis clusters—would be required.