How to Build a Robust Flash‑Sale System that Handles Millions of Requests
This article examines the main challenges in designing a flash‑sale (秒杀, "seckill") system: overselling, high concurrency, request flooding, URL exposure, and database bottlenecks. It then presents a complete backend architecture that uses Redis, Nginx, rate limiting, asynchronous order processing, and service degradation to stay stable under high throughput.
Problems to Consider in Flash‑Sale Systems
Overselling
When inventory is limited (e.g., 100 items) but the system sells more (e.g., 200), the business suffers a loss; preventing overselling is the top priority.
High Concurrency
Flash sales last only a few minutes and attract massive traffic, which can overwhelm caches, cause cache breakdown, and crash databases if not properly handled.
Interface Abuse
Automated scripts can send hundreds of requests per second; the system must filter repeated or invalid requests.
Flash‑Sale URL Exposure
Users may discover the underlying API URL via browser tools and trigger purchases directly; the URL should be dynamic and hidden until the sale starts.
Database Isolation
Running flash‑sale traffic on the same database as other services risks cascading failures; a dedicated flash‑sale database isolates the impact.
Massive Request Volume
Even with caching, a single Redis instance (≈40 k QPS) may be insufficient for tens or hundreds of thousands of concurrent users, leading to cache penetration and DB overload.
Design and Technical Solutions
Flash‑Sale Database Schema
A separate database with at least two tables—flash‑sale orders and flash‑sale products—prevents the main site from being affected. Additional tables for product details and user information are also recommended.
Dynamic Flash‑Sale URL
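Generate the sale URL from the MD5 hash of a random string so that it cannot be guessed before the sale starts. The front‑end first requests the URL from the back‑end, and the back‑end validates the URL on the purchase request before allowing the purchase.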
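The flow above could be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the class `FlashSalePathService`, its method names, and the in-memory map (standing in for a shared store such as Redis) are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper that issues and validates per-user, per-product sale tokens.
final class FlashSalePathService {
    private final SecureRandom random = new SecureRandom();
    // In-memory stand-in for a shared store (assumption); key is "userId:productId".
    private final Map<String, String> issued = new ConcurrentHashMap<>();

    /** Called once the sale has started; returns the token to embed in the purchase URL. */
    String createPath(long userId, long productId) {
        byte[] salt = new byte[16];
        random.nextBytes(salt);
        String token = md5Hex(toHex(salt) + ":" + userId + ":" + productId);
        issued.put(userId + ":" + productId, token);
        return token;
    }

    /** Called by the purchase endpoint; a token is accepted at most once. */
    boolean validate(long userId, long productId, String token) {
        return token != null && issued.remove(userId + ":" + productId, token);
    }

    private static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return toHex(md.digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

MD5 serves here only to turn random bytes into a fixed-length path segment; the unpredictability comes from the random salt, and removing the token on first use keeps a leaked URL from being replayed.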
Static Page Rendering
Render product description, parameters, transaction records, images, and reviews into a static HTML page so the front‑end can serve content without hitting the back‑end or database, reducing server load. Technologies such as FreeMarker can be used.
Redis Cluster
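Switch from a single Redis node to a multi‑node deployment. Redis Cluster shards keys across nodes to raise total throughput, while Sentinel mode adds monitoring and automatic failover for a master‑replica setup to improve availability. Both reduce the risk that one overloaded or failed node lets traffic fall through to the database.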
Using Nginx
Deploy Nginx as a high‑performance reverse proxy; it can handle tens of thousands of concurrent connections and forward traffic to a Tomcat cluster, greatly increasing concurrency capacity.
SQL Optimization
Combine inventory check and decrement into a single UPDATE statement with optimistic locking (version field) to avoid double‑query overhead and reduce overselling risk.
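A single-statement decrement of this kind might look like the JDBC sketch below. The table name `flash_sale_product`, its columns, and the class `FlashSaleStockDao` are hypothetical and only illustrate the shape of the statement.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical DAO; table and column names are assumptions, not from the article.
final class FlashSaleStockDao {
    // The WHERE clause performs the stock check and the version check in the same
    // statement that decrements, so no separate SELECT is needed.
    static final String DECREMENT_SQL =
        "UPDATE flash_sale_product "
      + "SET stock = stock - 1, version = version + 1 "
      + "WHERE id = ? AND version = ? AND stock > 0";

    /** Returns true if exactly one row was updated, i.e. one unit was reserved. */
    static boolean tryDecrement(Connection conn, long productId, int expectedVersion)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(DECREMENT_SQL)) {
            ps.setLong(1, productId);
            ps.setInt(2, expectedVersion);
            return ps.executeUpdate() == 1;
        }
    }
}
```

An update count of 0 means either the stock ran out or another transaction changed the row first; the caller can then reject the request or re-read the version and retry. Note that the `stock > 0` condition alone already prevents the count from going negative, so under very high contention the version check mainly increases the number of failed attempts.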
Redis Pre‑Decrement
Before the sale starts, set the stock count in Redis. Each order atomically decrements the Redis key (using Lua scripts for atomicity). If an order is cancelled, the stock is incremented, ensuring consistency with the database.
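The check-and-decrement logic can be sketched as below. To keep the example runnable without a Redis server, an in-memory map of counters stands in for Redis (an assumption made purely for illustration); the Lua script constant shows what a client could send with `EVAL` to get the same behavior atomically on the server. The class name, key name, and return codes are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for the Redis stock counter (assumption for illustration).
final class PreDecrementStock {
    // Script a Redis client could run with EVAL; KEYS[1] is the stock key.
    // Returns -1 if the key is missing, 0 if sold out, 1 if one unit was reserved.
    static final String DECR_IF_POSITIVE_LUA =
        "local v = redis.call('GET', KEYS[1]) "
      + "if not v then return -1 end "
      + "if tonumber(v) <= 0 then return 0 end "
      + "redis.call('DECR', KEYS[1]) "
      + "return 1";

    private final Map<String, AtomicLong> store = new ConcurrentHashMap<>();

    /** Loads the stock count before the sale starts. */
    void preload(String key, long stock) {
        store.put(key, new AtomicLong(stock));
    }

    /** Mirrors the script: the check and the decrement happen as one atomic step. */
    int tryReserve(String key) {
        AtomicLong counter = store.get(key);
        if (counter == null) {
            return -1;
        }
        while (true) {
            long current = counter.get();
            if (current <= 0) {
                return 0;
            }
            if (counter.compareAndSet(current, current - 1)) {
                return 1;
            }
        }
    }

    /** Gives one unit back, for example when an order is cancelled. */
    void release(String key) {
        AtomicLong counter = store.get(key);
        if (counter != null) {
            counter.incrementAndGet();
        }
    }

    long remaining(String key) {
        AtomicLong counter = store.get(key);
        return counter == null ? 0L : counter.get();
    }
}
```

The important property is that the counter never drops below zero: a plain `DECR` followed by a separate check would let concurrent requests push it negative, which is why the check and decrement are combined.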
Rate Limiting
Implement multiple layers of rate limiting:
Front‑end throttling: disable the purchase button for a few seconds after a click.
Per‑user repeat request block: reject requests from the same user within a configurable window (e.g., 10 s) using Redis key expiration.
Token‑bucket algorithm: use Guava's RateLimiter to generate tokens at a fixed rate; only requests that acquire a token are processed. Example code:
import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Refill rate: 1 token per second
        RateLimiter limiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and returns the wait time in seconds
            double wait = limiter.acquire();
            System.out.println("Task " + i + " waited " + wait + " seconds");
        }
        System.out.println("Done");
    }
}
Try‑acquire with timeout: if a token cannot be obtained within a short timeout (e.g., 0.5 s), the request is discarded, so callers never wait long.
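import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        // Refill rate: 1 token per second
        RateLimiter limiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // Returns false right away if no token can be granted within 500 ms
            boolean ok = limiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            System.out.println("Task " + i + " allowed: " + ok);
            if (!ok) continue; // discard the request
            System.out.println("Task " + i + " executing");
        }
        System.out.println("End");
    }
}
Asynchronous Order Processing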
After passing rate limiting and stock validation, push the order request into a message queue (e.g., RabbitMQ). Consumers process orders asynchronously, decouple the front‑end from the database, and can send success notifications via SMS. Failed orders can be retried with compensation logic.
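The producer and consumer split can be sketched with a bounded in-process queue. This is only an illustration of the pattern: a `BlockingQueue` stands in for RabbitMQ, the list stands in for the database write and notification step, and every class and method name here is hypothetical.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// In-process stand-in for a message queue plus one consumer (assumption for illustration).
final class AsyncOrderPipeline {
    static final class OrderRequest {
        final long userId;
        final long productId;

        OrderRequest(long userId, long productId) {
            this.userId = userId;
            this.productId = productId;
        }
    }

    // Bounded queue: when it is full, submit() fails fast instead of piling up work.
    private final BlockingQueue<OrderRequest> queue = new LinkedBlockingQueue<>(10_000);
    private final List<OrderRequest> persisted = new CopyOnWriteArrayList<>();
    private final Thread consumer = new Thread(this::consumeLoop, "order-consumer");

    void start() {
        consumer.setDaemon(true);
        consumer.start();
    }

    void stop() {
        consumer.interrupt();
    }

    /** Request path: enqueue and return immediately; false means the queue is full. */
    boolean submit(OrderRequest request) {
        return queue.offer(request);
    }

    private void consumeLoop() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                OrderRequest request = queue.take();
                persisted.add(request); // stand-in for the order INSERT and the notification
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Demo helper: waits until at least n orders were handled or the timeout passes. */
    boolean awaitProcessed(int n, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (persisted.size() < n && System.nanoTime() < deadline) {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return persisted.size() >= n;
    }
}
```

With a real broker the queue would live outside the process and survive restarts; the sketch only shows that the request thread returns as soon as the order is enqueued, while the slower database work happens on the consumer side.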
Service Degradation
If a server crashes or a service becomes unavailable, fallback mechanisms such as Hystrix circuit breakers provide graceful degradation, returning user‑friendly messages instead of hard errors.
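To make the idea concrete, here is a small hand-rolled circuit breaker. It is not the Hystrix API; it is a simplified sketch of the same pattern, and the class name, thresholds, and fallback message are assumptions.

```java
import java.util.function.Supplier;

// Simplified circuit breaker (illustrative only; not Hystrix).
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /**
     * Runs the primary call unless the breaker is open; on failure or while open,
     * returns the fallback. Synchronized for simplicity, which serializes calls.
     */
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // short-circuit: do not touch the failing service
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0;
            open = false;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    /** True while the breaker is open and its cool-down window has not yet elapsed. */
    synchronized boolean isOpen() {
        return open && System.currentTimeMillis() - openedAt < openMillis;
    }
}
```

A caller might wrap the order service as `breaker.call(() -> orderService.place(req), () -> "The sale is busy, please try again")`, where `orderService` is a hypothetical dependency. Once the cool-down window passes, the next call is allowed through as a trial, and a single further failure reopens the breaker.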
Summary
This architecture can sustain hundreds of thousands of concurrent requests; for tens of millions, further scaling such as database sharding, Kafka queues, and larger Redis clusters would be required. The design demonstrates how to handle high concurrency, prevent overselling, and maintain system stability.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.