Design and Technical Solutions for a High‑Concurrency Flash Sale (秒杀) System
This article examines the challenges of building a robust flash‑sale (秒杀) system—such as overselling, high concurrency, request throttling, and database design—and presents a comprehensive backend architecture using Redis clusters, Nginx, token‑bucket rate limiting, asynchronous order processing, and other optimization techniques.
Flash‑sale (秒杀) systems like those on JD, Taobao, or Xiaomi attract massive traffic in a very short time, which raises critical questions about how to design a backend that can handle overselling, high concurrency, request abuse, and database pressure.
Key issues to consider include:
Overselling: limited stock (e.g., 100 items) must not be sold beyond availability.
High concurrency: millions of requests may arrive within minutes, risking cache breakdown and database overload.
Interface abuse: bots can repeatedly call the sale API, so request validation is essential.
Predictable URLs: exposing the sale URL allows users to bypass the front‑end; URLs should be dynamic and hidden until the sale starts.
Database isolation: the flash‑sale workload should not affect other business services.
Massive request volume: a single Redis instance may handle ~40k QPS, but flash sales can generate hundreds of thousands of QPS, requiring clustering.
Design and technical solutions:
2.1 Flash‑sale database design
A dedicated flash‑sale database isolates the high‑traffic workload. Two core tables are required: miaosha_order (order records) and miaosha_goods (product information). Additional tables for product details and user information can be linked via goods_id and user_id.
2.2 Dynamic flash‑sale URL
The sale URL is generated by MD5‑hashing a random string and is fetched from the backend only after the start time, preventing pre‑knowledge of the endpoint.
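The dynamic URL idea can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the class name FlashSalePath, the per-user binding of the token, and the in-memory map (standing in for a shared store such as Redis) are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: issues and verifies flash-sale path tokens.
final class FlashSalePath {
    // Assumption: in-memory stand-in for a shared store; key = userId:goodsId.
    private final Map<String, String> issued = new ConcurrentHashMap<>();
    private final Instant startTime;

    FlashSalePath(Instant startTime) {
        this.startTime = startTime;
    }

    /** Returns a path token, or null if the sale has not started yet. */
    String create(long userId, long goodsId, Instant now) {
        if (now.isBefore(startTime)) {
            return null; // the endpoint stays unknown before the start time
        }
        String token = md5Hex(UUID.randomUUID().toString());
        issued.put(userId + ":" + goodsId, token);
        return token;
    }

    /** Checks the token taken from the request URL against the issued one. */
    boolean verify(long userId, long goodsId, String token) {
        return token != null && token.equals(issued.get(userId + ":" + goodsId));
    }

    /** MD5 digest of the input, rendered as 32 lowercase hex characters. */
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

The order endpoint would then take the token as a path segment and call verify before doing any other work. MD5 serves here only to turn a random string into a fixed-length path segment, not as a security primitive.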
2.3 Page staticization
All product details (description, parameters, reviews, images) are rendered into a static HTML page using a template engine (e.g., FreeMarker), eliminating backend calls for each request.
2.4 Redis cluster
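Because flash‑sale traffic is read‑heavy, Redis is used as a cache in front of the database. A single instance is both a throughput ceiling and a single point of failure: if it goes down, every request falls through to the database. Deploying Redis with Sentinel adds automatic failover and read replicas, improving availability. Note that Sentinel does not shard data, so write load beyond one master's capacity calls for Redis Cluster.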
2.5 Nginx as a front‑end load balancer
Nginx can handle tens of thousands of concurrent connections, forwarding traffic to a Tomcat cluster, thereby greatly increasing overall concurrency.
2.6 SQL optimization
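Instead of a separate SELECT and UPDATE for stock deduction, a single UPDATE with optimistic‑lock versioning is used. The stock > 0 condition prevents overselling, and the version check makes the statement affect zero rows if another transaction modified the row first, so the version must be incremented on every successful update:
update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;
2.7 Redis pre‑decrement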
Before the sale starts, stock is pre‑loaded into Redis (e.g., redis.set(goodsId, 100)). Each successful order atomically decrements the Redis key; if the key reaches zero, further orders are rejected. Lua scripts can ensure atomicity when handling cancellations.
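The pre‑decrement flow can be sketched without a Redis client. In the illustration below an AtomicLong stands in for the Redis key and decrementAndGet() plays the role of the DECR command. The class and method names are hypothetical, and a real deployment would issue the equivalent commands against Redis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for Redis stock counters (assumption for illustration).
final class StockPreDecrement {
    private final Map<Long, AtomicLong> stock = new ConcurrentHashMap<>();

    /** Loads the stock before the sale starts, like redis.set(goodsId, quantity). */
    void preload(long goodsId, long quantity) {
        stock.put(goodsId, new AtomicLong(quantity));
    }

    /** Tries to reserve one unit; returns false once the stock is exhausted. */
    boolean tryReserve(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        if (counter == null) {
            return false; // unknown product
        }
        if (counter.decrementAndGet() < 0) {
            counter.incrementAndGet(); // undo the decrement that went below zero
            return false;
        }
        return true;
    }

    /** Returns one unit to the pool, e.g. when an order is cancelled. */
    void release(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        if (counter != null) {
            counter.incrementAndGet();
        }
    }

    /** Remaining stock, never reported below zero. */
    long remaining(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        return counter == null ? 0 : Math.max(0, counter.get());
    }
}
```

Only requests for which tryReserve returns true go on to the database, so the optimistic‑lock UPDATE sees at most as many attempts as there are items in stock.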
2.8 Rate limiting
Multiple layers of rate limiting are applied:
Frontend limit: disable the purchase button for a few seconds after a click.
Per‑user repeat limit: reject requests from the same user within a configurable window (e.g., 10 seconds) using Redis key expiration.
Token‑bucket algorithm: Guava’s RateLimiter produces tokens at a fixed rate; only requests that acquire a token are processed.
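The per‑user repeat limit can be illustrated with a small in‑memory sketch. A ConcurrentHashMap of timestamps stands in for Redis keys with an expiration time, and the class name and the explicit nowMillis parameter are assumptions chosen to make the example easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for "set a Redis key with an expiry per user" (assumption).
final class RepeatRequestLimiter {
    private final Map<Long, Long> lastAccepted = new ConcurrentHashMap<>();
    private final long windowMillis;

    RepeatRequestLimiter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Accepts at most one request per user within each window. */
    boolean tryAccept(long userId, long nowMillis) {
        boolean[] accepted = {false};
        // compute() runs atomically per key, so concurrent calls for the
        // same user cannot both be accepted.
        lastAccepted.compute(userId, (id, last) -> {
            if (last == null || nowMillis - last >= windowMillis) {
                accepted[0] = true;
                return nowMillis; // start a new window
            }
            return last; // still inside the previous window
        });
        return accepted[0];
    }
}
```

With a 10‑second window, a second request from the same user 5 seconds after the first is rejected, while a request from a different user is unaffected.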
Example of a simple token‑bucket limiter:
import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // 1 token per second
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and returns the seconds spent waiting
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time " + waitTime);
        }
        System.out.println("Finished");
    }
}
A variant with a timeout uses tryAcquire to discard tasks that cannot obtain a token within 0.5 seconds:
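import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        // 1 token per second
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // Wait at most 500 ms for a token; the timeout is a whole number,
            // so half a second is expressed in milliseconds.
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            System.out.println("Task " + i + " valid: " + isValid);
            if (!isValid) continue; // no token in time: drop the task
            System.out.println("Task " + i + " executing");
        }
        System.out.println("End");
    }
}
2.10 Asynchronous order processing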
After passing rate limiting and stock checks, orders are placed onto a message queue (e.g., RabbitMQ). Consumers process the orders asynchronously, providing peak‑shaving, decoupling, and reliability. Successful orders can trigger SMS notifications; failures can be retried via compensation mechanisms.
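The queue‑based flow can be sketched with the JDK alone. In this illustration a bounded LinkedBlockingQueue stands in for the message broker and a single thread stands in for the consumer service. The class and method names are hypothetical, and a real system would publish to RabbitMQ and write order rows in the consumer.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// In-process stand-in for "publish to a queue, consume asynchronously".
final class AsyncOrderPipeline {
    static final class OrderRequest {
        final long userId;
        final long goodsId;

        OrderRequest(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    // Sentinel message that tells the consumer to stop.
    private static final OrderRequest STOP = new OrderRequest(-1, -1);

    private final BlockingQueue<OrderRequest> queue;
    private final List<OrderRequest> persisted = new CopyOnWriteArrayList<>();
    private final Thread consumer = new Thread(this::consumeLoop, "order-consumer");

    AsyncOrderPipeline(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Request thread: enqueue and return at once; false means the queue is full. */
    boolean submit(long userId, long goodsId) {
        return queue.offer(new OrderRequest(userId, goodsId));
    }

    void start() {
        consumer.start();
    }

    /** Lets the consumer finish everything already queued, then stops it. */
    void stopAndWait() {
        try {
            queue.put(STOP);
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int persistedCount() {
        return persisted.size();
    }

    private void consumeLoop() {
        try {
            while (true) {
                OrderRequest request = queue.take();
                if (request == STOP) {
                    return;
                }
                persisted.add(request); // a real consumer would insert the order row here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The bounded queue is what provides peak shaving in this sketch: when it is full, submit returns false and the request can be answered with a "sold out or busy" response instead of piling more work onto the database.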
2.11 Service degradation
If a node crashes during the flash‑sale, a fallback service (e.g., powered by Hystrix) returns a friendly message instead of a hard error, ensuring a graceful degradation of user experience.
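The fallback idea can be shown in plain Java. The sketch below is not Hystrix and omits circuit breaking, timeouts, and thread isolation. It only illustrates returning a friendly response when the primary call fails, and the class and method names are hypothetical.

```java
import java.util.function.Supplier;

// Minimal fallback wrapper (illustrative only, not the Hystrix API).
final class Fallbacks {
    private Fallbacks() {
    }

    /** Runs the primary call; if it throws, returns the fallback result instead. */
    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.get();
        }
    }
}
```

A call such as Fallbacks.callWithFallback(() -> orderService.place(request), () -> "The sale is very busy, please try again") would return the friendly message whenever the order service throws, where orderService is a placeholder for the real dependency.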
Conclusion
The presented architecture combines dedicated databases, Redis clustering, Nginx load balancing, token‑bucket rate limiting, static pages, and asynchronous processing, and is intended to support hundreds of thousands of concurrent requests. For even larger scales (tens of millions), further techniques such as sharding, Kafka queues, and larger Redis clusters would be required.
By thoughtfully addressing overselling, concurrency, and reliability, developers can build a flash‑sale system that is both performant and resilient.
Java Captain
Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.