Design and Technical Solutions for a High‑Concurrency Flash Sale (秒杀) System

This article examines the challenges of building a robust flash‑sale (秒杀) system—such as overselling, high concurrency, request throttling, and database design—and presents a comprehensive backend architecture using Redis clusters, Nginx, token‑bucket rate limiting, asynchronous order processing, and other optimization techniques.

Java Captain

Flash‑sale (秒杀) systems like those on JD, Taobao, or Xiaomi attract massive traffic in a very short time, which raises critical questions about how to design a backend that can handle overselling, high concurrency, request abuse, and database pressure.

Key issues to consider include:

Overselling: limited stock (e.g., 100 items) must not be sold beyond availability.

High concurrency: millions of requests may arrive within minutes, risking cache breakdown and database overload.

Interface abuse: bots can repeatedly call the sale API, so request validation is essential.

Predictable URLs: exposing the sale URL allows users to bypass the front‑end; URLs should be dynamic and hidden until the sale starts.

Database isolation: the flash‑sale workload should not affect other business services.

Massive request volume: a single Redis instance may handle ~40k QPS, but flash sales can generate hundreds of thousands of QPS, requiring clustering.

Design and technical solutions:

2.1 Flash‑sale database design

A dedicated flash‑sale database isolates the high‑traffic workload. Two core tables are required: miaosha_order (order records) and miaosha_goods (product information). Additional tables for product details and user information can be linked via goods_id and user_id.

2.2 Dynamic flash‑sale URL

The sale URL is generated by MD5‑hashing a random string and is fetched from the backend only after the start time, preventing pre‑knowledge of the endpoint.
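
As a rough illustration of the idea above, the following sketch issues an MD5-based path segment only once the sale has started and checks it again when the order request arrives. The class name DynamicSaleUrl, the per-user key, the in-memory map, and the /miaosha/{path}/order layout are assumptions made for this example, not details from the original design; a real deployment would more likely keep the tokens in Redis with an expiry.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class DynamicSaleUrl {
    // Hypothetical in-memory token store, keyed by "userId:goodsId".
    private final Map<String, String> tokens = new ConcurrentHashMap<>();
    private final long startMillis;

    DynamicSaleUrl(long startMillis) {
        this.startMillis = startMillis;
    }

    /** Returns the lowercase 32-character hex MD5 digest of the input. */
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the hidden path segment, or null while the sale has not started. */
    String issuePath(long userId, long goodsId, long nowMillis) {
        if (nowMillis < startMillis) {
            return null;
        }
        String token = md5Hex(UUID.randomUUID().toString()); // hash of a random string
        tokens.put(userId + ":" + goodsId, token);
        return token;
    }

    /** Checks that the path in the order request is the one issued to this user. */
    boolean verifyPath(long userId, long goodsId, String path) {
        return path != null && path.equals(tokens.get(userId + ":" + goodsId));
    }

    public static void main(String[] args) {
        DynamicSaleUrl urls = new DynamicSaleUrl(1_000L);
        System.out.println(urls.issuePath(42L, 7L, 999L));      // null: sale not started
        String path = urls.issuePath(42L, 7L, 1_000L);
        System.out.println("/miaosha/" + path + "/order");
        System.out.println(urls.verifyPath(42L, 7L, path));     // true
        System.out.println(urls.verifyPath(42L, 7L, "guess"));  // false
    }
}
```

Because the path segment is random and only handed out after the start time, a script that hard-codes the order endpoint in advance has nothing valid to call.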

2.3 Page staticization

All product details (description, parameters, reviews, images) are rendered into a static HTML page using a template engine (e.g., FreeMarker), eliminating backend calls for each request.

2.4 Redis cluster

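Because flash-sale traffic is read-heavy, Redis is used as a cache in front of the database. To keep a cache failure from sending that traffic straight to the database (cache breakdown), Redis is deployed with Sentinel, that is, a master with replicas and automatic failover, which improves both availability and read performance.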

2.5 Nginx as a front‑end load balancer

Nginx can handle tens of thousands of concurrent connections, forwarding traffic to a Tomcat cluster, thereby greatly increasing overall concurrency.

2.6 SQL optimization

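Instead of reading the stock and then writing the new value back, stock is deducted with a single conditional UPDATE guarded by an optimistic-lock version column. The statement affects zero rows when the stock has run out or when another transaction changed the row first, and it must increment the version on every successful write so that later conflicts are detected:

update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;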
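
To show how the affected-row count of such a statement is typically used, here is a minimal in-memory sketch. GoodsRow is a hypothetical stand-in for one miaosha_goods row, not real database access: its update method applies the same conditions as the WHERE clause, bumps the version on success, and returns 0 or 1 like an affected-row count. The retry loop in buy is one possible policy and is an assumption of this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class OptimisticStockDemo {
    /** In-memory stand-in for one miaosha_goods row (illustration only). */
    static final class GoodsRow {
        private int stock;
        private int version;

        GoodsRow(int stock) {
            this.stock = stock;
        }

        /** Emulates reading the current stock and version. */
        synchronized int[] read() {
            return new int[] {stock, version};
        }

        /** Emulates the conditional UPDATE; returns the affected-row count (0 or 1). */
        synchronized int update(int expectedVersion) {
            if (version != expectedVersion || stock <= 0) {
                return 0;
            }
            stock--;
            version++; // bump the version so stale writers are rejected
            return 1;
        }
    }

    /** Buys one item, retrying when another buyer changed the row first. */
    static boolean buy(GoodsRow row) {
        while (true) {
            int[] snapshot = row.read();
            if (snapshot[0] <= 0) {
                return false; // sold out
            }
            if (row.update(snapshot[1]) == 1) {
                return true; // our version was still current
            }
        }
    }

    /** Runs the given number of concurrent buyers and returns how many succeeded. */
    static int simulate(int stock, int buyers) {
        GoodsRow row = new GoodsRow(stock);
        AtomicInteger sold = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < buyers; i++) {
            pool.execute(() -> {
                if (buy(row)) {
                    sold.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulation timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sold.get();
    }

    public static void main(String[] args) {
        System.out.println("sold = " + simulate(10, 100)); // prints "sold = 10"
    }
}
```

With 100 buyers competing for 10 items, exactly 10 succeed: a buyer whose version is stale gets 0 affected rows and re-reads, and once stock reaches zero every remaining buyer is turned away.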

2.7 Redis pre‑decrement

Before the sale starts, stock is pre‑loaded into Redis (e.g., redis.set(goodsId, 100)). Each successful order atomically decrements the Redis key; if the key reaches zero, further orders are rejected. Lua scripts can ensure atomicity when handling cancellations.
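
The decrement-and-check pattern described above can be sketched without a Redis server by using an in-memory counter as a stand-in. The class StockPreDecrement and its method names are hypothetical. With a real Redis the same steps would map roughly to SET for preloading, DECR for reserving, and INCR for giving a unit back, and the decrement-then-undo pair could be folded into one Lua script.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class StockPreDecrement {
    // Stand-in for Redis: goodsId -> remaining stock counter.
    private final Map<Long, AtomicLong> stock = new ConcurrentHashMap<>();

    /** Loads the stock before the sale starts (SET in Redis terms). */
    void preload(long goodsId, long quantity) {
        stock.put(goodsId, new AtomicLong(quantity));
    }

    /** Reserves one unit; returns false when sold out or the item is unknown. */
    boolean tryReserve(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        if (counter == null) {
            return false;
        }
        if (counter.decrementAndGet() < 0) { // atomic decrement, like DECR
            counter.incrementAndGet();       // undo the overshoot
            return false;
        }
        return true;
    }

    /** Gives one unit back, for example when an order is cancelled. */
    void release(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        if (counter != null) {
            counter.incrementAndGet();
        }
    }

    /** Remaining stock; clamped because the counter can dip below zero briefly. */
    long remaining(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        return counter == null ? 0 : Math.max(0, counter.get());
    }

    public static void main(String[] args) {
        StockPreDecrement cache = new StockPreDecrement();
        cache.preload(7L, 2);
        System.out.println(cache.tryReserve(7L)); // true
        System.out.println(cache.tryReserve(7L)); // true
        System.out.println(cache.tryReserve(7L)); // false: sold out
        cache.release(7L);                        // a cancellation returns one unit
        System.out.println(cache.tryReserve(7L)); // true
    }
}
```

Requests rejected at this stage never reach the database, so only about as many requests as there are items go on to the order step.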

2.8 Rate limiting

Multiple layers of rate limiting are applied:

Frontend limit: disable the purchase button for a few seconds after a click.

Per‑user repeat limit: reject requests from the same user within a configurable window (e.g., 10 seconds) using Redis key expiration.

Token‑bucket algorithm: Guava’s RateLimiter produces tokens at a fixed rate; only requests that acquire a token are processed.

Example of a simple token‑bucket limiter:

import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Issue 1 token (permit) per second.
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and returns the seconds spent waiting.
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time " + waitTime);
        }
        System.out.println("Finished");
    }
}

A variant with a timeout uses tryAcquire to discard tasks that cannot obtain a token within 0.5 seconds:

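import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        // Issue 1 token (permit) per second.
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // Wait at most 500 ms for a token. The timeout is given in milliseconds
            // because casting 0.5 to long would truncate it to 0 seconds.
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            System.out.println("Task " + i + " valid: " + isValid);
            if (!isValid) continue; // no token in time: discard the task
            System.out.println("Task " + i + " executing");
        }
        System.out.println("End");
    }
}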
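
The per-user repeat limit from the list above can be sketched in the same spirit. RepeatRequestGuard is a hypothetical in-memory stand-in that stores an expiry timestamp per user. With Redis, the usual equivalent is a single SET key value NX EX 10 call, where a failed NX means the user is still inside the window. The current time is passed in explicitly only to keep the example deterministic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class RepeatRequestGuard {
    // userId -> time (in ms) at which the user's current window expires.
    private final ConcurrentMap<Long, Long> expiresAt = new ConcurrentHashMap<>();
    private final long windowMillis;

    RepeatRequestGuard(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request may proceed, false if the user is still blocked. */
    boolean allow(long userId, long nowMillis) {
        boolean[] allowed = {false};
        // compute() runs atomically per key, so two racing requests cannot both pass.
        expiresAt.compute(userId, (id, expiry) -> {
            if (expiry == null || nowMillis >= expiry) {
                allowed[0] = true;
                return nowMillis + windowMillis; // start a new window
            }
            return expiry; // still inside the window: keep the old expiry
        });
        return allowed[0];
    }

    public static void main(String[] args) {
        RepeatRequestGuard guard = new RepeatRequestGuard(10_000L); // 10-second window
        System.out.println(guard.allow(1L, 0L));      // true: first request
        System.out.println(guard.allow(1L, 5_000L));  // false: repeated within 10 s
        System.out.println(guard.allow(2L, 5_000L));  // true: a different user
        System.out.println(guard.allow(1L, 10_000L)); // true: window has expired
    }
}
```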

2.10 Asynchronous order processing

After passing rate limiting and stock checks, orders are placed onto a message queue (e.g., RabbitMQ). Consumers process the orders asynchronously, providing peak‑shaving, decoupling, and reliability. Successful orders can trigger SMS notifications; failures can be retried via compensation mechanisms.
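
The hand-off between the request path and the order consumer can be illustrated with a bounded in-process queue. This is only a stand-in: java.util.concurrent.BlockingQueue plays the role of the message broker here, and AsyncOrderDemo, OrderMessage, and the "order:userId:goodsId" strings are hypothetical names for this sketch. A broker such as RabbitMQ adds persistence, acknowledgements, and delivery across processes, none of which this example attempts to show.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

final class AsyncOrderDemo {
    /** Minimal order message: who bought what. */
    static final class OrderMessage {
        final long userId;
        final long goodsId;

        OrderMessage(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    // Sentinel message that tells the consumer to stop.
    private static final OrderMessage STOP = new OrderMessage(-1L, -1L);

    /** Enqueues the given number of orders and returns what the consumer created. */
    static List<String> run(int orders) {
        BlockingQueue<OrderMessage> queue = new LinkedBlockingQueue<>(1_000); // bounded buffer
        List<String> created = new CopyOnWriteArrayList<>();

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    OrderMessage message = queue.take();
                    if (message == STOP) {
                        return;
                    }
                    // Stands in for writing the order row to the database.
                    created.add("order:" + message.userId + ":" + message.goodsId);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (long userId = 1; userId <= orders; userId++) {
                // The request thread only enqueues; it does not wait for the database.
                queue.put(new OrderMessage(userId, 7L));
            }
            queue.put(STOP);
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return created;
    }

    public static void main(String[] args) {
        System.out.println(run(3)); // prints [order:1:7, order:2:7, order:3:7]
    }
}
```

The bounded queue is what provides the peak shaving: producers can burst, while the consumer drains at whatever rate the database can sustain.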

2.11 Service degradation

If a node crashes during the flash‑sale, a fallback service (e.g., powered by Hystrix) returns a friendly message instead of a hard error, ensuring a graceful degradation of user experience.
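
The fallback idea can be shown with the standard library alone. The sketch below is not Hystrix and does not reproduce its circuit breaker. It only demonstrates the core behavior, under the assumption that a failed or slow call should be replaced by a friendly default. FallbackDemo and callWithFallback are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class FallbackDemo {
    /** Runs the primary call; returns the fallback if it fails or exceeds the timeout. */
    static <T> T callWithFallback(Supplier<T> primary, T fallback, long timeoutMillis) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(primary);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException | TimeoutException e) {
            future.cancel(true); // stop waiting for the failed or slow call
            return fallback;
        }
    }

    public static void main(String[] args) {
        String busy = "The sale is very busy right now, please try again shortly.";

        // Healthy service: the real result is returned.
        System.out.println(callWithFallback(() -> "Order accepted", busy, 1_000L));

        // Crashed service: the friendly message is returned instead of an error.
        System.out.println(callWithFallback(() -> {
            throw new IllegalStateException("node down");
        }, busy, 1_000L));
    }
}
```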

Conclusion

The architecture presented here combines dedicated databases, Redis clustering, Nginx load balancing, token-bucket rate limiting, static pages, and asynchronous processing, and is designed to handle hundreds of thousands of concurrent requests. At larger scales (tens of millions), further techniques such as sharding, Kafka queues, and larger Redis clusters would be required.

By thoughtfully addressing overselling, concurrency, and reliability, developers can build a flash‑sale system that is both performant and resilient.
