
Designing a High-Concurrency Flash Sale (秒杀) System

This article examines the challenges of building a robust flash‑sale (秒杀) system and presents a comprehensive backend design—including database schema, dynamic URLs, static pages, Redis clustering, Nginx load balancing, SQL optimization, rate‑limiting, asynchronous order processing, and service degradation—to handle extreme high‑concurrency traffic.


Flash‑sale (秒杀) systems like those on JD, Taobao, or Xiaomi attract massive traffic in a very short time, raising critical questions about how to implement a reliable backend that prevents overselling, handles high concurrency, blocks automated attacks, and isolates the sale from other services.

Key problems to consider: overselling when stock is limited, sudden spikes of concurrent requests, anti‑scraping measures, exposure of the sale URL, database contention, and the sheer volume of requests that can overwhelm a single server.

Database design: a dedicated flash‑sale database is recommended, typically with two core tables— miaosha_order for orders and miaosha_goods for product information—plus auxiliary tables for product details and user information.

Dynamic URL strategy: generate the sale URL by hashing a random string (e.g., using MD5) so that the URL is unknown until the sale starts, preventing pre‑emptive requests.
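As a sketch of this idea, the server can generate the token only when the sale opens, so clients cannot compute the path in advance. The class and method names below are illustrative, not from the original article:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class MiaoshaUrl {
    // Hypothetical helper: build an unguessable path segment that the server
    // only hands out once the sale starts, e.g. /miaosha/{token}/do_miaosha.
    public static String createToken(long userId, long goodsId) {
        try {
            String salt = UUID.randomUUID().toString();              // random salt per request
            String raw = userId + "_" + goodsId + "_" + salt;
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(raw.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();                                   // 32 hex characters
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);                      // MD5 is always available
        }
    }
}
```

The server stores the token (e.g., in Redis with a short TTL) and validates it when the buy request arrives; a request with a missing or stale token is rejected before any stock logic runs.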

Page staticization: render product details, images, and reviews into a static HTML page using a template engine such as FreeMarker, allowing the front‑end to serve content without hitting the backend or database.

Redis clustering: move from a single Redis instance to a Sentinel‑managed cluster to improve availability and handle the high QPS (tens of thousands) typical of flash sales.
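For illustration, a minimal Sentinel configuration monitoring a single master might look like the fragment below; the master name, addresses, and timeouts are placeholder values, not a recommendation:

```
# sentinel.conf (illustrative values)
sentinel monitor miaosha-master 192.168.1.10 6379 2
sentinel down-after-milliseconds miaosha-master 5000
sentinel failover-timeout miaosha-master 60000
sentinel parallel-syncs miaosha-master 1
```

With a quorum of 2, at least two Sentinels must agree the master is down before a failover is triggered, which guards against a single Sentinel's network partition causing a spurious promotion.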

NGINX front‑end: place NGINX in front of Tomcat clusters to leverage its ability to handle tens of thousands of concurrent connections, distributing traffic efficiently.

SQL optimization: combine stock check and decrement into a single statement to avoid race conditions. Example:

update miaosha_goods set stock = stock-1 where goods_id = #{goods_id} and version = #{version} and stock > 0;

This uses an optimistic lock (version) to ensure consistency without the overhead of a pessimistic lock.

Redis pre‑decrement: load the initial stock into Redis when the sale is published (e.g., redis.set(goodsId, 100) ) and decrement it on each request, so that most traffic is rejected in memory and only successful decrements reach the database. Note that a naive read‑then‑check is racy:

Integer stock = (Integer) redis.get(goodsId);

Two concurrent requests can both read a positive stock and both proceed. Atomicity of the combined check‑and‑decrement can be guaranteed by running it as a Lua script (via EVAL), since Redis executes scripts single‑threaded.
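To make the atomicity requirement concrete, here is a plain‑Java simulation of the check‑then‑decrement that the Lua script would perform inside Redis. This is an illustrative sketch, not the Redis client API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simulates atomic "check stock > 0, then decrement" using compare-and-set.
// In Redis, this whole loop collapses into a single Lua EVAL, because Redis
// runs scripts on one thread and no other command can interleave.
public class StockDecrement {
    private final AtomicInteger stock;

    public StockDecrement(int initialStock) {
        this.stock = new AtomicInteger(initialStock);
    }

    // Returns true if one unit of stock was claimed, false if sold out.
    public boolean tryDecrement() {
        while (true) {
            int current = stock.get();
            if (current <= 0) return false;                       // sold out: reject fast
            if (stock.compareAndSet(current, current - 1)) return true;
            // CAS failed: another request decremented first; retry with fresh value
        }
    }
}
```

The key property is that the check and the decrement happen as one indivisible step, so stock can never go negative no matter how many requests race.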

Rate limiting: multiple layers are applied:

Front‑end button disabling for a few seconds after click.

Per‑user repeat‑request blocking (e.g., reject requests from the same user within 10 seconds using Redis key expiration).

Token‑bucket rate limiting using Guava's RateLimiter. Example:

import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        final RateLimiter rateLimiter = RateLimiter.create(1); // 1 token per second
        for (int i = 0; i < 10; i++) {
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time: " + waitTime);
        }
        System.out.println("Done");
    }
}

A second example shows non‑blocking attempts with a timeout:

import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // (long) 0.5 truncates to 0, so use milliseconds for a sub-second timeout
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            System.out.println("Task " + i + " valid: " + isValid);
            if (!isValid) continue;
            System.out.println("Task " + i + " executing");
        }
        System.out.println("End");
    }
}

These snippets illustrate how token acquisition can be used to reject excess requests quickly.
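The per‑user repeat‑request layer mentioned above relies on Redis key expiration (the SET key NX EX pattern). The in‑memory sketch below simulates that behavior with a map of expiry timestamps; all names are illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory sketch of the Redis pattern "SET user:{id}:miaosha 1 NX EX 10":
// after a first accepted request, the same user is rejected until the
// window expires.
public class RepeatRequestGuard {
    private final ConcurrentHashMap<Long, Long> expiry = new ConcurrentHashMap<>();
    private final long windowMillis;

    public RepeatRequestGuard(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is allowed; nowMillis is injected for testability.
    public boolean allow(long userId, long nowMillis) {
        final boolean[] accepted = {false};
        // compute() is atomic per key, mirroring Redis's single-threaded SET NX
        expiry.compute(userId, (id, until) -> {
            if (until != null && nowMillis < until) return until; // still blocked
            accepted[0] = true;
            return nowMillis + windowMillis;                      // start a new window
        });
        return accepted[0];
    }
}
```

In production, Redis is preferred over an in-process map because the block must hold across all application servers behind the load balancer, and key expiration cleans up state automatically.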

Asynchronous order processing: after passing rate limiting and stock checks, place the order request onto a message queue (e.g., RabbitMQ). Consumers handle the actual order creation, allowing the front‑end to return immediately and improving throughput. Failure handling can be done via compensation/retry mechanisms.
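The enqueue‑then‑consume flow can be sketched with an in‑process BlockingQueue standing in for RabbitMQ (the broker is swapped out here purely for brevity; class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class AsyncOrderProcessor {
    private final BlockingQueue<Long> queue = new LinkedBlockingQueue<>();

    // Web tier: enqueue and return immediately ("request queued, order pending").
    public void submit(long goodsId) {
        queue.offer(goodsId);
    }

    // Consumer tier: drain up to max requests and create the actual orders.
    public List<Long> consume(int max) {
        List<Long> created = new ArrayList<>();
        for (int i = 0; i < max; i++) {
            Long goodsId;
            try {
                goodsId = queue.poll(100, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            if (goodsId == null) break;        // queue drained
            created.add(goodsId);              // real code: insert into miaosha_order,
                                               // with retry/compensation on failure
        }
        return created;
    }
}
```

The front end then polls (or receives a push) for the final order status, which is why flash-sale pages typically show "queuing" before confirming success.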

Service degradation: in case of server failures, employ circuit‑breaker patterns (e.g., Hystrix) to provide graceful fallback messages instead of hard errors.
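A minimal hand‑rolled breaker illustrates the pattern (Hystrix adds half‑open probing, time windows, and thread isolation on top of this; the sketch below only shows the open/closed core):

```java
import java.util.function.Supplier;

// After `threshold` consecutive failures the circuit opens and callers get
// the fallback message instead of hitting the failing service again.
public class SimpleCircuitBreaker {
    private final int threshold;
    private int consecutiveFailures = 0;

    public SimpleCircuitBreaker(int threshold) {
        this.threshold = threshold;
    }

    public synchronized String call(Supplier<String> service, String fallback) {
        if (consecutiveFailures >= threshold) {
            return fallback;                   // circuit open: fail fast, skip the call
        }
        try {
            String result = service.get();
            consecutiveFailures = 0;           // success closes the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;             // count the failure toward the threshold
            return fallback;
        }
    }
}
```

Failing fast matters in a flash sale: without a breaker, every queued request keeps waiting on the dead dependency, exhausting threads and dragging down otherwise healthy services.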

Conclusion: The presented architecture—dedicated database, dynamic URLs, static pages, Redis cluster, NGINX load balancing, optimized SQL, multi‑layer rate limiting, asynchronous queuing, and circuit breaking—can sustain tens of thousands of concurrent requests. For traffic at the hundred‑million level, further scaling such as sharding, Kafka queues, and larger Redis clusters would be required.

Tags: backend, Redis, high concurrency, rate limiting, flash sale
Written by

Selected Java Interview Questions

A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
