How to Build a High‑Performance Flash Sale System: Architecture & Code

This article examines the key challenges of flash‑sale (秒杀) systems—overselling, massive concurrency, URL exposure, and database coupling—and presents a complete backend design featuring dedicated databases, dynamic URLs, static pages, Redis clustering, Nginx proxy, rate‑limiting, token‑bucket control, asynchronous order queues, and service degradation strategies.

Programmer DD

Introduction

Flash sale (秒杀) systems like those on JD, Taobao, and Xiaomi are common. This article explores how to design a flash‑sale backend, the problems to consider, and a practical solution.

Problems to consider

1.1 Overselling

When inventory is limited (e.g., 100 items) but sales exceed stock, the business suffers. Preventing oversell is the top priority.

1.2 High concurrency

Flash sales last only a few minutes and attract massive traffic. The backend must avoid cache breakdown, database overload, and other bottlenecks.

1.3 Interface abuse

Automated tools can send hundreds of requests per second. The system must filter repeated or invalid requests.

1.4 Flash‑sale URL exposure

Users may discover the sale URL via browser dev tools and trigger the sale early. The URL should be hidden or generated dynamically.

1.5 Database coupling

Running the flash‑sale workload on the same database as other services risks cascading failures. A dedicated database isolates the impact.

1.6 Massive request volume

Even with caching, a single Redis instance may not handle tens of thousands of QPS. The design must scale beyond a single node.

Design and technical solutions

2.1 Flash‑sale database

Separate two tables—flash‑sale order and flash‑sale product—from the main database. Additional tables for product details and user information can be added as needed.

2.2 Dynamic flash‑sale URL

Generate the sale URL by MD5‑hashing a random string. The front end obtains the URL from the back end after validation.
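One way this could look is sketched below. It assumes the path segment is the MD5 of a random value combined with the user and product IDs, and that the server remembers the issued value so it can validate the order request later. The class name SeckillPathService and the in-memory map (a stand-in for a shared cache such as Redis) are illustrative assumptions, not part of the original design.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: issues a per-user, per-product path token and validates it later.
class SeckillPathService {
    // Stand-in for a shared cache (assumption); key = "userId:goodsId".
    private final Map<String, String> issued = new ConcurrentHashMap<>();

    // Returns the lowercase hex MD5 digest of the input string.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Called after the back end has validated the user and the sale window.
    String createPath(long userId, long goodsId) {
        String path = md5Hex(UUID.randomUUID() + ":" + userId + ":" + goodsId);
        issued.put(userId + ":" + goodsId, path);
        return path;
    }

    // Called by the order endpoint: the path must match what was issued to this user.
    boolean checkPath(long userId, long goodsId, String path) {
        return path != null && path.equals(issued.get(userId + ":" + goodsId));
    }
}
```

Because the random part changes on every call, the path cannot be guessed ahead of time, and a path issued to one user is rejected for another.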

2.3 Page staticization

Render product details, parameters, transaction records, images, and reviews into a static page so the front end can serve content without hitting the back‑end services.

2.4 Redis cluster

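Use Redis Sentinel or Cluster mode to improve availability and performance: Sentinel provides automatic failover, and Cluster mode shards keys across nodes so that no single instance has to absorb the full request volume. This lowers the risk that a cache outage sends all traffic through to the database.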

2.5 Nginx as front‑end proxy

Nginx can handle tens of thousands of concurrent connections and forward requests to a Tomcat cluster, greatly increasing throughput.

2.6 SQL simplification

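Combine the inventory check and the decrement into a single UPDATE statement so that no separate SELECT is needed. The stock > 0 condition alone prevents overselling; if a version column is also used for optimistic locking, the statement must increment it, otherwise the version check never changes anything.

update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;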

2.7 Redis pre‑decrement

Store the initial stock in Redis. Each successful order decrements the Redis value; cancellations increment it back, using Lua scripts for atomicity.
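The check-and-decrement semantics can be sketched as follows. This is an in-process stand-in, not a Redis client: an AtomicInteger plays the role of the Redis counter, and the Lua text is only shown as a string that a client could send with EVAL. The class name StockPreDecrement and the script itself are illustrative assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

// In-memory model of "decrement only if stock is positive" (assumption: one counter per product).
class StockPreDecrement {
    // A Lua script with the same semantics, for use with Redis EVAL; it is not executed here.
    static final String DECR_IF_POSITIVE_LUA =
            "local stock = tonumber(redis.call('GET', KEYS[1]))\n"
          + "if stock and stock > 0 then\n"
          + "  redis.call('DECR', KEYS[1])\n"
          + "  return 1\n"
          + "end\n"
          + "return 0";

    private final AtomicInteger stock;

    StockPreDecrement(int initialStock) {
        this.stock = new AtomicInteger(initialStock);
    }

    // Atomically reserves one unit; returns false once the stock is exhausted.
    boolean tryReserve() {
        while (true) {
            int current = stock.get();
            if (current <= 0) {
                return false;
            }
            if (stock.compareAndSet(current, current - 1)) {
                return true;
            }
        }
    }

    // Gives one unit back, for example after an order is cancelled.
    void release() {
        stock.incrementAndGet();
    }

    int remaining() {
        return stock.get();
    }
}
```

The point of doing the check and the decrement in one atomic step is that the counter can never drop below zero, so requests that arrive after the stock is gone are turned away before they reach the database.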

2.8 Interface rate limiting

2.8.1 Front‑end throttling

Disable the purchase button for a few seconds after a click.

2.8.2 Per‑user repeat request limit

Reject requests from the same user within a configurable time window (e.g., 10 seconds) using Redis key expiration.
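The idea can be sketched without a Redis dependency as shown below. The map stands in for Redis keys with a TTL; with Redis itself the same effect is commonly achieved with SET key value NX EX 10, which succeeds only if the key does not exist yet. The class name RepeatRequestGuard and the injectable clock are assumptions made so the example can run on its own.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// In-memory model of "one request per user per time window".
class RepeatRequestGuard {
    // userId -> time (ms) at which the user's window expires.
    // Unlike Redis keys with a TTL, entries here are never evicted (fine for a sketch).
    private final Map<Long, Long> expiresAt = new ConcurrentHashMap<>();
    private final long windowMillis;
    private final LongSupplier clock;

    RepeatRequestGuard(long windowMillis, LongSupplier clock) {
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    // Returns true if the request is accepted and opens a new window for the user.
    boolean allow(long userId) {
        long now = clock.getAsLong();
        boolean[] allowed = {false};
        // compute() runs atomically per key, so two concurrent requests cannot both pass.
        expiresAt.compute(userId, (id, expiry) -> {
            if (expiry == null || expiry <= now) {
                allowed[0] = true;
                return now + windowMillis;
            }
            return expiry;
        });
        return allowed[0];
    }
}
```

In production the clock would be System::currentTimeMillis; the test clock here only makes the window boundaries easy to check.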

2.9 Token‑bucket algorithm

Use Guava’s RateLimiter to allow only a fixed number of requests per second.

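import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Issue 1 permit (token) per second.
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a permit is available and returns
            // the time spent waiting, in seconds.
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time " + waitTime);
        }
        System.out.println("Done");
    }
}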

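A second example uses tryAcquire with a timeout. The call returns false at once if a permit would not become available within 0.5 seconds, so excess requests are rejected instead of queued.

import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // The timeout parameter is a long, so 0.5 s is written as 500 ms.
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            if (!isValid) continue; // no permit in time: drop this request
            System.out.println("Task " + i + " executed");
        }
        System.out.println("Finished");
    }
}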
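To make the algorithm behind the limiter concrete, here is a minimal hand-written token bucket. It is a teaching sketch and not how Guava implements RateLimiter internally; the class name SimpleTokenBucket, its parameters, and the injectable nanosecond clock are all assumptions.

```java
import java.util.function.LongSupplier;

// Minimal token bucket: tokens refill at a steady rate up to a fixed capacity,
// and each accepted request consumes one token.
class SimpleTokenBucket {
    private final long capacity;
    private final double tokensPerSecond;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, double tokensPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;               // start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // Returns true and consumes a token if one is available; never blocks.
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        // Add the tokens earned since the last call, capped at the bucket capacity.
        tokens = Math.min(capacity, tokens + elapsedSeconds * tokensPerSecond);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

The capacity bounds how large a burst can pass at once, while the refill rate bounds the sustained request rate. In production the clock would be System::nanoTime.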

2.10 Asynchronous order processing

After rate limiting and stock verification, push orders to a message queue (e.g., RabbitMQ) for asynchronous handling, with optional compensation for failures.
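The hand-off can be sketched with an in-process queue. A bounded LinkedBlockingQueue stands in for the message broker here, so this shows only the producer/consumer split, not RabbitMQ's API; the class AsyncOrderPipeline, the OrderRequest message, and the persisted list (a stand-in for the order table) are assumptions.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// In-process model of "accept the request quickly, create the order later".
class AsyncOrderPipeline {
    // Hypothetical message payload.
    static final class OrderRequest {
        final long userId;
        final long goodsId;

        OrderRequest(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    private final BlockingQueue<OrderRequest> queue;
    // Stand-in for the order table written by the consumer.
    private final List<OrderRequest> persisted = new CopyOnWriteArrayList<>();

    AsyncOrderPipeline(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // Producer side (request thread): returns immediately; false means the queue is full.
    boolean submit(long userId, long goodsId) {
        return queue.offer(new OrderRequest(userId, goodsId));
    }

    // Consumer side: a worker would call this in a loop; returns false if nothing was waiting.
    boolean processNext() {
        OrderRequest next = queue.poll();
        if (next == null) {
            return false;
        }
        persisted.add(next); // a real consumer would write to the database here
        return true;
    }

    int persistedCount() {
        return persisted.size();
    }
}
```

The request thread only pays for an enqueue, and the bounded queue gives natural back-pressure: when it is full, the caller can tell the user to retry instead of piling more work onto the database.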

2.11 Service degradation

If a service fails, fall back to a degraded implementation (for example, via a Hystrix fallback) that returns a friendly message instead of a hard error.
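Stripped of any library, the fallback idea reduces to the sketch below. It does not use Hystrix and leaves out what such a library adds on top (timeouts, circuit breaking, metrics); the class FallbackInvoker and its method are hypothetical names.

```java
import java.util.function.Supplier;

// Minimal degradation helper: try the primary call, answer with the fallback if it fails.
class FallbackInvoker {
    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // The primary service is unavailable: degrade instead of propagating the error.
            return fallback.get();
        }
    }
}
```

A caller might wrap the order service call this way and supply a fallback that returns a message such as "The sale is busy, please try again", so the user sees a controlled response rather than a stack trace or a timeout.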

Conclusion

The presented architecture can sustain hundreds of thousands of concurrent requests. For tens of millions, further techniques such as database sharding, Kafka queues, and larger Redis clusters are required.

Designing a flash‑sale system forces engineers to confront high‑concurrency challenges and apply practical solutions.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
