How to Build a High‑Performance Flash‑Sale System: Architecture & Code

This article explores the challenges of designing a flash‑sale (秒杀) system—such as overselling, high concurrency, URL protection, and database bottlenecks—and presents a comprehensive backend architecture using Redis clustering, dynamic URLs, static pages, Nginx, rate‑limiting, asynchronous order processing, and service degradation strategies, complete with code examples.

Java High-Performance Architecture

Sep 20, 2023

Preface

Many of us have seen flash‑sale systems like those on JD.com or Taobao. This article examines how to design a robust backend for such high‑traffic events, addressing key problems and proposing technical solutions.

1. Issues to consider in flash‑sale systems

1.1 Overselling problem

The most critical issue is preventing overselling; for example, only 100 items are stocked but 200 orders could be placed, which would severely damage the business.

1.2 High concurrency

Flash sales generate massive request spikes within minutes, requiring the backend to avoid cache breakdowns and database overload.

1.3 Interface abuse

Automated tools can send hundreds of requests per second, so the system must filter invalid or repeated requests.

1.4 Flash‑sale URL exposure

Users can discover the sale URL through browser developer tools and call it directly before the sale opens, so the URL must be protected.

1.5 Database isolation

The flash‑sale workload should not share the same database with other services to avoid cascading failures.

1.6 Massive request handling

Even with Redis caching, a single node may handle only around 40k QPS, while a flash sale can demand hundreds of thousands. A single cache node would be overwhelmed, and the requests that fall through it would overload the database.

2. Design and technical solutions

2.1 Flash‑sale database design

Create a dedicated database with at least two tables: a flash‑sale order table and a flash‑sale product table. Additional tables for product details and user information are also recommended.

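2.2 Dynamic flash‑sale URL

Generate the sale URL from an MD5 hash of random characters so that it cannot practically be guessed before the sale starts. The front end requests the URL from the backend only once the sale is open, and the backend verifies it on each order request.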
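One way to sketch this idea is a per-user path token that is issued when the sale opens and checked when the order arrives. The class name SeckillPath, its method names, the key format, and the in-memory map (standing in for Redis with a short TTL) are all assumptions made for illustration, not the original author's code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: issues and verifies per-user flash-sale path tokens.
final class SeckillPath {
    // Stand-in for Redis; a real deployment would store tokens with a short TTL.
    private static final Map<String, String> TOKENS = new ConcurrentHashMap<>();

    // Returns the lowercase hex MD5 digest of the input.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Called once the sale has started: creates a random token for this user and product.
    static String createPath(long userId, long goodsId) {
        String token = md5Hex(UUID.randomUUID().toString());
        TOKENS.put(userId + "_" + goodsId, token);
        return token;
    }

    // Called by the order endpoint: the path segment must match the stored token.
    static boolean checkPath(long userId, long goodsId, String path) {
        return path != null && path.equals(TOKENS.get(userId + "_" + goodsId));
    }

    public static void main(String[] args) {
        String path = createPath(42L, 1001L);
        System.out.println("/seckill/" + path + "/order");
        System.out.println(checkPath(42L, 1001L, path));      // true
        System.out.println(checkPath(42L, 1001L, "guessed")); // false
    }
}
```

Because the token comes from a random UUID rather than from the product ID, knowing the URL pattern alone is not enough to build a valid order URL.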

2.3 Page staticization

Render product details into static HTML pages using a template engine (e.g., FreeMarker) to eliminate backend calls during the sale.

2.4 Redis cluster

Deploy Redis in Sentinel or Cluster mode to improve availability and throughput, which reduces the risk of cache breakdown under load.

2.5 Use Nginx

Nginx can handle tens of thousands of concurrent connections, forwarding requests to a Tomcat cluster for better scalability.

2.6 Simplify SQL

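Combine the stock check and the decrement into a single statement with optimistic locking:

update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;

The stock > 0 condition prevents overselling without a separate stock query. The version check rejects updates based on a stale read, which only works if every successful update also increments version. An affected-row count of 0 means the purchase failed.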
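The effect of the stock > 0 guard can be illustrated without a database. The sketch below is an in-memory simulation only: an AtomicInteger stands in for the stock column, and a compare-and-set loop plays the role of the conditional UPDATE. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for the miaosha_goods row; not a real database.
final class ConditionalDecrement {
    private final AtomicInteger stock;

    ConditionalDecrement(int initialStock) {
        this.stock = new AtomicInteger(initialStock);
    }

    // Mirrors "UPDATE ... SET stock = stock - 1 WHERE stock > 0":
    // returns 1 if the "row" was updated, 0 if the condition failed.
    int decrementIfAvailable() {
        while (true) {
            int current = stock.get();
            if (current <= 0) {
                return 0;
            }
            if (stock.compareAndSet(current, current - 1)) {
                return 1;
            }
        }
    }

    int remaining() {
        return stock.get();
    }

    // Fires `buyers` concurrent purchase attempts and returns how many succeeded.
    static int simulate(ConditionalDecrement goods, int buyers) {
        ExecutorService pool = Executors.newFixedThreadPool(16);
        AtomicInteger orders = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(buyers);
        for (int i = 0; i < buyers; i++) {
            pool.execute(() -> {
                if (goods.decrementIfAvailable() == 1) {
                    orders.incrementAndGet();
                }
                done.countDown();
            });
        }
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return orders.get();
    }

    public static void main(String[] args) {
        ConditionalDecrement goods = new ConditionalDecrement(100);
        int orders = simulate(goods, 200);
        // prints "orders=100, remaining=0"
        System.out.println("orders=" + orders + ", remaining=" + goods.remaining());
    }
}
```

With 200 buyers and 100 items, exactly 100 attempts succeed. In the SQL version the same signal comes from the affected-row count returned by the update.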

2.7 Redis pre‑decrement

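Before the sale, load the initial stock into Redis (e.g., redis.set(goodsId, 100)) and decrement it atomically on each order, so that only requests which obtain stock continue on to the database. When an order is cancelled the stock must be returned; use a Lua script to keep the check and the update atomic.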
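A minimal sketch of the pre-decrement flow, assuming an in-memory counter in place of Redis: decrementAndGet and incrementAndGet play the role of DECR and INCR, and the class name StockGate and its methods are hypothetical. In Redis, the decrement plus the undo on a negative result is the kind of step that would be wrapped in one Lua script.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for Redis stock counters; not a Redis client.
final class StockGate {
    private final Map<Long, AtomicLong> stock = new ConcurrentHashMap<>();

    // Equivalent of redis.set(goodsId, count) before the sale opens.
    void preload(long goodsId, long count) {
        stock.put(goodsId, new AtomicLong(count));
    }

    // Pre-decrement: true means the request may proceed to order creation.
    boolean tryReserve(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        if (counter == null) {
            return false;
        }
        if (counter.decrementAndGet() < 0) {
            counter.incrementAndGet(); // undo: nothing was left to reserve
            return false;
        }
        return true;
    }

    // Called when an order is cancelled or fails downstream: return the unit.
    void release(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        if (counter != null) {
            counter.incrementAndGet();
        }
    }

    long remaining(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        return counter == null ? 0 : counter.get();
    }

    public static void main(String[] args) {
        StockGate gate = new StockGate();
        gate.preload(1001L, 2);
        System.out.println(gate.tryReserve(1001L)); // true
        System.out.println(gate.tryReserve(1001L)); // true
        System.out.println(gate.tryReserve(1001L)); // false, sold out
        gate.release(1001L);                        // one order cancelled
        System.out.println(gate.tryReserve(1001L)); // true again
    }
}
```

Requests rejected here never reach the database, which is the point of pre-decrementing in the cache.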

2.8 Interface rate limiting

Various strategies are applied:

2.8.1 Front‑end throttling

Disable the purchase button for a few seconds after a click.

2.8.2 Repeat request rejection

Use Redis keys with a short TTL (e.g., 10 s) to block repeated submissions from the same user.
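The TTL-key idea can be sketched in memory as follows. In Redis this would typically be a single SET with the NX and EX options; here a ConcurrentHashMap of expiry times stands in for it, and the class name, key format, and explicit clock parameter are assumptions made so the example runs without a Redis server.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for "set key if absent, with a TTL"; not a Redis client.
final class RepeatRequestGuard {
    private final Map<String, Long> expiresAt = new ConcurrentHashMap<>();
    private final long ttlMillis;

    RepeatRequestGuard(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns true if the request is accepted, false if the same user already
    // hit the same product within the TTL window. The clock is passed in
    // explicitly to keep the example deterministic.
    boolean tryAccept(long userId, long goodsId, long nowMillis) {
        String key = "seckill:req:" + userId + ":" + goodsId; // hypothetical key format
        boolean[] accepted = {false};
        // compute() runs atomically per key, like SET ... NX on the server.
        expiresAt.compute(key, (k, expiry) -> {
            if (expiry == null || expiry <= nowMillis) {
                accepted[0] = true;
                return nowMillis + ttlMillis;
            }
            return expiry;
        });
        return accepted[0];
    }

    public static void main(String[] args) {
        RepeatRequestGuard guard = new RepeatRequestGuard(10_000);
        System.out.println(guard.tryAccept(1L, 1001L, 0));      // true, first request
        System.out.println(guard.tryAccept(1L, 1001L, 5_000));  // false, within 10 s
        System.out.println(guard.tryAccept(2L, 1001L, 5_000));  // true, different user
        System.out.println(guard.tryAccept(1L, 1001L, 10_000)); // true, key expired
    }
}
```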

2.8.3 Token‑bucket algorithm

Guava’s RateLimiter generates tokens at a fixed rate; only requests that acquire a token are processed.

import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Issue 1 token per second.
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and returns the seconds spent waiting.
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time " + waitTime);
        }
        System.out.println("Done");
    }
}

A variant with tryAcquire and a timeout discards requests that cannot obtain a token within the specified period.

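import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        // Issue 1 token per second.
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // Wait at most 500 ms for a token; a sub-second timeout needs a finer unit than seconds.
            long timeoutMillis = 500;
            boolean valid = rateLimiter.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            System.out.println("Task " + i + " valid: " + valid);
            // Requests that cannot get a token in time are dropped.
            if (!valid) continue;
            System.out.println("Task " + i + " executing");
        }
        System.out.println("End");
    }
}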

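Because the loop issues all ten requests back to back, only the first obtains a token. The next token is a full second away, which is longer than the timeout, so the remaining requests are rejected immediately.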

2.9 Asynchronous order processing

After passing rate limiting and stock checks, place order messages into a queue (e.g., RabbitMQ) for asynchronous handling, allowing retries and compensation on failures.
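The hand-off between the request thread and the order worker can be sketched with an in-process queue. This is only an illustration: a BlockingQueue stands in for the broker (RabbitMQ in the text), and the class AsyncOrderQueue, the OrderMessage shape, and the order string format are hypothetical.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

// In-process stand-in for a message broker; all names here are hypothetical.
final class AsyncOrderQueue {
    static final class OrderMessage {
        final long userId;
        final long goodsId;

        OrderMessage(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    private final BlockingQueue<OrderMessage> queue;
    private final List<String> createdOrders = new CopyOnWriteArrayList<>();

    AsyncOrderQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Request thread: enqueue and return at once. false means the queue is full
    // and the caller should tell the user to retry.
    boolean submit(long userId, long goodsId) {
        return queue.offer(new OrderMessage(userId, goodsId));
    }

    // Worker: processes messages until none arrives for idleMillis.
    void consume(long idleMillis) {
        try {
            OrderMessage msg;
            while ((msg = queue.poll(idleMillis, TimeUnit.MILLISECONDS)) != null) {
                // A real consumer would write the order row here and retry
                // or compensate on failure.
                createdOrders.add("order:" + msg.userId + ":" + msg.goodsId);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<String> createdOrders() {
        return createdOrders;
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncOrderQueue orders = new AsyncOrderQueue(1000);
        Thread worker = new Thread(() -> orders.consume(200));
        worker.start();
        for (long user = 1; user <= 3; user++) {
            System.out.println("queued=" + orders.submit(user, 1001L));
        }
        worker.join();
        System.out.println(orders.createdOrders());
    }
}
```

The request path only pays for the enqueue, so the response time no longer depends on how fast the database can write orders.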

3. Service degradation

If a server crashes, fall back to a backup service (e.g., using Hystrix) and return a friendly message instead of a hard error.
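The fallback behaviour itself can be sketched in plain Java. This is not Hystrix and leaves out its circuit-breaker state; it only shows the core idea of bounding a call with a timeout and returning a friendly default when the call fails or runs late. The class and method names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper showing timeout-plus-fallback; not the Hystrix API.
final class Degradation {
    // Daemon threads so a stuck call cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    // Runs `primary`; returns `fallback` if it throws or exceeds the timeout.
    static <T> T callWithFallback(Supplier<T> primary, long timeoutMillis, T fallback) {
        Callable<T> task = primary::get;
        Future<T> future = POOL.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        } catch (ExecutionException | TimeoutException e) {
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) {
        String busy = "The sale is busy, please try again shortly";
        // Healthy call: the primary result is returned.
        System.out.println(callWithFallback(() -> "order accepted", 200, busy));
        // Failing call: the friendly message is returned instead of an error.
        System.out.println(callWithFallback(() -> {
            throw new IllegalStateException("order service down");
        }, 200, busy));
    }
}
```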

Conclusion

The presented architecture can sustain tens of thousands of concurrent requests; larger scales (millions) would require further measures such as database sharding, Kafka queues, and larger Redis clusters.
