
How to Build a High‑Performance Flash Sale System: Architecture & Code

This article explores the key challenges of designing a flash‑sale (秒杀) backend—such as overselling, high concurrency, request throttling, and database bottlenecks—and presents a complete solution that includes database schema, dynamic URLs, static page rendering, Redis clustering, Nginx load balancing, SQL optimization, token‑bucket rate limiting, asynchronous order processing, and service degradation strategies.

Java Backend Technology

May 14, 2020


Introduction

Flash‑sale systems like those on JD, Taobao, or Xiaomi attract massive traffic in a very short time, raising critical issues such as overselling, high concurrency, request flooding, and database overload.

1. Problems to Consider

Overselling: Limited stock (e.g., 100 items) must not be sold beyond availability.

High Concurrency: Millions of requests may arrive within minutes, risking cache breakdown and DB overload.

Interface Abuse: Bots can repeatedly hit the sale URL; the system must filter invalid requests.

URL Exposure: Users can discover the sale URL via browser tools; dynamic URLs are needed.

Database Design: A dedicated flash‑sale database prevents the main site from being affected by traffic spikes.

Massive Request Volume: Single‑node Redis (~40k QPS) cannot handle hundreds of thousands of requests; clustering is required.

2. Design and Technical Solutions

2.1 Flash‑Sale Database

The flash‑sale database holds two core tables, miaosha_order and miaosha_goods, along with additional tables for product and user information.

2.2 Dynamic Sale URL

The sale URL is generated by MD5‑hashing a random string, making it unpredictable before the sale starts.
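One way to picture the dynamic URL is the sketch below. It assumes the token is issued per user and per product once the sale opens, and that the sale endpoint verifies it before doing any work. The class name SalePathService, the path layout /miaosha/{token}/do_miaosha, and the in-memory map (a stand-in for a cache with expiry such as Redis) are illustrative assumptions, not details from the original system.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: issues a per-user, per-product path token and checks it on the sale request. */
class SalePathService {
    // Stand-in for a cache with expiry (e.g. Redis); keys are "userId:goodsId".
    private final Map<String, String> issued = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    /** Called once the sale is open: returns the token that becomes part of the URL path. */
    String createPath(long userId, long goodsId) {
        byte[] salt = new byte[16];
        random.nextBytes(salt);
        String token = md5Hex(userId + ":" + goodsId + ":" + toHex(salt));
        issued.put(userId + ":" + goodsId, token);
        return token;
    }

    /** Called by the sale endpoint: accepts only the token issued to this user for this product. */
    boolean checkPath(long userId, long goodsId, String token) {
        return token != null && token.equals(issued.get(userId + ":" + goodsId));
    }

    /** MD5 digest of the input as 32 lowercase hex characters. */
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return toHex(md.digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SalePathService service = new SalePathService();
        String token = service.createPath(42L, 1001L);
        System.out.println("/miaosha/" + token + "/do_miaosha"); // hypothetical path layout
        System.out.println(service.checkPath(42L, 1001L, token));
        System.out.println(service.checkPath(42L, 1001L, "guess"));
    }
}
```

Here MD5 only serves to turn random input into a fixed-length string; the unpredictability comes from the random salt, not from the hash.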

2.3 Static Page Rendering

Product details are rendered into a static HTML page (e.g., using FreeMarker) so that user requests bypass the backend and database.

2.4 Redis Cluster

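Because a flash sale is a read‑heavy, write‑light scenario, Redis is deployed across multiple nodes rather than as a single instance, for example as a Redis Cluster or as master‑replica groups monitored by Sentinel for automatic failover. This spreads the read load and lowers the risk of cache breakdown.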

2.5 Nginx Front‑End

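Nginx can hold tens of thousands of concurrent connections per instance. Placed in front of a Tomcat cluster, it distributes incoming traffic across the Tomcat nodes, which greatly improves the concurrency the system can sustain.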

2.6 SQL Optimization

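Stock deduction is performed with a single optimistic‑lock update. The version column is incremented on every successful update; if the statement affects 0 rows, either the row changed after it was read or the stock has run out, so the caller re‑reads the row or gives up:

update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;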
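The read-then-conditional-update cycle around that statement can be sketched without a database. In the example below, an in-memory object stands in for one miaosha_goods row and updateIfUnchanged mirrors the WHERE clause of the UPDATE; the class name, the retry count, and the thread-pool size are assumptions made for illustration. A real implementation would issue the SQL through MyBatis or JDBC and check the affected-row count instead.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: in-memory stand-in for one miaosha_goods row with stock and version columns. */
class OptimisticStockDemo {
    private int stock;
    private int version;

    OptimisticStockDemo(int stock) {
        this.stock = stock;
    }

    /** Snapshot of the row, like SELECT stock, version. Returns {stock, version}. */
    synchronized int[] read() {
        return new int[] { stock, version };
    }

    /** Mirrors the UPDATE: applies only if the version still matches and stock > 0. Returns affected rows. */
    synchronized int updateIfUnchanged(int expectedVersion) {
        if (version != expectedVersion || stock <= 0) {
            return 0;
        }
        stock--;
        version++;
        return 1;
    }

    /** Re-reads and retries when another buyer won the race; stops when sold out or retries run out. */
    boolean buy(int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            int[] row = read();
            if (row[0] <= 0) {
                return false; // sold out
            }
            if (updateIfUnchanged(row[1]) == 1) {
                return true;
            }
        }
        return false; // too much contention, give up
    }

    public static void main(String[] args) throws InterruptedException {
        OptimisticStockDemo goods = new OptimisticStockDemo(100);
        AtomicInteger sold = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(16);
        for (int i = 0; i < 1000; i++) {
            pool.submit(() -> {
                if (goods.buy(3)) {
                    sold.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // sold + remaining always equals the initial stock, so nothing is oversold.
        System.out.println("sold=" + sold.get() + ", remaining=" + goods.read()[0]);
    }
}
```

Under heavy contention some buyers exhaust their retries even though stock remains, which is one reason the article moves most of the contention into Redis first.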

2.7 Redis Pre‑Decrement

Before the sale starts, stock is cached in Redis (e.g., redis.set(goodsId, 100)). Each order atomically decrements the Redis value; if the stock becomes negative, the request is rejected.
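The pre-decrement check might look like the following sketch. To stay runnable without a Redis server, AtomicLong counters stand in for Redis keys and decrementAndGet plays the role of the atomic DECR command; the class and method names are hypothetical. Restoring the counter after a failed decrement is an added choice here, not something the original describes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: AtomicLong counters stand in for Redis keys; decrementAndGet plays the role of DECR. */
class StockPreDecrement {
    private final Map<Long, AtomicLong> stockCache = new ConcurrentHashMap<>();

    /** Warm-up before the sale, like redis.set(goodsId, stock). */
    void preload(long goodsId, long stock) {
        stockCache.put(goodsId, new AtomicLong(stock));
    }

    /** Returns true if the request won a unit of stock and may proceed to order creation. */
    boolean tryReserve(long goodsId) {
        AtomicLong counter = stockCache.get(goodsId);
        if (counter == null) {
            return false; // product is not part of the sale
        }
        long remaining = counter.decrementAndGet(); // atomic, like DECR
        if (remaining < 0) {
            counter.incrementAndGet(); // undo so the counter does not drift far below zero
            return false;              // sold out: reject without touching the database
        }
        return true;
    }

    /** Current cached stock, or 0 for an unknown product. */
    long remaining(long goodsId) {
        AtomicLong counter = stockCache.get(goodsId);
        return counter == null ? 0 : counter.get();
    }

    public static void main(String[] args) {
        StockPreDecrement cache = new StockPreDecrement();
        cache.preload(1001L, 100);
        int accepted = 0;
        for (int i = 0; i < 150; i++) {
            if (cache.tryReserve(1001L)) {
                accepted++;
            }
        }
        System.out.println("accepted=" + accepted + ", remaining=" + cache.remaining(1001L));
    }
}
```

Only requests that pass this check go on to the database update, so the database sees roughly as many writes as there are items in stock.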

2.8 Rate Limiting

Front‑end button disabling for a few seconds after click.

Per‑user repeat‑request blocking within a configurable interval (e.g., 10 s) using Redis key expiration.

Token‑bucket algorithm via Guava RateLimiter:

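import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Issue 1 permit (token) per second
        RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a permit is available and returns the seconds spent waiting
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time: " + waitTime);
        }
        System.out.println("Done");
    }
}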

A second example shows tryAcquire with a 0.5 s timeout, discarding tasks that cannot obtain a token.

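import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // The timeout parameter is a long, so 0.5 s is written as 500 ms
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            if (!isValid) continue; // no permit within the timeout: discard the task
            System.out.println("Task " + i + " executed");
        }
        System.out.println("End");
    }
}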
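The per-user repeat-request blocking listed earlier in this section can be sketched in a similar way. The example below keeps expiry timestamps in a ConcurrentHashMap as a stand-in for a Redis key written with an expiration; the class name RepeatRequestGuard and the injected clock (used so the behavior can be checked without sleeping) are assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Sketch: admits one request per user and product per interval; later ones are rejected until expiry. */
class RepeatRequestGuard {
    // Stand-in for Redis keys with a TTL; values are expiry times in milliseconds.
    // Unlike Redis, expired entries are never evicted here, which is acceptable only for a demo.
    private final Map<String, Long> expiresAt = new ConcurrentHashMap<>();
    private final long intervalMillis;
    private final LongSupplier clock;

    RepeatRequestGuard(long intervalMillis, LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
    }

    /** Returns true if this user has no unexpired marker for the product, and sets a new marker. */
    boolean allow(long userId, long goodsId) {
        long now = clock.getAsLong();
        boolean[] admitted = { false };
        // compute() runs atomically per key, so two concurrent requests cannot both be admitted.
        expiresAt.compute(userId + ":" + goodsId, (key, expiry) -> {
            if (expiry == null || expiry <= now) {
                admitted[0] = true;
                return now + intervalMillis;
            }
            return expiry;
        });
        return admitted[0];
    }

    public static void main(String[] args) {
        RepeatRequestGuard guard = new RepeatRequestGuard(10_000, System::currentTimeMillis);
        System.out.println(guard.allow(42L, 1001L)); // first request from this user
        System.out.println(guard.allow(42L, 1001L)); // immediate repeat from the same user
        System.out.println(guard.allow(43L, 1001L)); // a different user
    }
}
```

This check is cheap enough to run before the token-bucket limiter, so repeated clicks from one user never consume permits meant for other users.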

2.9 Asynchronous Order Processing

Valid orders are placed onto a message queue (e.g., RabbitMQ). Consumers process the orders asynchronously, achieving peak‑shaving, decoupling, and reliability. Success can trigger SMS notifications; failures can be retried via compensation mechanisms.
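The hand-off between the request path and the order consumer can be illustrated with an in-process queue. In this sketch a bounded BlockingQueue stands in for RabbitMQ, and adding a string to a list stands in for the database insert; the class, the message fields, and the capacity are hypothetical. A real broker adds persistence, acknowledgements, and redelivery, none of which this example models.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;

/** Sketch: a bounded in-process queue stands in for the message broker. */
class AsyncOrderPipeline {
    /** Minimal order message: who bought what. */
    static final class OrderMessage {
        final long userId;
        final long goodsId;

        OrderMessage(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    private final BlockingQueue<OrderMessage> queue;
    private final List<String> createdOrders = new CopyOnWriteArrayList<>();

    AsyncOrderPipeline(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Request thread: enqueue and return at once; false means the buffer is full. */
    boolean submit(long userId, long goodsId) {
        return queue.offer(new OrderMessage(userId, goodsId));
    }

    /** Consumer side: handles everything currently queued and returns how many orders were created. */
    int processPending() {
        int handled = 0;
        OrderMessage msg;
        while ((msg = queue.poll()) != null) {
            createdOrders.add("order:" + msg.userId + ":" + msg.goodsId); // stand-in for the DB insert
            handled++;
        }
        return handled;
    }

    List<String> createdOrders() {
        return createdOrders;
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncOrderPipeline pipeline = new AsyncOrderPipeline(1000);
        for (long userId = 1; userId <= 5; userId++) {
            pipeline.submit(userId, 1001L); // returns immediately, the user sees "queued"
        }
        Thread worker = new Thread(() -> pipeline.processPending());
        worker.start();
        worker.join();
        System.out.println(pipeline.createdOrders());
    }
}
```

The bounded capacity is what provides peak shaving: when the buffer is full, submit returns false and the caller can tell the user to retry instead of piling more work onto the database.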

2.10 Service Degradation

If a service crashes, a fallback (e.g., Hystrix circuit breaker) returns a friendly message instead of a server error.
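The fallback idea can be shown with a deliberately small circuit breaker. The sketch below is not Hystrix and does not use its API; it is a hand-rolled illustration in which the class name, the failure threshold, and the cool-down period are all assumptions. After a run of consecutive failures it stops calling the failing service for a while and returns the fallback directly.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Sketch: opens after N consecutive failures and serves the fallback until a cool-down has passed. */
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;
    private int consecutiveFailures;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /**
     * Runs primary unless the circuit is open; any RuntimeException yields the fallback value.
     * Synchronizing the whole call keeps the sketch simple but serializes callers,
     * which a production library would avoid.
     */
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        long now = clock.getAsLong();
        if (openedAt >= 0 && now - openedAt < openMillis) {
            return fallback.get(); // open: skip the failing service entirely
        }
        try {
            T result = primary.get(); // closed, or a trial call after the cool-down
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = now; // (re)open the circuit
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(3, 5_000, System::currentTimeMillis);
        for (int i = 0; i < 5; i++) {
            String reply = breaker.<String>call(
                    () -> { throw new IllegalStateException("order service down"); },
                    () -> "The sale is very busy, please try again shortly.");
            System.out.println(reply);
        }
    }
}
```

From the user's point of view every call in the usage example returns the friendly message; the difference is that once the circuit opens, the broken service is no longer invoked at all.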

3. Summary

The architecture described here combines database isolation, dynamic URLs, static page rendering, Redis clustering, Nginx load balancing, SQL optimization, token‑bucket rate limiting, asynchronous queuing, and graceful degradation, and is intended to handle hundreds of thousands of concurrent requests. Larger scales (tens of millions) would require further techniques such as database sharding, Kafka, and larger Redis clusters.


