Design and Technical Solutions for a High‑Concurrency Flash Sale (秒杀) System

This article analyzes the challenges of building a flash‑sale system—such as overselling, massive concurrent requests, URL exposure, and database pressure—and presents a comprehensive backend architecture that combines Redis clustering, Nginx load balancing, token‑bucket rate limiting, asynchronous order processing, and service degradation techniques.

IT Architects Alliance

Flash‑sale (秒杀) systems like those on JD or Taobao attract huge traffic in a very short time, leading to problems such as overselling, high concurrency, interface abuse, URL leakage, and database overload. This article explores these issues and proposes a robust backend design.

Key Problems

Overselling: limited stock (e.g., 100 items) can end up selling more units than exist (e.g., 200 orders accepted), causing financial loss.

High concurrency: millions of requests arrive within minutes, risking cache breakdown and database crashes.

Interface abuse: bots repeatedly hit the backend, requiring anti‑scraping measures.

URL exposure: knowledgeable users can discover the purchase URL via browser dev tools.

Database coupling: a flash‑sale database sharing resources with other services can bring down the whole site.

Massive request volume: a single Redis instance (≈40k QPS) cannot handle hundreds of thousands of QPS.

Design and Technical Solutions

Database Design – Create an isolated flash‑sale database with at least two tables: miaosha_order and miaosha_goods. Additional tables for product details and user information can be added as needed.

Dynamic URL – Generate the sale URL from the MD5 hash of a random string, so that it cannot be predicted before the sale starts.
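A minimal sketch of this idea, assuming the server issues one path per user and product and later checks it when the purchase request arrives. The class and method names (SalePathGenerator, createPath, isValid) are hypothetical, and the in-memory map stands in for a shared store such as Redis.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper that issues and verifies per-user sale paths.
class SalePathGenerator {
    // Stand-in for a shared store such as Redis (assumption for this sketch).
    private final Map<String, String> issued = new ConcurrentHashMap<>();

    // Returns the 32-character lowercase hex MD5 digest of the input.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Hashes a fresh random string and remembers it for this user and product.
    String createPath(long userId, long goodsId) {
        String path = md5Hex(UUID.randomUUID().toString());
        issued.put(userId + ":" + goodsId, path);
        return path;
    }

    // Accepts the request only if the path matches the one issued earlier.
    boolean isValid(long userId, long goodsId, String path) {
        return path != null && path.equals(issued.get(userId + ":" + goodsId));
    }
}
```

The purchase endpoint would then take the path as a URL segment and reject any request for which isValid returns false.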

Page Staticization – Render product details, images, and reviews into a static HTML page using a template engine (e.g., FreeMarker) to avoid backend hits during the sale.

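Redis Cluster – Replace the single Redis node with a multi-node deployment. Sentinel provides monitoring and automatic failover for availability, and replicas can serve reads; when one master cannot absorb the write load, shard the keys across several masters with Redis Cluster to handle higher QPS.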

Nginx Front‑End – Use Nginx as a high‑performance reverse proxy to distribute traffic to a Tomcat cluster, greatly increasing concurrent handling capacity.

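SQL Optimization – Reduce stock deduction from two statements (typically a SELECT followed by an UPDATE) to a single atomic UPDATE with optimistic locking:

UPDATE miaosha_goods SET stock = stock - 1, version = version + 1 WHERE goods_id = #{goods_id} AND version = #{version} AND stock > 0

If the statement affects zero rows, either the stock is exhausted or another transaction changed the row first, and the request should fail or retry.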
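As a rough illustration, the single-statement deduction could be issued through plain JDBC as follows. The table and column names come from the statement above; the class name StockDao and the method deductStock are hypothetical, and the sketch assumes the version column is incremented on every successful update, which is the usual optimistic-locking convention.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical DAO for the miaosha_goods table.
class StockDao {
    // One conditional UPDATE: the row changes only if the version still
    // matches what the caller read and at least one unit is left.
    static final String DEDUCT_SQL =
        "UPDATE miaosha_goods SET stock = stock - 1, version = version + 1 "
      + "WHERE goods_id = ? AND version = ? AND stock > 0";

    // Returns true if exactly one row was updated, meaning this caller
    // secured one unit of stock; false means sold out or a lost race.
    static boolean deductStock(Connection conn, long goodsId, int expectedVersion)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(DEDUCT_SQL)) {
            ps.setLong(1, goodsId);
            ps.setInt(2, expectedVersion);
            return ps.executeUpdate() == 1;
        }
    }
}
```

Because the check and the decrement happen inside one statement, the database serializes competing updates on the row and stock can never go below zero.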

Redis Pre‑Decrement – Pre‑load stock into Redis and perform atomic decrement via Lua scripts, falling back to the database only on cache miss.
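The following sketch illustrates the check-and-decrement semantics. The Lua string is one way such a script could look when sent with EVAL (the stock key passed as KEYS[1] is an assumption); the rest of the class is an in-memory stand-in, using an AtomicLong instead of Redis, so that the example runs without a server. All names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

class StockPreDecrement {
    // Possible Lua script: returns the remaining stock, or -1 when sold out.
    // Redis runs a script atomically, so no other command can interleave.
    static final String DECR_STOCK_LUA =
        "local s = tonumber(redis.call('GET', KEYS[1]) or '0') "
      + "if s > 0 then return redis.call('DECR', KEYS[1]) else return -1 end";

    // In-memory stand-in for the Redis counter (assumption for this sketch).
    private final AtomicLong stock;

    StockPreDecrement(long initialStock) {
        this.stock = new AtomicLong(initialStock);
    }

    // Same semantics as the script: decrement only while stock is positive.
    long tryDecrement() {
        long previous = stock.getAndUpdate(s -> s > 0 ? s - 1 : s);
        return previous > 0 ? previous - 1 : -1;
    }

    // Fires the given number of concurrent buyers; returns how many succeeded.
    static int simulate(long initialStock, int requests) {
        StockPreDecrement counter = new StockPreDecrement(initialStock);
        AtomicInteger winners = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(16);
        CountDownLatch done = new CountDownLatch(requests);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                if (counter.tryDecrement() >= 0) {
                    winners.incrementAndGet();
                }
                done.countDown();
            });
        }
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return winners.get();
    }
}
```

With 100 units and 1,000 concurrent requests, simulate(100, 1000) lets exactly 100 requests through; only those would go on to touch the database.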

Rate Limiting

Front‑end limit: disable the purchase button for a few seconds after click.

User‑level repeat limit: reject requests from the same user within a configurable interval (e.g., 10 s) using Redis key expiration.

Token‑bucket algorithm: use Guava’s RateLimiter to issue tokens at a controlled rate.

Example code for a simple token bucket:

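import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Issue 1 token (permit) per second.
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and
            // returns the time spent waiting, in seconds.
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time: " + waitTime);
        }
        System.out.println("Finished");
    }
}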

A second example shows tryAcquire with a timeout, allowing the system to discard requests that cannot obtain a token within 0.5 seconds.

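import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        // Issue 1 token (permit) per second.
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // tryAcquire takes a long timeout, so 0.5 s is written as 500 ms.
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            // No token within the timeout: discard this request.
            if (!isValid) continue;
            System.out.println("Task " + i + " executed");
        }
        System.out.println("End");
    }
}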
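The user-level repeat limit mentioned earlier can be sketched in a similar spirit. In Redis it is typically a single SET with the NX and EX options (for example, a 10 s expiry on a per-user key); the version below is an in-memory stand-in with an injectable clock so that it runs on its own. The class name UserRepeatLimiter and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// In-memory stand-in for a per-user Redis key with an expiry.
class UserRepeatLimiter {
    // userId -> time (ms) of the last accepted request.
    // Unlike Redis keys, these entries are never evicted in this sketch.
    private final ConcurrentHashMap<Long, Long> lastAccepted = new ConcurrentHashMap<>();
    private final long intervalMillis;
    private final LongSupplier clock;

    UserRepeatLimiter(long intervalMillis, LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
    }

    // Returns true if the user has no accepted request within the interval.
    boolean tryAccept(long userId) {
        long now = clock.getAsLong();
        boolean[] accepted = {false};
        // compute() runs atomically per key, so two concurrent requests
        // from the same user cannot both be accepted.
        lastAccepted.compute(userId, (id, last) -> {
            if (last == null || now - last >= intervalMillis) {
                accepted[0] = true;
                return now;
            }
            return last;
        });
        return accepted[0];
    }
}
```

In production the clock would be System::currentTimeMillis; passing it in keeps the sketch easy to exercise with a fake clock.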

Asynchronous Order Processing – After passing rate limiting and stock checks, push the order request to a message queue (e.g., RabbitMQ) for asynchronous handling, improving throughput and decoupling services.
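The hand-off can be illustrated with an in-process queue. This is only a sketch: a bounded BlockingQueue stands in for the message broker, a single worker thread stands in for the consumer service, and every name in it (AsyncOrderPipeline, OrderRequest, submit, awaitOrders) is hypothetical.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncOrderPipeline {
    // Hypothetical message payload.
    static final class OrderRequest {
        final long userId;
        final long goodsId;
        OrderRequest(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    private final BlockingQueue<OrderRequest> queue;
    private final List<String> createdOrders = new CopyOnWriteArrayList<>();

    AsyncOrderPipeline(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        Thread worker = new Thread(this::consume, "order-worker");
        worker.setDaemon(true);
        worker.start();
    }

    // Request thread: enqueue and return at once. false means the queue
    // is full and the request should be rejected rather than block.
    boolean submit(long userId, long goodsId) {
        return queue.offer(new OrderRequest(userId, goodsId));
    }

    // Consumer: a real one would insert a row into miaosha_order here.
    private void consume() {
        try {
            while (true) {
                OrderRequest r = queue.take();
                createdOrders.add(r.userId + ":" + r.goodsId);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Polls until the expected number of orders exists or the timeout ends.
    List<String> awaitOrders(int expected, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (createdOrders.size() < expected && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return createdOrders;
    }
}
```

The request path only pays for the enqueue; the client can then poll for the order result while the consumer works through the backlog at its own pace.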

Service Degradation – Implement fallback mechanisms (e.g., a Hystrix circuit breaker) that return a friendly message when a node fails, so that one failing service does not cascade into a site-wide outage.
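To make the circuit-breaker idea concrete, here is a deliberately small sketch. It is not Hystrix's API; the class SimpleCircuitBreaker, its parameters, and the injectable clock are all assumptions made for illustration, and the synchronized method trades throughput for simplicity.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Minimal circuit breaker: after N consecutive failures the primary call
// is skipped for a cool-down period and the fallback is returned instead.
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        long now = clock.getAsLong();
        // Open: do not touch the failing dependency, degrade immediately.
        if (openedAt >= 0 && now - openedAt < openMillis) {
            return fallback.get();
        }
        try {
            // Closed, or cool-down elapsed: try the primary call.
            T result = primary.get();
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            // Count the failure and (re)open once the threshold is reached.
            if (++consecutiveFailures >= failureThreshold) {
                openedAt = now;
            }
            return fallback.get();
        }
    }
}
```

A caller would wrap the order or stock service call as the primary supplier and pass a supplier returning a friendly "please try again" message as the fallback.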

Conclusion

The presented architecture can sustain tens of thousands of concurrent requests; for larger scales (hundreds of millions), further techniques such as sharding, Kafka queues, and larger Redis clusters are required. The design emphasizes thinking about high‑concurrency challenges and applying practical engineering solutions.
