How to Build a Robust Flash‑Sale System that Handles Millions of Requests

This article explores the key challenges of designing a flash-sale (秒杀, "miaosha") system: overselling, high concurrency, request flooding, and database strain. It then presents a backend architecture covering database design, dynamic URLs, caching, rate limiting, asynchronous ordering, and service degradation.

21CTO

Problems to Consider

When designing a flash-sale system, the most critical issues are overselling, high concurrency, request flooding, and database overload. Overselling occurs when more orders are accepted than there are units in stock, which leads to financial loss. High concurrency means a massive spike of requests within a few minutes, which risks cache breakdown and database crashes. Request flooding from bots or scripts must be mitigated, and the sale URL should not be exposed in a way that lets clients call it directly.

System Design and Technical Solutions

Database Design

A dedicated flash‑sale database isolates high‑traffic operations from the main site. It typically includes two core tables: miaosha_order for orders and miaosha_goods for goods, with additional tables for product details and user information.

Dynamic URL Design

To prevent users from directly accessing the sale endpoint, the URL is generated dynamically using an MD5 hash of a random string. The front‑end first requests the actual URL, and the back‑end validates it before allowing the purchase.
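
The two-step flow described above can be sketched as follows. This is a minimal illustration, not the article's implementation: the class name SalePathService, its method names, and the in-memory map are assumptions (a real deployment would more likely keep the issued paths in Redis with a short expiry).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: issues and validates per-user, per-item sale paths.
final class SalePathService {
    // Assumption: an in-memory map stands in for a shared store such as Redis.
    private final Map<String, String> issued = new ConcurrentHashMap<>();

    /** Step 1: hash a fresh random string with MD5 and remember it for (userId, goodsId). */
    String createPath(long userId, long goodsId) {
        String path = md5Hex(UUID.randomUUID().toString());
        issued.put(userId + ":" + goodsId, path);
        return path;
    }

    /** Step 2: accept the purchase request only if the path matches the one issued. */
    boolean checkPath(long userId, long goodsId, String path) {
        String expected = issued.get(userId + ":" + goodsId);
        return expected != null && expected.equals(path);
    }

    /** Returns the MD5 digest of the input as 32 lowercase hex characters. */
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Because the path is random and bound to one user and one item, a script cannot guess it ahead of time or reuse a path issued to someone else. MD5 is used here only to produce an opaque token, not for security-sensitive hashing.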

Page Staticization

Static pages containing product descriptions, parameters, and reviews are pre‑rendered (e.g., with FreeMarker) so that user requests do not hit the back‑end or database, greatly reducing server load.

Redis Cluster

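Because flash-sale scenarios are read-heavy and write-light, Redis is ideal for caching. To avoid cache breakdown, where a failed cache node sends all traffic straight to the database, Redis should be deployed for high availability, for example as a Redis Cluster or as master-replica replication monitored by Sentinel (these are two separate deployment modes).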

Nginx as Front‑End Proxy

Nginx can handle tens of thousands of concurrent connections, forwarding requests to a Tomcat cluster, thus improving overall concurrency.

SQL Optimization

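Stock deduction can be performed with a single SQL statement using optimistic locking. The version column must be incremented in the same statement; otherwise the version check never detects a concurrent update. If the statement affects zero rows, either the item is sold out or another transaction won the race, and the caller should retry or fail the order:

update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;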
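
To make the read-then-conditionally-update cycle concrete, here is an in-memory analogue of the statement above. It is only a sketch: OptimisticStock, Row, and the retry limit are hypothetical names and policies, and an AtomicReference stands in for the miaosha_goods row, so no database is involved.

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory stand-in for one miaosha_goods row; a real system would run the SQL UPDATE.
final class OptimisticStock {
    // Immutable snapshot of the row; every update replaces the whole snapshot.
    static final class Row {
        final int stock;
        final long version;
        Row(int stock, long version) {
            this.stock = stock;
            this.version = version;
        }
    }

    private final AtomicReference<Row> row;

    OptimisticStock(int initialStock) {
        this.row = new AtomicReference<>(new Row(initialStock, 0L));
    }

    /** Plays the role of the SELECT that reads stock and version. */
    Row read() {
        return row.get();
    }

    /**
     * Plays the role of "UPDATE ... WHERE version = ? AND stock > 0".
     * Succeeds only if the row is still the snapshot the caller read.
     */
    boolean tryDeduct(Row seen) {
        if (seen.stock <= 0) {
            return false;
        }
        return row.compareAndSet(seen, new Row(seen.stock - 1, seen.version + 1));
    }

    /** Re-reads and retries on conflict, up to maxRetries extra attempts (assumed policy). */
    boolean deductWithRetry(int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Row seen = read();
            if (seen.stock <= 0) {
                return false;          // sold out, retrying cannot help
            }
            if (tryDeduct(seen)) {
                return true;           // nobody changed the row in between
            }
        }
        return false;                  // too much contention, give up
    }
}
```

A stale snapshot is rejected in the same way the SQL statement reports zero affected rows, which is the signal the caller uses to retry or fail.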

Redis Pre‑Decrement

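Before the sale starts, load the stock into Redis (e.g., redis.set(goodsId, 100)). Each order request then decrements the Redis value. A single DECR is atomic on its own, and a result below zero means the item is sold out, so the request can be rejected without touching the database. When the check and the decrement must happen as one step (for example, to keep the counter from going negative), they can be wrapped in a Lua script, which Redis executes atomically.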
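
The gatekeeping logic can be sketched without a Redis server. In the example below an AtomicLong plays the role of the Redis counter and decrementAndGet plays the role of DECR; StockGate and its method names are hypothetical, and a real system would issue the equivalent Redis commands through its client library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for Redis: one atomic counter per goods id.
final class StockGate {
    private final Map<Long, AtomicLong> stock = new ConcurrentHashMap<>();

    /** Equivalent of redis.set(goodsId, count) before the sale starts. */
    void preload(long goodsId, long count) {
        stock.put(goodsId, new AtomicLong(count));
    }

    /** Returns true if this request reserved one unit and may go on to place an order. */
    boolean tryReserve(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        if (counter == null) {
            return false;                        // item is not part of the sale
        }
        long remaining = counter.decrementAndGet();
        if (remaining < 0) {
            counter.incrementAndGet();           // undo so the counter settles back at zero
            return false;                        // sold out: reject before the database is hit
        }
        return true;
    }

    /** Current counter value, or 0 for an unknown item. */
    long remaining(long goodsId) {
        AtomicLong counter = stock.get(goodsId);
        return counter == null ? 0 : counter.get();
    }
}
```

Only requests that pass this gate continue to the database, so the number of stock-deduction statements is bounded by the stock itself rather than by the number of incoming requests.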

Rate Limiting

Rate limiting is applied on both the front end and the back end. The front end disables the purchase button for a few seconds after a click. The back end uses a token-bucket limiter (e.g., Guava's RateLimiter) to admit only a limited number of requests per second. In the first example, acquire() blocks until a permit is available:

import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Issue 1 permit per second
        RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a permit is available and returns the seconds spent waiting
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time " + waitTime);
        }
        System.out.println("Done");
    }
}

The tryAcquire method can set a timeout; if a token isn’t obtained within the window, the request is dropped.

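import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // The timeout parameter is a long, so half a second is written as 500 milliseconds
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            if (!isValid) continue; // no permit within the timeout: drop this task
            System.out.println("Task " + i + " executed");
        }
        System.out.println("End");
    }
}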
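
For readers who want to see the idea behind such a limiter, the following is a minimal token bucket written from scratch. It is a teaching sketch, not Guava's implementation: the TokenBucket class, its parameters, and the caller-supplied clock are assumptions chosen to keep the behaviour easy to follow.

```java
// Minimal token bucket; the caller may pass the current time so behaviour is deterministic.
final class TokenBucket {
    private final long capacity;            // maximum number of stored tokens (burst size)
    private final double tokensPerSecond;   // refill rate
    private double tokens;                  // tokens currently available
    private long lastRefillNanos;           // time of the last refill

    TokenBucket(long capacity, double tokensPerSecond, long startNanos) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.tokens = capacity;             // the bucket starts full
        this.lastRefillNanos = startNanos;
    }

    /** Non-blocking: takes one token if one is available at nowNanos, otherwise rejects. */
    synchronized boolean tryAcquire(long nowNanos) {
        long elapsed = nowNanos - lastRefillNanos;
        if (elapsed > 0) {
            // Add the tokens earned since the last call, never exceeding the capacity.
            double earned = elapsed * tokensPerSecond / 1_000_000_000.0;
            tokens = Math.min(capacity, tokens + earned);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    /** Convenience overload that uses the JVM's monotonic clock. */
    boolean tryAcquire() {
        return tryAcquire(System.nanoTime());
    }
}
```

With a capacity of 2 and a rate of 1 token per second, two requests pass immediately, a third is rejected, and one more request is admitted for each second that elapses. A limiter like this works inside one process; limiting across a cluster needs shared state.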

Asynchronous Order Processing

Valid orders are placed onto a message queue (e.g., RabbitMQ) for asynchronous processing, providing peak‑shaving, decoupling, and reliability. Successful orders can trigger SMS notifications; failures can be retried with compensation mechanisms.
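
The producer/consumer split can be illustrated in a single process. In this sketch a bounded BlockingQueue stands in for the message broker, and OrderPipeline, OrderRequest, and the persisted list are hypothetical; a real system would publish to RabbitMQ and write orders to the database in the consumer.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// In-process stand-in for a message queue between the request path and order creation.
final class OrderPipeline {
    static final class OrderRequest {
        final long userId;
        final long goodsId;
        OrderRequest(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    private final BlockingQueue<OrderRequest> queue;
    private final List<OrderRequest> persisted = new CopyOnWriteArrayList<>();

    OrderPipeline(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Request thread: enqueue and return at once; false means the buffer is full. */
    boolean submit(OrderRequest request) {
        return queue.offer(request);
    }

    /** Consumer step: processes up to max queued orders and returns how many it handled. */
    int drain(int max) {
        int processed = 0;
        OrderRequest request;
        while (processed < max && (request = queue.poll()) != null) {
            persisted.add(request);      // placeholder for "insert the order into the database"
            processed++;
        }
        return processed;
    }

    /** Starts a daemon worker that consumes orders at its own pace until interrupted. */
    Thread startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    persisted.add(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    List<OrderRequest> persisted() {
        return persisted;
    }

    int pending() {
        return queue.size();
    }
}
```

The request thread only pays for an enqueue, while the consumer writes orders at a rate the database can sustain; that difference in pace is what "peak shaving" refers to. The bounded capacity also gives a natural point at which to reject excess requests.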

Service Degradation

If a server crashes during the sale, a fallback service (e.g., using Hystrix) should return a friendly message instead of a hard error.
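
The fallback idea can be shown with the standard library alone. The sketch below is not Hystrix and has no circuit breaker; Degradable and callWithFallback are hypothetical names, and it assumes Java 9 or later for completeOnTimeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper: returns a friendly fallback when the primary call fails or is too slow.
final class Degradable {
    /**
     * Runs primary on another thread. If it throws, or does not finish within
     * timeoutMillis, the fallback value is returned instead of an error.
     */
    static <T> T callWithFallback(Supplier<T> primary, T fallback, long timeoutMillis) {
        return CompletableFuture.supplyAsync(primary)
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> fallback)
                .join();
    }
}
```

A caller might wrap the order call as callWithFallback(() -> placeOrder(request), "The sale is busy, please try again", 200), where placeOrder is a placeholder for the real service call. Note that a timed-out call keeps running in the background in this sketch; it is not cancelled.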

Summary

The presented architecture can sustain hundreds of thousands of concurrent requests. For tens of millions, further scaling would be required, such as database sharding, Kafka queues, and larger Redis clusters. Proper design, proactive thinking, and hands-on practice are essential for handling high-traffic flash-sale scenarios.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
