Designing a Robust Flash‑Sale (秒杀) System: Architecture, High Concurrency Handling, and Rate‑Limiting Strategies

This article examines the challenges of building a flash‑sale system—such as overselling, massive concurrent requests, URL exposure, and database pressure—and presents a comprehensive backend design that includes dedicated databases, dynamic URLs, static pages, Redis clustering, Nginx load balancing, SQL optimization, token‑bucket rate limiting, asynchronous order processing, and service degradation techniques.

IT Architects Alliance

Flash‑sale (秒杀) systems, like those used by JD, Taobao, or Xiaomi, attract huge traffic in a very short time, creating problems such as overselling, high concurrency, request flooding, URL leakage, and database overload. This article explores these issues and proposes a robust backend design.

Key Problems to Consider

Overselling: Limited stock (e.g., 100 items) can be sold out multiple times if not controlled.

High Concurrency: Millions of users may request within minutes, risking cache stampede and database collapse.

Interface Abuse: Automated scripts can send hundreds of requests per second; safeguards are needed.

URL Exposure: Users can discover the sale URL via browser dev tools and bypass front‑end controls.

Database Impact: A flash‑sale sharing the same DB with other services can cause cascading failures.

Massive Request Volume: Even a powerful Redis instance (~40k QPS) may be insufficient for spikes of hundreds of thousands of QPS.

Design and Technical Solutions

Separate Flash‑Sale Database

Use an isolated database with at least two tables: miaosha_order for orders and miaosha_goods for goods. Additional tables for product details and user information are recommended.

Dynamic Flash‑Sale URL

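Do not hard‑code the purchase URL. Shortly before the sale, the front‑end asks the back‑end for a path token, which the back‑end generates as the MD5 hash of a random string and remembers. The purchase request must then carry that token, and the back‑end validates it before allowing the purchase, so a URL copied from browser dev tools ahead of time is useless.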
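The issue-then-validate flow could be sketched roughly as follows. This is a minimal illustration, not the article's actual implementation: the class name SalePathService, the userId:goodsId key layout, the single-use rule, and the in-memory map (standing in for a shared cache such as Redis) are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper that issues and validates per-user flash-sale path tokens.
class SalePathService {
    // Stand-in for a shared cache such as Redis; key = userId + ":" + goodsId.
    private final Map<String, String> issued = new ConcurrentHashMap<>();

    // Called shortly before the sale; the result becomes part of the purchase URL.
    String createPath(long userId, long goodsId) {
        String path = md5Hex(UUID.randomUUID().toString());
        issued.put(userId + ":" + goodsId, path);
        return path;
    }

    // Called by the purchase endpoint; a token is accepted once (an assumption).
    boolean validatePath(long userId, long goodsId, String path) {
        return path != null && issued.remove(userId + ":" + goodsId, path);
    }

    // Hex-encoded MD5 of the input string.
    static String md5Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

The unpredictability here comes from the random input, not from MD5 itself; the hash only turns it into a fixed-length, URL-safe string.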

Static Page Rendering

Render product description, parameters, transaction records, images, and reviews into a static HTML page (e.g., via FreeMarker) so that user requests bypass the application server and database, reducing load.

Redis Cluster

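Deploy Redis as a cluster (or at least as a master‑replica group managed by Sentinel) rather than as a single instance, so that the cache layer has enough capacity for the spike and survives the loss of a node instead of letting requests fall through to the database.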

Nginx Front‑End

Place Nginx in front of Tomcat clusters; Nginx can handle tens of thousands of concurrent connections, forwarding them to the back‑end pool.

SQL Optimization

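Combine the stock check and the decrement into a single UPDATE statement whose WHERE clause requires remaining stock, optionally with optimistic locking on a version field. A separate SELECT followed by an UPDATE leaves a window in which two requests can both see the last item and both decrement it, which is how overselling happens.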
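To make the conditional update concrete, here is an in-memory model of it in Java. It is only a sketch: the stock and version column names, the GoodsRow class, and the retry loop are assumptions, and the synchronized method merely stands in for the atomicity a database gives a single UPDATE.

```java
// In-memory model of one miaosha_goods row; the column names are assumptions.
class GoodsRow {
    private int stock;
    private int version;

    GoodsRow(int stock) {
        this.stock = stock;
    }

    synchronized int stock() { return stock; }

    synchronized int version() { return version; }

    // Models: UPDATE miaosha_goods SET stock = stock - 1, version = version + 1
    //         WHERE id = ? AND version = ? AND stock > 0
    // Returns the affected row count: 1 on success, 0 otherwise.
    synchronized int decrementIfUnchanged(int expectedVersion) {
        if (version != expectedVersion || stock <= 0) {
            return 0;
        }
        stock--;
        version++;
        return 1;
    }
}

class OptimisticLockDemo {
    // Re-read the version and retry a bounded number of times, as a caller might.
    static boolean buy(GoodsRow row, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            int seenVersion = row.version();
            if (row.decrementIfUnchanged(seenVersion) == 1) {
                return true;
            }
            if (row.stock() <= 0) {
                return false; // sold out, retrying cannot help
            }
        }
        return false; // too much contention; let the client retry later
    }
}
```

Because the check and the write happen in one step, stock can never drop below zero; a caller whose version is stale simply sees zero affected rows.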

Redis Pre‑Decrement

Initialize stock in Redis (e.g., redis.set(goodsId, 100)) and atomically decrement using Lua scripts to ensure consistency, while handling cancellations by incrementing back.
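The check-and-decrement could look something like the sketch below. The Lua script is one plausible form of the atomic decrement, not the article's own, and the StockCache class is a hypothetical in-memory stand-in with the same contract so that the example runs without a Redis server.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for the Redis stock counter.
class StockCache {
    // A script a Redis client could send with EVAL (one key argument).
    // Returns the remaining stock, or -1 if the key is missing or sold out.
    static final String DECR_IF_POSITIVE_LUA =
            "local stock = tonumber(redis.call('GET', KEYS[1])) "
          + "if stock == nil or stock <= 0 then return -1 end "
          + "return redis.call('DECR', KEYS[1])";

    private final ConcurrentHashMap<String, AtomicLong> store = new ConcurrentHashMap<>();

    // Equivalent of redis.set(goodsId, 100) before the sale starts.
    void set(String key, long stock) {
        store.put(key, new AtomicLong(stock));
    }

    // Same contract as the script above, implemented with a compare-and-set loop.
    long decrementIfPositive(String key) {
        AtomicLong stock = store.get(key);
        if (stock == null) {
            return -1;
        }
        while (true) {
            long current = stock.get();
            if (current <= 0) {
                return -1;
            }
            if (stock.compareAndSet(current, current - 1)) {
                return current - 1;
            }
        }
    }

    // Cancellation or payment timeout: give one unit back.
    long restore(String key) {
        AtomicLong stock = store.get(key);
        return stock == null ? -1 : stock.incrementAndGet();
    }
}
```

Only requests that receive a non-negative result should go on to the database, so the database sees at most as many decrements as there are items.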

Rate Limiting

Implement multiple layers:

Front‑end throttling: Disable the purchase button for a few seconds after a click.

Per‑user repeat‑request limit: Use Redis key expiration (e.g., 10 s) to reject rapid duplicate requests.

Token‑bucket algorithm: Use Guava's RateLimiter to generate tokens at a controlled rate.
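The per‑user repeat‑request limit from the list above can be pictured with a small in-memory model. In Redis this is typically a single SET with the NX and EX options; the sketch below only imitates that behavior, and the RepeatRequestGuard class, the key layout, and the injected clock are assumptions made so the example is self-contained.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical guard that rejects a user's repeat requests within a time window.
class RepeatRequestGuard {
    private final ConcurrentHashMap<String, Long> expiresAt = new ConcurrentHashMap<>();
    private final long windowMillis;
    private final LongSupplier clock; // injected so the window can be tested

    RepeatRequestGuard(long windowMillis, LongSupplier clock) {
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    // Models Redis "SET key 1 NX EX <seconds>": true only if no live key exists.
    boolean tryEnter(long userId, long goodsId) {
        String key = "miaosha:req:" + userId + ":" + goodsId; // assumed key layout
        long now = clock.getAsLong();
        boolean[] accepted = {false};
        // compute() runs atomically per key in ConcurrentHashMap.
        expiresAt.compute(key, (k, expiry) -> {
            if (expiry == null || expiry <= now) {
                accepted[0] = true;
                return now + windowMillis;
            }
            return expiry; // still inside the window: keep the old expiry
        });
        return accepted[0];
    }
}
```

Unlike Redis, this model never evicts expired keys, so it is suitable for illustration only.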

Example of a simple token‑bucket limiter:

import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // 1 token per second
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks until a token is available and returns the seconds spent waiting
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time " + waitTime);
        }
        System.out.println("Execution finished");
    }
}

Running this shows the first task proceeds immediately, while each subsequent task blocks for roughly one second until the next token is generated. No request is rejected; they are only slowed down.

A stricter version using tryAcquire with a timeout rejects tasks that cannot obtain a token quickly:

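import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        // 1 token per second
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // Wait at most 500 ms for a token; note that (long) 0.5 would truncate to 0
            long timeoutMillis = 500;
            boolean isValid = rateLimiter.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            System.out.println("Task " + i + " valid: " + isValid);
            if (!isValid) continue; // no token within the timeout: drop the request
            System.out.println("Task " + i + " executing");
        }
        System.out.println("Finished");
    }
}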

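Because the loop issues all ten requests in far less than a second, only the first obtains a token. The rest are rejected immediately rather than queued, since the next token would not arrive within the timeout. This is the behavior a flash sale needs under extreme load: excess requests are dropped early instead of piling up.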
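For readers who want to see what a token bucket does internally, here is a small hand-written version. It is a sketch of the general algorithm and not how Guava implements RateLimiter; the class name, the explicit nowNanos parameter, and the choice to start with a full bucket are assumptions made for clarity.

```java
// Minimal token bucket: refills continuously, holds at most `capacity` tokens.
class SimpleTokenBucket {
    private final double capacity;
    private final double tokensPerSecond;
    private double tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(double capacity, double tokensPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.tokens = capacity; // start full (an assumption of this sketch)
        this.lastRefillNanos = nowNanos;
    }

    // Non-blocking: takes one token if available, otherwise rejects.
    synchronized boolean tryAcquire(long nowNanos) {
        if (nowNanos > lastRefillNanos) {
            // Add the tokens earned since the last call, capped at capacity.
            double earned = (nowNanos - lastRefillNanos) * tokensPerSecond / 1_000_000_000.0;
            tokens = Math.min(capacity, tokens + earned);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In production code the caller would pass System.nanoTime(); taking the time as a parameter just makes the refill arithmetic easy to follow. The capacity bounds how large a burst is let through, while the refill rate bounds the sustained throughput.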

Asynchronous Order Processing

After rate limiting and stock verification, push valid orders to a message queue (e.g., RabbitMQ) for asynchronous processing, which decouples order creation from the request thread and provides peak‑shaving.
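The hand-off between the request thread and the order consumer might be sketched as below. A bounded in-process BlockingQueue stands in for the message broker here, so this shows only the shape of the flow and not a RabbitMQ client; OrderQueueDemo, OrderMessage, and the synchronous drain() method are hypothetical.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;

// In-process stand-in for "publish to a queue, consume asynchronously".
class OrderQueueDemo {
    // Hypothetical message payload.
    static final class OrderMessage {
        final long userId;
        final long goodsId;

        OrderMessage(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    private final BlockingQueue<OrderMessage> queue;
    final List<String> createdOrders = new CopyOnWriteArrayList<>();

    OrderQueueDemo(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Request thread: enqueue and return at once; false means the queue is full.
    boolean submit(long userId, long goodsId) {
        return queue.offer(new OrderMessage(userId, goodsId));
    }

    // Consumer side: create orders at its own pace; returns how many were handled.
    int drain() {
        int handled = 0;
        OrderMessage message;
        while ((message = queue.poll()) != null) {
            // Stand-in for inserting a row into miaosha_order.
            createdOrders.add(message.userId + ":" + message.goodsId);
            handled++;
        }
        return handled;
    }
}
```

In a real deployment drain() would run continuously on consumer threads or in a separate service, and the user would poll for the order result after receiving a "queued" response.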

Service Degradation

Use circuit‑breaker tools such as Hystrix to provide fallback responses when a service instance fails, ensuring a graceful user experience instead of hard crashes.
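The idea behind a circuit breaker can be shown in a few lines of plain Java. This is a simplified illustration and not the Hystrix API: the SimpleCircuitBreaker class, its consecutive-failure threshold, and the injected clock are all assumptions.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Simplified circuit breaker: opens after N consecutive failures,
// serves the fallback while open, and allows a trial call after a cool-down.
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;
    private int consecutiveFailures;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // Holding the lock during the call keeps the sketch short; real tools avoid this.
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        long now = clock.getAsLong();
        if (openedAt >= 0 && now - openedAt < openMillis) {
            return fallback.get(); // open: skip the failing service entirely
        }
        try {
            T result = action.get(); // closed, or a trial call after the cool-down
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = now; // trip (or re-trip) the breaker
            }
            return fallback.get();
        }
    }
}
```

For a flash sale the fallback would typically be a friendly "busy, please try again" response rather than an error page.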

Conclusion

The presented architecture (isolated flash‑sale DB, dynamic URLs, static page rendering, Redis cluster, Nginx load balancing, optimized SQL, token‑bucket rate limiting, asynchronous queuing, and graceful degradation) is designed to handle hundreds of thousands of concurrent requests. For larger scales (tens of millions), further measures like database sharding, Kafka queues, and larger Redis clusters would be required.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
