How to Build a High‑Performance Flash‑Sale System: Architecture, Bottlenecks, and Solutions

This article explains the key challenges of flash‑sale systems (overselling, massive concurrency, URL exposure, and database pressure) and presents a backend design built on an isolated database, Redis clustering, dynamic URLs, rate limiting, asynchronous order processing, and service degradation.

Programmer DD

1. Issues to Consider in Flash‑Sale Systems

1.1 Overselling Problem

When inventory is limited (e.g., 100 items) but requests exceed supply, overselling can severely damage business revenue; preventing overselling is the top priority.

1.2 High Concurrency

Flash sales last only a few minutes and attract massive traffic, which can overwhelm backend services and cause cache breakdown or database overload.

1.3 Interface Anti‑Scraping

Automated tools can send hundreds of requests per second; the system must filter out repeated invalid requests.

1.4 Flash‑Sale URL Exposure

Ordinary users see a disabled button before the start time; advanced users can discover the URL via browser dev tools and bypass the front‑end, so the URL must be protected.

1.5 Database Design Isolation

Running flash‑sale traffic on the same database as other services risks cascading failures; a dedicated flash‑sale database isolates the load.

1.6 Massive Request Handling

Even with Redis caching, a single node may handle only ~40k QPS, far below the potential hundreds of thousands of requests during a flash sale, requiring additional scaling strategies.

2. Design and Technical Solutions

2.1 Flash‑Sale Database Schema

A minimal schema includes a flash‑sale order table and a flash‑sale product table; additional tables for product details and user information can be linked via IDs.

Database schema diagram

2.2 Dynamic URL Generation

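To keep the flash‑sale endpoint from being known in advance, make part of the URL a token: the MD5 digest of a random string. The front‑end must fetch this token from the backend before it can place an order, and the backend validates the token on every order request.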
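A minimal sketch of issuing and validating such a token is shown here. The class name `SeckillPathService`, the `/seckill/{token}/order` URL shape, and the in-memory map are assumptions made for illustration; a real deployment would more likely keep the token in a shared cache with an expiry.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: issues and checks per-user flash-sale path tokens.
final class SeckillPathService {
    // Stand-in for a shared cache with expiry; key is "userId:goodsId".
    private final Map<String, String> issued = new ConcurrentHashMap<>();

    // Creates a random token, remembers it for this user and product, and returns it.
    String createPath(long userId, long goodsId) {
        String token = md5(UUID.randomUUID().toString() + userId + goodsId);
        issued.put(userId + ":" + goodsId, token);
        return token;
    }

    // True only if the token matches the one issued to this user for this product.
    boolean checkPath(long userId, long goodsId, String token) {
        String expected = issued.get(userId + ":" + goodsId);
        return expected != null && expected.equals(token);
    }

    // Returns the lowercase hex MD5 digest of the input.
    static String md5(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        SeckillPathService service = new SeckillPathService();
        String token = service.createPath(42L, 1001L);
        System.out.println("/seckill/" + token + "/order");
        System.out.println(service.checkPath(42L, 1001L, token));
        System.out.println(service.checkPath(42L, 1001L, "guessed"));
    }
}
```

Because the token is bound to a user and a product, a URL copied from one session does not validate for another user.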

2.3 Page Staticization

Render product details, parameters, and reviews into a static HTML page so that user requests bypass the backend and database, reducing server pressure.

2.4 Redis Cluster Upgrade

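Flash sales are read‑heavy, so serve reads from Redis. Deploy Redis as multiple nodes with Sentinel handling monitoring and automatic failover, so that the loss of a single node does not send all traffic straight to the database.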

2.5 Nginx Front‑End Proxy

Place Nginx in front of Tomcat clusters; Nginx can handle tens of thousands of concurrent connections, forwarding requests to backend servers.

2.6 SQL Optimization

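Combine the inventory check and the decrement into a single UPDATE statement guarded by a version column (optimistic locking) to avoid overselling. The version must be incremented on every successful update; otherwise the version comparison never detects a concurrent change. If the statement affects zero rows, either the stock is exhausted or another request updated the row first.

update miaosha_goods set stock = stock - 1, version = version + 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;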
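How a caller might react to the "rows affected" result can be sketched with an in-memory stand-in for the table row. Nothing here talks to a real database: `OptimisticStockRow`, its retry limit, and the thread-pool size are illustrative assumptions, and `conditionalDecrement` only mimics the semantics of a version-checked UPDATE.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for one miaosha_goods row (hypothetical; not a real database).
final class OptimisticStockRow {
    private int stock;
    private int version;

    OptimisticStockRow(int stock) {
        this.stock = stock;
    }

    synchronized int readVersion() { return version; }

    synchronized int readStock() { return stock; }

    // Mimics the conditional UPDATE and returns the "rows affected" count (0 or 1).
    synchronized int conditionalDecrement(int expectedVersion) {
        if (version != expectedVersion || stock <= 0) {
            return 0;
        }
        stock--;
        version++;
        return 1;
    }

    // Retries after a version conflict and stops as soon as stock is gone.
    boolean buy(int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            int seenVersion = readVersion();
            if (conditionalDecrement(seenVersion) == 1) return true;
            if (readStock() <= 0) return false;
        }
        return false;
    }

    // Runs `buyers` concurrent purchase attempts and returns how many succeeded.
    static int simulate(int initialStock, int buyers) {
        OptimisticStockRow row = new OptimisticStockRow(initialStock);
        AtomicInteger sold = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        CountDownLatch done = new CountDownLatch(buyers);
        for (int i = 0; i < buyers; i++) {
            pool.execute(() -> {
                try {
                    if (row.buy(1000)) sold.incrementAndGet();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return sold.get();
    }

    public static void main(String[] args) {
        System.out.println("sold = " + simulate(100, 500));
    }
}
```

However many buyers compete, the number of successful purchases never exceeds the initial stock, because every decrement is conditional on both the version and `stock > 0`.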

2.7 Redis Pre‑Decrement

Initialize stock in Redis (e.g., redis.set(goodsId, 100)). Each order request atomically decrements the Redis value; once the value drops below zero the request is rejected immediately, so only requests that win a unit of stock go on to the database.
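The control flow of pre-decrement can be sketched without a Redis server. In this illustration an `AtomicLong` stands in for the Redis counter (a real deployment would issue `DECR` against the Redis key); the class `StockPreDecrement` and its `release` method are assumptions added for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical gate in front of the database; AtomicLong stands in for a Redis counter.
final class StockPreDecrement {
    private final Map<Long, AtomicLong> counters = new ConcurrentHashMap<>();

    // Loads the sale stock before the sale starts, like redis.set(goodsId, stock).
    void init(long goodsId, long stock) {
        counters.put(goodsId, new AtomicLong(stock));
    }

    // Returns true if a unit was reserved, false if sold out or the product is unknown.
    boolean tryReserve(long goodsId) {
        AtomicLong counter = counters.get(goodsId);
        if (counter == null) return false;
        long remaining = counter.decrementAndGet(); // atomic, like Redis DECR
        if (remaining < 0) {
            counter.incrementAndGet(); // undo so the counter does not drift below zero
            return false;
        }
        return true;
    }

    // Gives a unit back, for example when the later database write fails.
    void release(long goodsId) {
        AtomicLong counter = counters.get(goodsId);
        if (counter != null) counter.incrementAndGet();
    }

    public static void main(String[] args) {
        StockPreDecrement gate = new StockPreDecrement();
        gate.init(1001L, 2);
        for (int i = 0; i < 4; i++) {
            System.out.println("request " + i + " reserved: " + gate.tryReserve(1001L));
        }
    }
}
```

Only requests for which `tryReserve` returns true would continue to the database, which keeps the bulk of losing requests away from it.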

2.8 Interface Rate Limiting

2.8.1 Front‑End Throttling

Disable the flash‑sale button for a few seconds after the first click to reduce request bursts.

2.8.2 Duplicate Request Rejection

Store a short‑lived key per user in Redis; if the key exists within the configured interval (e.g., 10 s), reject the request.
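The check can be sketched as follows. A `ConcurrentHashMap` of expiry timestamps stands in for Redis here; with Redis itself, a single `SET key value NX EX 10` gives the same "create only if absent, with expiry" behavior. The class name `DuplicateRequestGuard` and the explicit `nowMillis` parameter (used to keep the example deterministic) are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-user guard; the map stands in for short-lived Redis keys.
final class DuplicateRequestGuard {
    private final long windowMillis;
    private final Map<String, Long> expiresAt = new ConcurrentHashMap<>();

    DuplicateRequestGuard(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is accepted, false if the user's key is still alive.
    boolean allow(String userKey, long nowMillis) {
        boolean[] accepted = {false};
        // compute() runs atomically per key, so two concurrent requests cannot both pass.
        expiresAt.compute(userKey, (key, expiry) -> {
            if (expiry == null || expiry <= nowMillis) {
                accepted[0] = true;
                return nowMillis + windowMillis; // start a new window
            }
            return expiry; // key still alive: keep it and reject
        });
        return accepted[0];
    }

    public static void main(String[] args) {
        DuplicateRequestGuard guard = new DuplicateRequestGuard(10_000);
        long now = System.currentTimeMillis();
        System.out.println(guard.allow("user:42", now));          // first request
        System.out.println(guard.allow("user:42", now + 3_000));  // inside the window
        System.out.println(guard.allow("user:42", now + 10_000)); // window elapsed
    }
}
```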

2.8.3 Token‑Bucket Algorithm

Use Guava's RateLimiter to generate tokens at a fixed rate; only requests that acquire a token are processed.

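import com.google.common.util.concurrent.RateLimiter;

// Blocking variant: acquire() waits until a token is available.
public class TestRateLimiter {
    public static void main(String[] args) {
        // 1 token per second
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks and returns the time spent waiting, in seconds
            double waitTime = rateLimiter.acquire();
            System.out.println("Task " + i + " wait time " + waitTime);
        }
        System.out.println("Done");
    }
}

import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

// Non-blocking variant: tryAcquire() gives up if no token can arrive within the timeout.
public class TestRateLimiter2 {
    public static void main(String[] args) {
        // 1 token per second
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // Wait at most 0.5 s. Casting 0.5 to long would truncate it to 0,
            // so the timeout is given in milliseconds instead.
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            System.out.println("Task " + i + " valid: " + isValid);
            if (!isValid) continue; // no token: reject this request
            System.out.println("Task " + i + " executing");
        }
        System.out.println("End");
    }
}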
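To make the token-bucket idea itself concrete, here is a deliberately simplified bucket written in plain Java. It is not Guava's implementation, which smooths and schedules permits differently; the class `SimpleTokenBucket` and the caller-supplied clock value are assumptions chosen to keep the example deterministic.

```java
// Simplified token bucket for illustration only; not how Guava's RateLimiter works internally.
final class SimpleTokenBucket {
    private final double capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(double capacity, double refillPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    // Takes one token if available; never blocks.
    synchronized boolean tryAcquire(long nowNanos) {
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        if (elapsedSeconds > 0) {
            // Add the tokens earned since the last call, capped at the bucket size.
            tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SimpleTokenBucket bucket = new SimpleTokenBucket(2, 1, System.nanoTime());
        for (int i = 0; i < 4; i++) {
            System.out.println("request " + i + " allowed: " + bucket.tryAcquire(System.nanoTime()));
        }
    }
}
```

The capacity bounds the burst size and the refill rate bounds the sustained request rate, which are the two knobs a flash-sale endpoint usually needs.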

2.9 Asynchronous Order Processing

After passing rate limiting and stock checks, push the order request into a message queue (e.g., RabbitMQ) for asynchronous handling, achieving decoupling, peak‑shaving, and reliability.
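The hand-off can be sketched with a bounded in-process queue standing in for the message broker. This is only an illustration of the shape: `AsyncOrderPipeline`, `OrderRequest`, and the capacity are assumptions, and an in-memory queue does not provide the durability that a broker such as RabbitMQ offers.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical pipeline: the request thread enqueues, a worker thread creates orders.
final class AsyncOrderPipeline {
    static final class OrderRequest {
        final long userId;
        final long goodsId;

        OrderRequest(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    private final BlockingQueue<OrderRequest> queue;
    private final AtomicInteger processed = new AtomicInteger();

    AsyncOrderPipeline(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Request side: enqueue and return at once; false means the queue is full.
    boolean submit(long userId, long goodsId) {
        return queue.offer(new OrderRequest(userId, goodsId));
    }

    // Consumer side: a daemon thread drains the queue at its own pace.
    void startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    handle(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "order-worker");
        worker.setDaemon(true);
        worker.start();
    }

    private void handle(OrderRequest request) {
        // A real consumer would write the order row and decrement database stock here.
        processed.incrementAndGet();
    }

    // Polls until `expected` orders were handled or the timeout passes.
    boolean awaitProcessed(int expected, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (processed.get() < expected) {
            if (System.currentTimeMillis() > deadline) return false;
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AsyncOrderPipeline pipeline = new AsyncOrderPipeline(100);
        pipeline.startWorker();
        for (long user = 1; user <= 5; user++) {
            System.out.println("queued: " + pipeline.submit(user, 1001L));
        }
        System.out.println("all handled: " + pipeline.awaitProcessed(5, 2_000));
    }
}
```

The bounded capacity is what provides peak-shaving: when the queue is full, `submit` returns false and the caller can answer "sold out or busy" instead of piling more work onto the database.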

2.10 Service Degradation

If a service crashes during the flash sale, provide a graceful fallback (e.g., a friendly error page) and optionally switch to a standby service using circuit‑breaker tools such as Hystrix.
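The core idea behind a circuit breaker with a fallback can be sketched in a few lines. This is a hand-written illustration, not Hystrix's API: `SimpleCircuitBreaker`, its threshold, and its open duration are assumptions, and for brevity it holds a lock while calling the primary, which a production breaker would avoid.

```java
import java.util.function.Supplier;

// Hypothetical minimal circuit breaker; real libraries add half-open probing, metrics, timeouts.
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Calls primary unless the circuit is open; any failure is answered by the fallback.
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback, long nowMillis) {
        if (openedAt >= 0 && nowMillis - openedAt < openMillis) {
            return fallback.get(); // circuit open: skip the failing service entirely
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0;
            openedAt = -1; // success closes the circuit again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = nowMillis; // too many failures in a row: open the circuit
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 1_000);
        Supplier<String> broken = () -> { throw new IllegalStateException("service down"); };
        Supplier<String> friendly = () -> "The sale is very busy, please try again shortly.";
        for (int i = 0; i < 3; i++) {
            System.out.println(breaker.call(broken, friendly, System.currentTimeMillis()));
        }
    }
}
```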

3. System Architecture Diagram

Flash‑sale system architecture

4. Summary

The design presented here can sustain tens of thousands of concurrent requests. At much larger scales (hundreds of millions of requests), further measures such as database sharding, Kafka queues, and larger Redis clusters are required. Careful design, hands‑on practice, and continuous optimization are essential for building robust high‑concurrency systems.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
