How to Build a High‑Performance Flash‑Sale System: Architecture, Challenges & Solutions

This article walks through the complete flash‑sale workflow, identifies its distinguishing characteristics and the technical challenges of high concurrency, and presents a layered architecture that covers static frontend pages, request interception, queue design, database sharding, caching, optimistic locking, and anti‑cheating measures. The goal is reliability, scalability and data safety.

21CTO

1. Flash Sale Business Analysis

Normal e‑commerce flow

(1) Query product; (2) Create order; (3) Decrease inventory; (4) Update order; (5) Pay; (6) Seller ships.

Flash‑sale characteristics

(1) Low price; (2) Massive promotion; (3) Instant sell‑out; (4) Usually scheduled launch; (5) Short duration with extremely high concurrency.

2. Flash‑Sale Technical Challenges

Impact on existing services

Running flash‑sale traffic together with the main site can overload servers and cause a complete outage; the solution is to isolate the flash‑sale system on a separate domain or deployment.

High‑concurrency load on application and database

Users constantly refresh the page before the sale starts, creating massive read traffic that would hit application servers and databases; the solution is to serve a static page cached in CDN so requests never reach the app layer.

Sudden network and bandwidth increase

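Assuming a 200 KB page and 10,000 concurrent users, serving every user once transfers about 2 GB (200 KB × 10,000). If those requests arrive within one second, that is roughly 2 GB/s, or about 16 Gbit/s, of outbound bandwidth. The solution is to purchase extra bandwidth for the event and cache the page in a CDN.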

Direct order URL

If the order URL is known before the sale, users can bypass the timer; the solution is to generate a dynamic URL with a server‑side random token that becomes valid only when the sale starts.
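One way to picture the dynamic URL is the sketch below. It is only an illustration under assumptions: the class name OrderUrlGate, the "/order/" path layout and the 16‑byte token length are hypothetical, not taken from the original system. The server creates a random token up front but reveals and accepts it only once the start time has passed.

```java
import java.security.SecureRandom;
import java.time.Instant;
import java.util.Base64;
import java.util.Optional;

/** Hands out the order URL only once the sale has started (illustrative sketch). */
final class OrderUrlGate {
    private final Instant saleStart;
    private final String token; // random path segment, unknown to clients before the sale

    OrderUrlGate(Instant saleStart) {
        this.saleStart = saleStart;
        byte[] raw = new byte[16]; // assumed token size
        new SecureRandom().nextBytes(raw);
        this.token = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    /** Returns the order URL, or empty if the sale has not started yet. */
    Optional<String> orderUrl(Instant now) {
        if (now.isBefore(saleStart)) {
            return Optional.empty();
        }
        return Optional.of("/order/" + token); // hypothetical path layout
    }

    /** Server-side check applied to every incoming order request. */
    boolean accepts(String requestedToken, Instant now) {
        return !now.isBefore(saleStart) && token.equals(requestedToken);
    }
}
```

Checking the start time again in accepts() means a leaked or guessed token is still useless before the sale opens.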

Controlling the purchase button

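The button is gray before the sale and becomes active at the start. Because the page itself is static and cached, it references a small JavaScript file that carries a sale‑started flag and the order URL. When the sale begins, the server publishes a new copy of this file with the flag set and the URL filled in. The file is requested under a versioned filename so that the CDN and the browser do not serve a stale cached copy.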

Only the first submitted order should reach the order subsystem

After the first successful order, subsequent submissions are rejected and the button is disabled; limiting the number of concurrent order requests per server and using a distributed lock (e.g., Redis) helps achieve this.

Pre‑order checks

Each order server checks the number of processed requests locally (reject if >10) and also checks the global submitted order count against the total inventory before forwarding to the order subsystem.
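The two checks can be sketched as follows. This is a minimal illustration, not the original implementation: the class name PreOrderCheck is hypothetical, the local count is interpreted as requests currently in flight on this server, and a shared AtomicInteger stands in for whatever global counter (for example one kept in Redis) a real deployment would use.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Two-stage pre-order check: a per-server cap, then a global inventory cap. */
final class PreOrderCheck {
    private static final int LOCAL_LIMIT = 10; // from the text: reject if more than 10
    private final AtomicInteger localInFlight = new AtomicInteger();
    private final AtomicInteger globalSubmitted; // stand-in for a shared, cross-server counter
    private final int totalInventory;

    PreOrderCheck(AtomicInteger globalSubmitted, int totalInventory) {
        this.globalSubmitted = globalSubmitted;
        this.totalInventory = totalInventory;
    }

    /** Returns true if the request may be forwarded to the order subsystem. */
    boolean tryAdmit() {
        if (localInFlight.incrementAndGet() > LOCAL_LIMIT) {
            localInFlight.decrementAndGet();
            return false; // this server is already handling enough requests
        }
        if (globalSubmitted.incrementAndGet() > totalInventory) {
            globalSubmitted.decrementAndGet();
            localInFlight.decrementAndGet();
            return false; // submitted orders already cover the whole inventory
        }
        return true;
    }

    /** Call when a forwarded request has finished processing on this server. */
    void release() {
        localInFlight.decrementAndGet();
    }
}
```

Incrementing first and rolling back on failure keeps each check atomic without a lock, at the cost of a counter that can briefly overshoot its limit.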

3. Flash‑Sale Architecture Principles

Intercept requests as early as possible

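Traditional flash‑sale systems choke because too many requests reach the backend data layer, where concurrent reads and writes contend for the same locks and responses slow down until most requests time out. Moving most checks upstream and rejecting requests as early as possible keeps that load off the data layer.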

Read‑many/write‑few pattern – heavy cache usage

Flash‑sale is a classic read‑heavy scenario (99.9% reads, 0.1% writes); caching dramatically reduces database pressure.

4. Flash‑Sale Architecture Design

4.1 Frontend layer

A static product page with a countdown timer is served from CDN; static resources are split and cached globally to avoid bandwidth bottlenecks.

4.2 Site layer

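Rate‑limit per UID and per item: let only one request per UID through every few seconds, and answer repeated queries for the same item from a response cached for a few seconds. This filters out the majority of traffic before it reaches the service layer.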
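A per‑UID limit of this kind might look like the sketch below. The class name UidThrottle and the choice to pass the current time in as a parameter (which keeps the example deterministic) are assumptions made for illustration. A production version would also need to evict old entries, which this sketch omits.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Lets at most one request per UID through in each time window. */
final class UidThrottle {
    private final long windowMillis;
    // UID -> time (ms) at which that UID's last request was let through
    private final ConcurrentHashMap<String, Long> lastPassed = new ConcurrentHashMap<>();

    UidThrottle(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Returns true if this request should continue to the service layer. */
    boolean allow(String uid, long nowMillis) {
        AtomicBoolean passed = new AtomicBoolean(false);
        // compute() runs atomically per key, so two concurrent requests
        // from the same UID cannot both pass in the same window.
        lastPassed.compute(uid, (key, last) -> {
            if (last == null || nowMillis - last >= windowMillis) {
                passed.set(true);
                return nowMillis;
            }
            return last;
        });
        return passed.get();
    }
}
```

Requests for which allow() returns false would be answered at the site layer, for instance with the cached page, instead of being forwarded.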

4.3 Service layer

Requests that pass the site layer are queued; a limited‑size request queue (e.g., a ConcurrentLinkedQueue guarded by an explicit size check, since that class is unbounded by itself) holds high‑volume read requests while write requests are processed sequentially.

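package seckill;

import org.apache.http.HttpRequest;

/** Rejects requests up front once the inventory is known to be gone. */
public class PreProcessor {
    // True while stock may remain (the flag name is kept as in the original listing).
    // volatile so that every request thread sees the flag flip.
    private static volatile boolean reminds = true;

    // Asks the backend (RPC, defined elsewhere) only until it first reports "sold out".
    public static boolean checkReminds() {
        if (reminds) {
            if (!RPC.checkReminds()) {
                reminds = false;
            }
        }
        return reminds;
    }

    // Queues the request while stock remains; otherwise drops it without a backend call.
    public static void preProcess(HttpRequest request) {
        if (checkReminds()) {
            RequestQueue.queue.add(request);
        } else {
            // reject request
        }
    }
}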
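Section 4.3 calls for a request queue of limited size. ConcurrentLinkedQueue has no capacity limit, so one possible way to get the bound is a queue class that has one built in. The sketch below uses ArrayBlockingQueue; the wrapper name BoundedRequestQueue is hypothetical and the request type is left generic.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** A request queue that refuses new entries once it is full. */
final class BoundedRequestQueue<T> {
    private final BlockingQueue<T> queue;

    BoundedRequestQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking enqueue: returns false when full so the caller can reject at once. */
    boolean tryEnqueue(T request) {
        return queue.offer(request);
    }

    /** Returns the oldest queued request, or null if the queue is empty. */
    T next() {
        return queue.poll();
    }

    int size() {
        return queue.size();
    }
}
```

Using offer() rather than put() matters here: a full queue should cause an immediate rejection, not a blocked request thread.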

4.4 Database module

An ArrayBlockingQueue stores potentially successful orders; a single‑threaded worker drains the queue and writes to the DB, checking inventory count to stop when sold out.

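package seckill;

import java.util.concurrent.ArrayBlockingQueue;

public class DB {
    // Remaining inventory; only the single worker thread that runs bid() modifies it.
    public static int count = 10;
    // Bounded queue of candidate orders (BidInfo is defined elsewhere).
    public static ArrayBlockingQueue<BidInfo> bids = new ArrayBlockingQueue<>(10);

    // Drains queued bids until the queue is empty or the stock is sold out.
    public static void bid() {
        BidInfo info;
        while (count > 0 && (info = bids.poll()) != null) {
            // insert bid, update count
            count--;
        }
    }
}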

4.5 Database design

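Use sharding (by range, by hash, or through a router service) to spread data across databases, and replication groups within each shard to achieve high availability, read scalability and write redundancy. A dual‑master setup, in which a standby “shadow master” mirrors the active master and takes over when it fails, provides failover; because reads and writes both go to the active master, clients do not see master‑slave replication lag.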
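The hash and range strategies can be illustrated with a small routing helper. Everything here is an assumption made for the example: the class name ShardRouter, sharding on a numeric user ID, non‑negative IDs for the range variant, and the rule that the last shard absorbs IDs beyond the configured ranges.

```java
/** Maps a user ID to a shard index using hash or range sharding. */
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Hash sharding: spreads IDs evenly; floorMod keeps the result non-negative. */
    int byHash(long userId) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    /** Range sharding: each shard owns rangeSize consecutive IDs (IDs assumed >= 0). */
    int byRange(long userId, long rangeSize) {
        // IDs past the last configured range fall into the final shard (assumed policy).
        return (int) Math.min(userId / rangeSize, (long) shardCount - 1);
    }
}
```

Hash sharding balances load but makes adding shards expensive, because most keys move. Range sharding is easy to extend but can concentrate hot, recently created IDs on one shard.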

4.6 Scaling read performance

Instead of adding read replicas, employ a large cache layer (Redis/Memcached) with short TTL; write‑through or write‑behind strategies keep cache and DB consistent.
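A write‑through cache with a short TTL can be sketched in memory as follows. This is not a Redis or Memcached client: a plain Map stands in for the database, the class name WriteThroughTtlCache is hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory illustration of a write-through cache whose entries expire after a TTL. */
final class WriteThroughTtlCache {
    private static final class Entry {
        final int value;
        final long expiresAt;
        Entry(int value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Integer> database; // stand-in for the real database
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    WriteThroughTtlCache(Map<String, Integer> database, long ttlMillis) {
        this.database = database;
        this.ttlMillis = ttlMillis;
    }

    /** Write-through: update the database first, then refresh the cache entry. */
    void put(String key, int value, long nowMillis) {
        database.put(key, value);
        cache.put(key, new Entry(value, nowMillis + ttlMillis));
    }

    /** Serves from the cache while the entry is fresh; otherwise reloads from the database. */
    Integer get(String key, long nowMillis) {
        Entry entry = cache.get(key);
        if (entry != null && nowMillis < entry.expiresAt) {
            return entry.value;
        }
        Integer fromDb = database.get(key);
        if (fromDb == null) {
            cache.remove(key);
            return null;
        }
        cache.put(key, new Entry(fromDb, nowMillis + ttlMillis));
        return fromDb;
    }
}
```

The short TTL bounds how long a reader can see a stale value when the database is changed by a path that bypasses the cache.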

4.7 Handling massive concurrency

Design request interfaces to be ultra‑fast, use in‑memory stores for hot data, and protect the system with overload safeguards at the CGI entry point.
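An overload safeguard at the entry point can be as simple as a cap on concurrently handled requests. The sketch below is one possible form, using java.util.concurrent.Semaphore; the class name OverloadGuard and the idea of returning a prepared "busy" response are assumptions for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Caps the number of requests handled at the same time; excess requests fail fast. */
final class OverloadGuard {
    private final Semaphore permits;

    OverloadGuard(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the handler if capacity is available, otherwise returns busyResponse at once. */
    <T> T handle(Supplier<T> handler, T busyResponse) {
        if (!permits.tryAcquire()) {
            return busyResponse; // shed load instead of queueing behind slow requests
        }
        try {
            return handler.get();
        } finally {
            permits.release();
        }
    }
}
```

Failing fast keeps response times bounded for the requests that are admitted, which is usually preferable to letting every request slow down together.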

5. Anti‑Cheating Measures

Single account flood

Limit each account to one active request using Redis flags or a per‑account queue.
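The flag idea can be shown with an in‑memory stand‑in. In a real deployment the flag would live in Redis (for example set with a set‑if‑absent operation and an expiry) so that all servers share it; the class name AccountGate and the local concurrent set used here are assumptions made only to keep the example self‑contained.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Allows at most one in-flight request per account (in-memory stand-in for a Redis flag). */
final class AccountGate {
    private final Set<String> active = ConcurrentHashMap.newKeySet();

    /** Returns true if the account had no active request and is now marked active. */
    boolean tryEnter(String accountId) {
        return active.add(accountId); // add() is atomic and returns false if already present
    }

    /** Clears the flag once the account's request has been processed. */
    void leave(String accountId) {
        active.remove(accountId);
    }
}
```

A request that fails tryEnter() is dropped immediately, so a script that floods from one account gains nothing over a single click.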

Multiple accounts (zombie accounts)

Detect high request rates per IP, present captchas or block the IP; also raise participation thresholds (e.g., account level) to filter low‑quality accounts.
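Per‑IP rate detection could be sketched with a fixed‑window counter that escalates from allowing, to demanding a captcha, to blocking. The class name IpRateMonitor, the two thresholds and the fixed‑window approach are illustrative assumptions; real systems often use sliding windows and shared storage.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Counts requests per IP in fixed windows and escalates as the count grows. */
final class IpRateMonitor {
    enum Verdict { ALLOW, CAPTCHA, BLOCK }

    private static final class Window {
        long start;
        int count;
    }

    private final long windowMillis;
    private final int captchaThreshold;
    private final int blockThreshold;
    private final ConcurrentHashMap<String, Window> windows = new ConcurrentHashMap<>();

    IpRateMonitor(long windowMillis, int captchaThreshold, int blockThreshold) {
        this.windowMillis = windowMillis;
        this.captchaThreshold = captchaThreshold;
        this.blockThreshold = blockThreshold;
    }

    /** Records one request from the IP and returns how it should be treated. */
    Verdict record(String ip, long nowMillis) {
        int[] seen = new int[1];
        windows.compute(ip, (key, w) -> {
            if (w == null || nowMillis - w.start >= windowMillis) {
                w = new Window(); // start a new counting window
                w.start = nowMillis;
            }
            w.count++;
            seen[0] = w.count;
            return w;
        });
        if (seen[0] > blockThreshold) {
            return Verdict.BLOCK;
        }
        if (seen[0] > captchaThreshold) {
            return Verdict.CAPTCHA;
        }
        return Verdict.ALLOW;
    }
}
```

Because many legitimate users can share one IP address, the captcha step gives real users a way through before the IP is blocked outright.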

IP rotation attacks

When requests mimic legitimate users, behavioral analytics and data‑mining are needed to identify and block malicious patterns.

6. Data Safety Under High Concurrency

Over‑selling problem

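Over‑selling happens when several concurrent requests read the same remaining inventory, each concludes that stock is available, and all of them place an order, so more orders succeed than there are items. There are three common remedies. Pessimistic locking serializes access, but it is not suitable for high QPS because waiting requests pile up. FIFO queues process requests in order, but they risk exhausting memory when requests arrive faster than the queue drains. Optimistic locking with version numbers (e.g., Redis WATCH) applies an update only if the version is unchanged since it was read, so conflicting writers fail and each unit of stock is decremented by only one order.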
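The optimistic approach can be demonstrated in memory with a compare‑and‑set on a versioned record. This sketch is not Redis WATCH or a SQL "update where version = ?" statement, only the same idea expressed with java.util.concurrent.atomic.AtomicReference; the class name VersionedStock is hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Stock counter protected by optimistic locking on a versioned record. */
final class VersionedStock {
    /** Immutable snapshot: remaining units plus a version that grows with every write. */
    private static final class Row {
        final int remaining;
        final long version;
        Row(int remaining, long version) {
            this.remaining = remaining;
            this.version = version;
        }
    }

    private final AtomicReference<Row> row;

    VersionedStock(int initial) {
        this.row = new AtomicReference<>(new Row(initial, 0));
    }

    /** Tries to buy one unit; retries on write conflicts, fails only when sold out. */
    boolean buy() {
        while (true) {
            Row seen = row.get();
            if (seen.remaining <= 0) {
                return false; // sold out
            }
            Row next = new Row(seen.remaining - 1, seen.version + 1);
            // The write succeeds only if nobody replaced the record since it was read.
            if (row.compareAndSet(seen, next)) {
                return true;
            }
        }
    }

    int remaining() {
        return row.get().remaining;
    }

    long version() {
        return row.get().version;
    }
}
```

No thread ever waits on a lock: a losing writer simply rereads the record and tries again, and the stock can never drop below zero however many buyers compete.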

7. Summary

Flash sales and rush‑purchase events are typical high‑concurrency scenarios. They require isolation, aggressive caching, request throttling, queue‑based processing, sharding, replication, optimistic locking and anti‑cheating mechanisms to achieve reliability, scalability and data consistency.
