Design and Architecture of a High‑Concurrency Flash Sale (秒杀) System
This article presents a comprehensive analysis of flash‑sale (秒杀) system requirements, performance estimations, non‑functional constraints, and detailed architectural designs—including dynamic‑static separation, traffic layering, high‑availability, and stock‑deduction strategies—along with the key technologies needed to support million‑level concurrent traffic.
Requirement Analysis
Flash‑sale (秒杀) events, such as Double‑11, involve selling a limited quantity of low‑price items within a very short time window, creating three main challenges: instantaneous bursts, massive traffic, and limited inventory.
Instantaneous: For hot items, the entire stock may sell out within seconds of the activity starting.
Massive traffic: Low price attracts a huge number of users.
Limited quantity: Only a small number of items are available.
Additional functional requirements include auto‑activating the purchase button, captcha/quiz verification before ordering, accurate inventory deduction, and a fixed 10‑minute activity duration.
Performance Metrics Estimation
Based on the described scenario, three key performance indicators are estimated:
Storage capacity: Minimal, as order data volume is low.
Concurrent requests: 50 million users making 2 visits each over a 10‑minute window means 100 million requests in 600 seconds, or roughly 167,000 QPS on average; with a safety margin, the design target is 250,000 QPS.
Network bandwidth: Assuming 0.5 KB per request at the 250,000 QPS peak, about 977 Mb/s is required.
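The estimates above are simple back-of-envelope arithmetic; a quick sketch using the article's figures:

```python
# Back-of-envelope capacity estimation for the flash-sale scenario.
users = 50_000_000           # expected participants
visits_per_user = 2          # page visits per user
window_s = 10 * 60           # 10-minute activity window

avg_qps = users * visits_per_user / window_s
peak_qps = 250_000           # design target after safety margin

request_kb = 0.5             # assumed average request size in KB
bandwidth_mbps = peak_qps * request_kb * 8 / 1024  # KB/s -> Mb/s

print(round(avg_qps))        # ~167,000 QPS average
print(round(bandwidth_mbps)) # ~977 Mb/s at peak
```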
Non‑Functional Requirements
The system must satisfy high availability, high performance, and scalability to handle traffic spikes and prevent malicious attacks.
Overview Design
The design focuses on four key aspects:
Dynamic‑static separation, so cached static content is served at the edge and only user‑specific data is fetched per request.
Traffic layering (CDN, reverse proxy, backend services, database) to filter invalid requests.
High‑availability through multi‑node, stateless services.
Robust stock‑deduction mechanisms.
1. Dynamic‑Static Separation
Static data (CSS/JS, HTML, images, etc.) is cached and served via CDN, while dynamic data (user‑specific content) is generated on demand. Page‑static‑generation techniques and proxy caching improve response speed.
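To make the split concrete, here is a minimal in-process simulation of the idea: a static page shell that a CDN or proxy could cache, with only a per-request dynamic fragment generated by the backend. All names are illustrative, not from the article.

```python
# Static shell: identical for every user, so it can be cached at the CDN.
STATIC_SHELL = "<html><body><h1>Flash Sale</h1>{dynamic}</body></html>"

def render_dynamic(user_id: str, stock: int) -> str:
    # User-specific fragment: generated per request, never cached at the edge.
    return f"<div>user={user_id}, stock={stock}</div>"

def serve_page(user_id: str, stock: int) -> str:
    # The edge serves the cached shell and injects the dynamic fragment.
    return STATIC_SHELL.format(dynamic=render_dynamic(user_id, stock))

print(serve_page("u1", 100))
```

The payoff is that the expensive, high-volume part of the page never touches the backend; only the small dynamic fragment does.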
2. Traffic Layering
Four layers control traffic:
CDN layer: Cache static assets close to users.
Reverse‑proxy layer (Nginx): Use Lua scripts for countdown timers and direct cache reads for inventory.
Backend service layer: Isolate services, use independent domains, and apply rate‑limiting.
Database layer: Dedicated read‑write separated database, avoid row locks by sharding inventory IDs.
3. High Availability
The design avoids single‑node bottlenecks by making services stateless and deploying redundant nodes; for details, the article refers readers to the book “High‑Concurrency System Practice”.
4. Stock Deduction Design
Three deduction strategies are discussed: order‑time deduction, payment‑time deduction, and pre‑lock deduction. For flash‑sale scenarios, order‑time deduction is preferred, combined with queue‑based asynchronous processing to ensure fairness.
Cache‑Based Deduction
Store inventory counts in a distributed cache (e.g., Redis) to handle burst traffic, while implementing fault‑tolerance and rate‑limiting.
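With Redis, the check-and-decrement is typically made atomic (for example via a Lua script) so the count never goes negative under concurrency. Here is a minimal in-process simulation of that atomicity, using a lock to stand in for Redis's single-threaded command execution; the class and method names are illustrative:

```python
import threading

class CachedStock:
    """Simulates the atomic check-and-decrement a cache-side script provides."""

    def __init__(self, initial: int):
        self._stock = initial
        self._lock = threading.Lock()  # stands in for Redis atomicity

    def try_deduct(self) -> bool:
        with self._lock:
            if self._stock > 0:
                self._stock -= 1
                return True
            return False  # oversell prevented: stock never goes below zero

stock = CachedStock(100)
results = []

def worker():
    results.append(stock.try_deduct())

threads = [threading.Thread(target=worker) for _ in range(150)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))  # exactly 100 successes; the other 50 are rejected
```

Without the atomic check-and-decrement, two concurrent requests could both see `stock == 1` and both deduct, overselling the item.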
Asynchronous Processing
Place incoming orders into a message queue, then sequentially consume them to perform stock deduction, ensuring “first‑come‑first‑served” fairness.
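A minimal single-process sketch of this pattern: requests are enqueued in arrival order, and a sequential consumer deducts stock, so earlier orders win. User IDs and names are illustrative.

```python
import queue

# Producer side: each incoming request is accepted, enqueued, and answered
# with a "queued" response; no stock work happens on the request path.
orders = queue.Queue()
for user in ["u1", "u2", "u3", "u4", "u5"]:
    orders.put(user)

# Consumer side: a single sequential consumer deducts stock in FIFO order,
# so there is no race and "first-come-first-served" fairness holds.
stock = 3
accepted = []
while not orders.empty():
    user = orders.get()
    if stock > 0:
        stock -= 1
        accepted.append(user)

print(accepted)  # ['u1', 'u2', 'u3']: the first three users get the stock
```

In production, the queue would be a message broker and the consumer a separate service, but the fairness argument is the same: ordering is decided at enqueue time, not by who wins a database race.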
Technology Stack for Million‑Level Flash Sale
Static‑content caching (multi‑level cache, CDN)
Load balancing and reverse proxy (LVS, Nginx)
Asynchronous processing (message queues, queuing systems)
System architecture (modularization, micro‑services)
Monitoring (log monitoring, service health checks)
The article concludes with references to further reading in the book “High‑Concurrency System Practice”.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.