How to Build a Million-User Flash Sale System: Architecture, Scaling & Performance

This article details the end‑to‑end design of a high‑traffic flash‑sale (秒杀) system, covering functional and non‑functional requirements, performance estimations, dynamic/static separation, traffic‑layer control, high‑availability strategies, inventory‑deduction methods, and the key technologies needed for a resilient, scalable backend.

Programmer DD

A large e‑commerce partner collaborated with our company to launch a flash‑sale event, expecting up to 50 million users and a sudden traffic surge that the existing architecture could not handle.

When the event started, traffic far exceeded expectations. CPU, memory, and the database were overloaded, and even normal app access was blocked.

Final solution: The team rapidly provisioned more than 120 cloud servers to sustain the load.

What is a flash sale ("秒杀")?

Flash sales are short‑duration, limited‑quantity promotions used by platforms like JD and Taobao to attract users with low‑price items.

Functional challenges

Instant: Hot items sell out within a few seconds of the sale opening.

Huge traffic: Low price draws massive user interest.

Limited quantity: Only a small number of items are available.

Additional functional requirements include:

Automatic activation of the purchase button when the sale begins.

Captcha or quiz before order submission to prevent abuse.

Accurate inventory deduction (no over‑ or under‑deduction).

Sale duration of 10 minutes.
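The button-activation and sale-duration requirements above can be sketched as a small state check driven by server time. This is only an illustration: the names `sale_state` and `button_enabled` are hypothetical, and it assumes the server clock, not the client clock, decides whether the sale is open.

```python
from datetime import datetime, timedelta, timezone

# Sale length taken from the requirement above.
SALE_DURATION = timedelta(minutes=10)


def sale_state(now: datetime, start: datetime) -> str:
    """Return 'pending', 'open', or 'closed' for the given server time."""
    if now < start:
        return "pending"
    if now < start + SALE_DURATION:
        return "open"
    return "closed"


def button_enabled(now: datetime, start: datetime) -> bool:
    """The purchase button is active only while the sale is open."""
    return sale_state(now, start) == "open"


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # example start time
    print(sale_state(start - timedelta(seconds=5), start))
    print(button_enabled(start + timedelta(minutes=3), start))
```

Deriving the state on the server keeps a client with a wrong or tampered clock from enabling the button early.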

Performance metrics estimation

Storage capacity

Since flash‑sale items are few, order storage requirements are minimal.

Concurrency

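Assuming 50 million users each visit twice, the sale receives about 100 million requests. Spread evenly over the 10‑minute window, that averages roughly 167 k requests per second; the peak concurrency is estimated at about 250 k requests per second.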

Network bandwidth

With an estimated 0.5 KB per request, the required bandwidth at peak is roughly 977 Mb/s (250,000 requests/s × 0.5 KB × 8 bits, converted at 1 Mb = 1,024 Kb).
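The estimates above can be reproduced with a few lines of arithmetic. The inputs come from the article; treating 1 Mb as 1,024 Kb is an assumption, chosen because it is the conversion that yields the stated 977 Mb/s.

```python
USERS = 50_000_000
VISITS_PER_USER = 2
SALE_SECONDS = 10 * 60      # 10-minute sale window
PEAK_RPS = 250_000          # the article's peak estimate
REQUEST_KB = 0.5            # estimated size of one request

total_requests = USERS * VISITS_PER_USER
average_rps = total_requests / SALE_SECONDS

# Peak bandwidth: KB/s -> Kb/s (multiply by 8) -> Mb/s (divide by 1024).
peak_mbps = PEAK_RPS * REQUEST_KB * 8 / 1024

print(f"total requests: {total_requests:,}")
print(f"average rps:    {average_rps:,.0f}")
print(f"peak bandwidth: {peak_mbps:,.1f} Mb/s")
```

The gap between the even-spread average and the peak figure reflects that requests cluster around the moment the sale opens rather than arriving uniformly.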

Non‑functional requirements

High availability: Service must remain reachable throughout the sale.

High performance: Users should experience minimal latency.

Scalability: System should gracefully handle traffic beyond estimates.

Design Overview

1. Dynamic/Static Separation

Static resources (CSS, JS, HTML, images) are served separately from dynamic data (user‑specific content), improving cacheability and performance.

2. Traffic Layering

Traffic is filtered through four layers—CDN, reverse‑proxy (Nginx), backend services, and database—to drop invalid requests early.

3. High Availability

Avoid single‑node designs; use stateless services, redundant instances, and failover mechanisms.

4. Inventory Deduction

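Order‑time deduction: Stock is reduced as soon as the order is placed. This is simple, but unpaid orders can tie up inventory.

Payment‑time deduction: Stock is reduced only after payment succeeds. Unpaid orders no longer hold inventory, but buyers may find the item sold out when they try to pay.

Pre‑deduction with lock‑release after timeout: Stock is reserved when the order is placed and released if payment does not arrive within a timeout.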
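The third option, pre-deduction with release after a timeout, might look like the following minimal sketch. The class `ReservationStock` and its methods are hypothetical names, the store is a single-process dictionary rather than a real database or cache, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
class ReservationStock:
    """Pre-deduct stock at order time; return it if payment never arrives."""

    def __init__(self, stock: int, hold_seconds: float):
        self.available = stock
        self.hold_seconds = hold_seconds
        self.holds = {}  # order_id -> time at which the hold expires

    def release_expired(self, now: float) -> None:
        """Give back the stock of every hold whose timeout has passed."""
        expired = [oid for oid, expiry in self.holds.items() if expiry <= now]
        for oid in expired:
            del self.holds[oid]
            self.available += 1

    def reserve(self, order_id: str, now: float) -> bool:
        """Reserve one unit for the order, or fail if none is left."""
        self.release_expired(now)
        if self.available <= 0 or order_id in self.holds:
            return False
        self.available -= 1
        self.holds[order_id] = now + self.hold_seconds
        return True

    def pay(self, order_id: str, now: float) -> bool:
        """Confirm the order; fails if the hold is missing or has expired."""
        self.release_expired(now)
        return self.holds.pop(order_id, None) is not None


if __name__ == "__main__":
    stock = ReservationStock(stock=1, hold_seconds=30)
    print(stock.reserve("order-a", now=0))    # takes the only unit
    print(stock.reserve("order-b", now=10))   # nothing left yet
    print(stock.reserve("order-b", now=31))   # order-a timed out, unit is free
```

A production version would need the reserve and release steps to be atomic across servers, which this single-threaded sketch does not attempt.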

Detailed Design

Dynamic/Static Separation Design

Static files are cached at the edge, while dynamic data is generated on demand. Page staticization can pre‑render dynamic data into static pages for faster delivery.
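As a rough illustration of page staticization, the sketch below pre-renders the rarely changing item data into an HTML string that could be written to a file and pushed to the edge. The template, the `/static/sale.js` path, and the function name `render_static_page` are all hypothetical; the assumption is that stock and user-specific data are fetched separately by the script at view time.

```python
from html import escape
from string import Template

# Hypothetical page template: item details are baked in, while the stock
# placeholder is left empty for a script to fill from a dynamic endpoint.
PAGE = Template(
    "<html><body><h1>$title</h1><p>Price: $price</p>"
    '<div id="stock"></div><script src="/static/sale.js"></script>'
    "</body></html>"
)


def render_static_page(item: dict) -> str:
    """Pre-render item data that rarely changes into cacheable HTML."""
    return PAGE.substitute(
        title=escape(item["title"]),
        price=escape(item["price"]),
    )


if __name__ == "__main__":
    print(render_static_page({"title": "Phone X", "price": "9.90"}))
```

Because the rendered page contains nothing user-specific, every visitor can be served the same cached copy.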

Traffic Layering Design

The CDN caches static assets. The reverse proxy (Nginx) handles rate limiting and Lua‑based countdown timers. Backend services process business logic. The database layer uses read/write splitting and reduces row‑lock contention.
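To make the rate-limiting idea concrete, here is a small token-bucket limiter. It is a stand-in, not the article's configuration: Nginx's own `limit_req` module is documented as a leaky-bucket limiter, and the class name `TokenBucket` and its parameters are assumptions made for this example.

```python
class TokenBucket:
    """Allow up to `rate` requests per second with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        """Refill by elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=2, capacity=2)
    for t in (0.0, 0.0, 0.0, 0.5):
        print(t, bucket.allow(t))
```

Requests rejected here never reach the backend services or the database, which is the point of filtering early in the layered design.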

High‑Availability Design

Deploy services as stateless containers, replicate instances, and use load balancers to distribute traffic.
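The combination of replicated stateless instances and a load balancer can be sketched as round-robin selection that skips unhealthy instances. The class `RoundRobinBalancer` and the instance names are hypothetical, and health is set by hand here rather than by real health checks.

```python
from itertools import count


class RoundRobinBalancer:
    """Rotate through instances, skipping any that are marked down."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._counter = count()

    def mark_down(self, instance) -> None:
        self.healthy.discard(instance)

    def mark_up(self, instance) -> None:
        if instance in self.instances:
            self.healthy.add(instance)

    def pick(self):
        """Return the next healthy instance, trying each one at most once."""
        for _ in range(len(self.instances)):
            index = next(self._counter) % len(self.instances)
            candidate = self.instances[index]
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instance available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    lb.mark_down("app-2")  # simulate a failed instance
    print([lb.pick() for _ in range(4)])
```

Because the services are stateless, any healthy instance can serve any request, so failover is just a matter of skipping the failed one.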

Inventory Deduction Design

Use Redis for fast stock decrements. For more complex cases, process orders asynchronously through message queues and order queues:

Enqueue orders on submission.

Workers dequeue and perform stock deduction.

Successful orders proceed to payment.
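The three steps above can be sketched with an in-process queue and worker threads. This is an illustration under stated assumptions: `queue.Queue` stands in for a message queue, the lock-protected `Stock` class stands in for an atomic decrement in a store such as Redis, and `run_sale` is a hypothetical helper.

```python
import queue
import threading


class Stock:
    """Counter whose check-and-decrement is atomic, so it cannot oversell."""

    def __init__(self, quantity: int):
        self._quantity = quantity
        self._lock = threading.Lock()

    def try_deduct(self) -> bool:
        with self._lock:
            if self._quantity <= 0:
                return False
            self._quantity -= 1
            return True

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._quantity


def run_sale(stock_quantity: int, order_ids, worker_count: int = 4):
    """Enqueue orders, let workers deduct stock, return the accepted orders."""
    stock = Stock(stock_quantity)
    orders = queue.Queue()
    accepted = []
    accepted_lock = threading.Lock()

    for order_id in order_ids:  # step 1: enqueue on submission
        orders.put(order_id)

    def worker():
        while True:
            try:
                order_id = orders.get_nowait()  # step 2: dequeue
            except queue.Empty:
                return
            if stock.try_deduct():
                with accepted_lock:
                    accepted.append(order_id)  # step 3: goes on to payment

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return accepted, stock.remaining


if __name__ == "__main__":
    winners, left = run_sale(stock_quantity=10, order_ids=range(1000))
    print(len(winners), left)
```

The queue absorbs the burst of submissions, while the atomic deduction ensures that no more orders are accepted than there is stock.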

Key Technologies for a Million‑User Flash Sale System

Data staticization and multi‑level caching.

Distributed cache (e.g., Redis).

Load balancing and reverse‑proxy (LVS, Nginx).

Asynchronous processing (message queues, queuing systems).

Microservice architecture and modular design.

Comprehensive monitoring (logs, metrics).
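Of the technologies listed, multi-level caching is perhaps the easiest to illustrate in a few lines. The sketch below is a read-through lookup that tries a local in-process cache, then a shared cache, then the database. Plain dictionaries stand in for both cache levels, the class name `MultiLevelCache` is hypothetical, and expiry and invalidation are deliberately left out.

```python
class MultiLevelCache:
    """Read-through lookup: local memory, then shared cache, then database."""

    def __init__(self, load_from_db):
        self.local = {}    # per-process cache (fastest, smallest)
        self.shared = {}   # stand-in for a distributed cache such as Redis
        self.load_from_db = load_from_db
        self.db_reads = 0  # counts how often the database is actually hit

    def get(self, key):
        if key in self.local:
            return self.local[key]
        if key not in self.shared:
            self.db_reads += 1
            self.shared[key] = self.load_from_db(key)
        # Populate the local level so the next read stays in-process.
        self.local[key] = self.shared[key]
        return self.local[key]


if __name__ == "__main__":
    database = {"item:1": "Phone X"}  # example data
    cache = MultiLevelCache(database.__getitem__)
    print(cache.get("item:1"), cache.get("item:1"), cache.db_reads)
```

Each level that answers a read keeps that request away from the slower level behind it, which is why the database sees only a small fraction of the sale traffic.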

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
