Design and Optimization of High‑Concurrency Flash Sale (秒杀) System Architecture
This article explains the business model, the technical challenges, and the architectural solutions (rate limiting, peak shaving, asynchronous processing, caching, and client‑side optimizations) for building a robust, high‑concurrency flash‑sale system in an e‑commerce environment.
Preface
I recently shared our overall approach to running flash‑sale (秒杀) activities in our e‑commerce business. The feedback was positive, so I have condensed it here as a short reference.
Business Overview
A flash sale is a timed, limited‑quantity online promotion in which users compete to purchase items within a short window; examples include JD.com’s timed sales and Taobao’s one‑yuan rush.
Business Characteristics
Massive Instantaneous Concurrency
During a flash‑sale, simultaneous requests can surge 10‑ to 100‑fold.
Scarce Inventory
Only a tiny number of items are available, so only a few users succeed.
Simple Workflow
The process typically involves placing an order, deducting inventory, and paying for the order.
Technical Challenges
Impact on Existing Services
If flash‑sale services share infrastructure with other workloads, the surge can put those workloads under severe pressure or even bring the whole system down.
Preventing Premature Orders
The order page must be disabled before the sale starts; only browsing is allowed.
Traffic Surge
Product page requests spike just before and just after the sale starts, stressing backend servers, databases, and Redis.
Architectural Design Principles
Rate Limiting
Because inventory is limited, most users should be throttled, allowing only a small fraction to reach the backend.
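A minimal sketch of this idea, assuming a hypothetical policy that admits only a small multiple of the stock and rejects everyone else before any backend work happens. The class name `AdmissionGate` and the `oversell_factor` parameter are illustrative and do not come from the original design.

```python
import threading


class AdmissionGate:
    """Admit a fixed number of requests, then reject all later ones."""

    def __init__(self, stock: int, oversell_factor: int = 5):
        # Assumption: admitting a few times the stock leaves room for
        # abandoned or failed orders without flooding the backend.
        self._remaining = stock * oversell_factor
        self._lock = threading.Lock()

    def try_enter(self) -> bool:
        """Return True if the request may continue to the backend."""
        with self._lock:
            if self._remaining <= 0:
                return False
            self._remaining -= 1
            return True


gate = AdmissionGate(stock=10, oversell_factor=5)
results = [gate.try_enter() for _ in range(1000)]
admitted = sum(results)  # 50 of the 1000 requests get through
```

With 10 items in stock, 950 of the 1000 requests are turned away at the gate and never touch the order service or the database.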
Peak Shaving
Use caching and message‑queue middleware to smooth the instantaneous traffic peak.
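The smoothing effect of a queue can be illustrated with a small simulation. This is only a sketch: the function `drain` and its tick‑based model are assumptions made for illustration, with a plain in‑memory deque standing in for real message‑queue middleware.

```python
from collections import deque


def drain(arrivals_per_tick, rate_per_tick):
    """Return how many requests reach the backend on each tick.

    Requests are buffered as they arrive and released at a fixed rate,
    so the backend never sees more than rate_per_tick per tick.
    """
    if rate_per_tick <= 0:
        raise ValueError("rate_per_tick must be positive")
    backlog = deque()
    handled_per_tick = []
    tick = 0
    while tick < len(arrivals_per_tick) or backlog:
        arrivals = arrivals_per_tick[tick] if tick < len(arrivals_per_tick) else 0
        backlog.extend(range(arrivals))   # enqueue this tick's requests
        batch = min(rate_per_tick, len(backlog))
        for _ in range(batch):
            backlog.popleft()             # hand one request to the backend
        handled_per_tick.append(batch)
        tick += 1
    return handled_per_tick


shaped = drain([1000, 0, 0], rate_per_tick=300)
```

A burst of 1000 requests in a single tick is spread over four ticks (300, 300, 300, 100), which trades a short delay for a bounded load on the backend.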
Asynchronous Processing
Transform synchronous order‑creation into asynchronous tasks to improve overall availability.
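One way to picture the change, as a sketch rather than the actual implementation: the request handler only enqueues a job and returns a ticket, while a background worker creates the order later. The names `submit_order`, `order_status`, and the `PENDING`/`CREATED` states are hypothetical, and an in‑process `queue.Queue` stands in for a real message queue.

```python
import queue
import threading
import uuid

order_queue = queue.Queue()
order_status = {}  # ticket -> "PENDING" or "CREATED"


def submit_order(user_id, item_id):
    """Accept the request immediately and return a ticket to poll later."""
    ticket = uuid.uuid4().hex
    order_status[ticket] = "PENDING"
    order_queue.put({"ticket": ticket, "user_id": user_id, "item_id": item_id})
    return ticket


def worker():
    """Create orders in the background; None is the shutdown signal."""
    while True:
        job = order_queue.get()
        try:
            if job is None:
                return
            # Placeholder for the real order-creation logic.
            order_status[job["ticket"]] = "CREATED"
        finally:
            order_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

ticket = submit_order("user-1", "sku-1")
order_queue.join()                 # wait until the worker has processed the job
final_status = order_status[ticket]

order_queue.put(None)              # stop the worker
thread.join(timeout=1)
```

The client would poll with the ticket (or be notified) instead of holding a connection open while the order is created.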
Caching
Move parts of the order and inventory logic to in‑memory caches such as Redis to reduce database load.
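The key requirement when inventory moves into a cache is that check‑and‑deduct happens as one atomic step, otherwise concurrent buyers can oversell. The sketch below uses an in‑memory class, `InMemoryStock`, as a stand‑in; it is an assumption for illustration only. With Redis, the same effect would typically be achieved with an atomic command or a server‑side script rather than a Python lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InMemoryStock:
    """In-memory stand-in for a cache that holds per-item stock counts."""

    def __init__(self):
        self._stock = {}
        self._lock = threading.Lock()

    def set(self, item_id, quantity):
        with self._lock:
            self._stock[item_id] = quantity

    def try_deduct(self, item_id, quantity=1):
        """Atomically check and deduct; return False when stock is short."""
        with self._lock:
            current = self._stock.get(item_id, 0)
            if current < quantity:
                return False
            self._stock[item_id] = current - quantity
            return True

    def remaining(self, item_id):
        with self._lock:
            return self._stock.get(item_id, 0)


stock = InMemoryStock()
stock.set("sku-1", 10)

# 100 concurrent buyers compete for 10 items.
with ThreadPoolExecutor(max_workers=16) as pool:
    outcomes = list(pool.map(lambda _: stock.try_deduct("sku-1"), range(100)))

successes = sum(outcomes)
```

Only the successful deductions need to be written through to the database, which is what keeps the database load proportional to the stock rather than to the traffic.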
Overall Architecture
Client‑Side Optimizations
Flash‑Sale Page
Serve the page as static content and distribute it via CDN, so that requests for static resources never reach the backend.
Prevent Early Ordering
Include a small JavaScript file that indicates whether the sale has started; the file is not cached by CDN and can be toggled via backend API.
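A sketch of how such a flag file might be generated on the server side. The function name `render_sale_flag`, the global `window.FLASH_SALE`, and the `orderUrl` field are hypothetical; the only assumptions carried over from the text are that the file is tiny, is not cached, and flips when the sale starts.

```python
import json


def render_sale_flag(started, order_url=""):
    """Build the headers and body of the small JavaScript flag file."""
    # The order URL is only exposed once the sale has started.
    payload = {"started": started, "orderUrl": order_url if started else ""}
    body = "window.FLASH_SALE = " + json.dumps(payload, sort_keys=True) + ";"
    headers = {
        "Content-Type": "application/javascript",
        # Ask caches not to store the file so a toggle is seen right away.
        "Cache-Control": "no-store",
    }
    return headers, body


headers_before, body_before = render_sale_flag(False, "/order/sku-1")
headers_after, body_after = render_sale_flag(True, "/order/sku-1")
```

The static page loads this file and enables the order button only when `started` is true. The backend should still reject early orders on its own, since a client can ignore the flag.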
API Layer Optimizations
User‑Level Rate Limiting
Within a time window, answer repeated requests from the same user with a cached response, so that only the first request in the window reaches the backend.
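A minimal sketch of per‑user response caching, assuming a fixed window measured from the first request. The class `PerUserResponseCache` and its injectable `clock` are illustrative; a production version would also need eviction and a shared store across API nodes.

```python
import time


class PerUserResponseCache:
    """Return the cached response for a user until the window expires."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._entries = {}  # user_id -> (expires_at, response)

    def handle(self, user_id, compute):
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and entry[0] > now:
            return entry[1]                      # repeated hit: no backend call
        response = compute()                     # first hit in this window
        self._entries[user_id] = (now + self._window, response)
        return response


# Demo with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
cache = PerUserResponseCache(5.0, clock=lambda: fake_now[0])
backend_calls = []


def backend():
    backend_calls.append(1)
    return "queued"


for _ in range(10):
    cache.handle("user-1", backend)
calls_in_first_window = len(backend_calls)

fake_now[0] = 6.0                                # window has expired
cache.handle("user-1", backend)
calls_after_expiry = len(backend_calls)
```

Ten rapid clicks from one user cost a single backend call; a different user, or the same user after the window, triggers a new one.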
Product‑Level Rate Limiting
Cache product pages so that repeated requests for the same item return the cached content.
SOA Service Layer Optimizations
Employ message queues, asynchronous handling, and a "Fail‑Fast" strategy for requests exceeding system thresholds.
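The fail‑fast part can be sketched with a bounded buffer that rejects new work immediately once it is full, instead of letting requests wait. The names `FailFastBuffer` and `Overloaded` are hypothetical, and the capacity stands in for whatever threshold the system is tuned to.

```python
import queue


class Overloaded(Exception):
    """Raised when the buffer is full and the request is rejected."""


class FailFastBuffer:
    """Bounded buffer that rejects instead of blocking when full."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)

    def offer(self, request):
        try:
            self._queue.put_nowait(request)      # never waits for free space
        except queue.Full:
            raise Overloaded("system busy, please retry") from None

    def size(self):
        return self._queue.qsize()


buffer = FailFastBuffer(capacity=3)
accepted, rejected = 0, 0
for request_id in range(5):
    try:
        buffer.offer(request_id)
        accepted += 1
    except Overloaded:
        rejected += 1
```

Rejected callers get an immediate "busy" answer, which keeps threads and connections from piling up behind a queue that cannot drain in time.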
Flash‑Sale Process Flow
The core of the system is a multi‑stage filtering pipeline that gradually reduces instantaneous pressure, protecting the database. If the MQ layer holds, downstream order creation and inventory deduction can be safely scaled; consumer counts are tuned to avoid DB overload.
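As a rough illustration of tuning the consumer count, the arithmetic below derives an upper bound from a database write budget. The function `max_consumers` and all three inputs are assumptions for the example; real values would have to come from load testing.

```python
def max_consumers(db_safe_writes_per_sec, writes_per_order, orders_per_consumer_per_sec):
    """Largest consumer count whose combined writes stay within the DB budget.

    Returns 0 when even a single consumer would exceed the budget.
    """
    if min(db_safe_writes_per_sec, writes_per_order, orders_per_consumer_per_sec) <= 0:
        raise ValueError("all inputs must be positive")
    writes_per_consumer = writes_per_order * orders_per_consumer_per_sec
    return int(db_safe_writes_per_sec // writes_per_consumer)


# Hypothetical numbers: the DB tolerates 2000 writes/s, each order needs
# 2 writes (order row + inventory row), one consumer handles 50 orders/s.
consumers = max_consumers(2000, 2, 50)
```

With these numbers each consumer produces 100 writes per second, so at most 20 consumers fit within the budget.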
Summary
Core Idea: Layered Filtering
Intercept as many requests as possible upstream to relieve downstream pressure.
Leverage cache and message queues to accelerate processing and smooth traffic peaks.
