Design and Optimization of High‑Concurrency Flash Sale (秒杀) System Architecture
This article outlines the business model, challenges, and architectural design principles for building a high‑concurrency flash‑sale system, covering client‑side optimizations, API and service‑layer safeguards, traffic throttling, caching, asynchronous processing, and overall flow to ensure reliability under massive load.
Preface
Recently I shared the overall idea of implementing flash‑sale (秒杀) activities in an e‑commerce business within my department, received positive feedback, and decided to organize the material for broader reference.
Business Introduction
What is a flash‑sale? In simple terms, it is an online limited‑time purchase event organized by merchants for promotion.
For example, JD.com’s flash‑sale is a timed, quantity‑limited event that ends at a scheduled time regardless of whether all items are sold out. The time constraint is not extremely strict; users who act quickly have a relatively high chance of success.
Taobao previously ran one‑yuan snap‑up events, usually limited to a single item at an extremely low price, often sold out within 1–3 seconds, making participation largely a matter of luck.
Business Characteristics
Massive Instant Concurrency
During a flash‑sale, a huge number of users attempt to purchase simultaneously, causing instantaneous traffic spikes of 10‑fold to 100‑fold or more.
Limited Inventory
Flash‑sale items are typically scarce, meaning only a tiny fraction of users can successfully buy them.
Simple Workflow
The process is straightforward: place an order, deduct inventory, and process payment.
Technical Challenges
Impact on Existing Services
Running a flash‑sale alongside other marketing activities on the same servers can severely affect those services, potentially causing the entire e‑commerce platform to crash.
Prevent Premature Ordering
The order page is a normal URL that must be disabled before the sale starts; users should only be able to view product information, not place orders.
Sudden Traffic Surge
When the flash‑sale starts, many users request the product page, leading to a sharp increase in backend traffic, bandwidth usage, and pressure on databases, Redis, etc.
Architecture Design Philosophy
Rate Limiting
Because inventory is limited, only a small portion of users should reach the backend. Rate limiting restricts the majority of traffic, allowing only a controlled number of requests through.
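One common way to let only a controlled number of requests through is a token bucket. The sketch below is illustrative only and is not taken from the original design: the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for the example.

```python
import threading
import time


class TokenBucket:
    """Admit roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum tokens the bucket can hold
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()
        self._lock = threading.Lock()

    def allow(self):
        with self._lock:
            now = self.clock()
            # Refill in proportion to the time elapsed since the last call.
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            # No token left: the caller should reject the request immediately.
            return False
```

A request handler would call `allow()` first and return a "sale is busy" response whenever it gets `False`, so rejected traffic never reaches the order logic.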
Peak Shaving
The moment the sale begins creates a traffic spike. Caching and message‑queue middleware smooth this peak so that downstream services receive requests at a rate they can sustain.
Asynchronous Processing
A flash‑sale system is a high‑concurrency system: converting synchronous operations into asynchronous tasks lets requests return quickly and improves overall availability.
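As a rough illustration of turning a synchronous order call into an asynchronous one, the sketch below enqueues the order and returns a request ID that the client can poll. It uses an in-process `queue.Queue` as a stand-in for a message queue; the function names (`submit_order`, `order_status`) and the status strings are hypothetical.

```python
import queue
import threading
import uuid

orders = queue.Queue()          # stands in for a message queue
results = {}                    # request_id -> status; a shared store in practice
results_lock = threading.Lock()


def submit_order(user_id, product_id):
    """Request handler: enqueue the order and return immediately."""
    request_id = uuid.uuid4().hex
    with results_lock:
        results[request_id] = "queued"
    orders.put((request_id, user_id, product_id))
    return request_id           # the client polls with this ID


def worker():
    """Consumer: does the slow work outside the request path."""
    while True:
        request_id, user_id, product_id = orders.get()
        try:
            # Placeholder for real order creation and inventory deduction.
            with results_lock:
                results[request_id] = "created"
        finally:
            orders.task_done()


def order_status(request_id):
    with results_lock:
        return results.get(request_id, "unknown")


# One background consumer; more can be started to raise throughput.
threading.Thread(target=worker, daemon=True).start()
```

The handler's latency no longer depends on how long order creation takes, which is the point of the asynchronous conversion.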
Caching
The bottleneck lies in order placement and inventory deduction, both of which hit an OLTP database (MySQL, SQL Server, Oracle). Moving part of this logic into in‑process memory caches or Redis substantially raises the concurrency the system can handle.
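To make the idea concrete, here is a minimal in‑process sketch of deducting stock in memory instead of in the database. The `StockCounter` class is an assumption for illustration; with Redis, the same check‑and‑decrement would need to be done atomically on the server side, for example with an atomic decrement or a Lua script.

```python
import threading


class StockCounter:
    """In-process stand-in for a cache-side stock counter."""

    def __init__(self, initial):
        self._remaining = initial
        self._lock = threading.Lock()

    def try_deduct(self, quantity=1):
        # Check and decrement under one lock so two buyers can never
        # both take the last unit.
        with self._lock:
            if self._remaining >= quantity:
                self._remaining -= quantity
                return True
            return False

    @property
    def remaining(self):
        with self._lock:
            return self._remaining
```

Only requests for which `try_deduct()` returns `True` need to go on to the database, so the number of database writes is bounded by the stock level rather than by the number of participants.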
Overall Architecture
Client‑Side Optimization
Two main issues are addressed:
Flash‑Sale Page
Before the sale starts, many users access the page. To avoid backend overload, the page is fully staticized and distributed via CDN edge nodes, reducing pressure on servers and databases.
Prevent Early Ordering
A small JavaScript file, which the CDN does not cache, indicates whether the sale has started and provides the dynamic order URL. The file can be toggled via a backend API shortly before the sale.
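A sketch of how the backend might generate that small file is shown below. Everything named here is an assumption for illustration: the `window.flashSaleState` variable, the `/order/<product>/<token>` URL layout, and both function names. The random path segment is one way to keep the order URL unguessable until the sale opens.

```python
import json
import secrets


def new_order_url(product_id):
    # Random path segment so the order URL cannot be guessed in advance.
    return f"/order/{product_id}/{secrets.token_urlsafe(16)}"


def render_sale_state_js(started, order_url=None):
    """Build the body of the uncached JavaScript file."""
    # Never expose the order URL while the sale is still closed.
    state = {"started": started, "orderUrl": order_url if started else None}
    return "window.flashSaleState = " + json.dumps(state) + ";"
```

The static page would load this file on each visit and enable the buy button only when `started` is true. Because the client can still be bypassed, the server must validate the URL and the sale time as well.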
API Layer Optimization
Since client‑side controls can be bypassed, the server must enforce restrictions, which fall into two categories:
Limit Access Frequency per User
Cache the page per user ID for a short time window, serving the same cached response for repeated requests.
Limit Access Frequency per Product
When many requests query the same product simultaneously, serve a cached page regardless of the requester.
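Both limits above amount to a short‑lived cache that differs only in its key: the user ID in one case, the product ID in the other. The following is a minimal sketch under that reading; `ShortLivedCache` and `get_or_render` are hypothetical names, and the cache is in‑process and never evicts old keys, which a real deployment would need to handle.

```python
import threading
import time


class ShortLivedCache:
    """Serve one rendered response per key for `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock           # injectable so expiry can be tested
        self._entries = {}           # key -> (expires_at, response)
        self._lock = threading.Lock()

    def get_or_render(self, key, render):
        now = self.clock()
        # Holding the lock while rendering keeps the sketch simple, but it
        # serializes all keys; a production cache would lock per key.
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and entry[0] > now:
                return entry[1]      # repeated request: reuse the response
            response = render()      # first request in this time window
            self._entries[key] = (now + self.ttl, response)
            return response
```

Calling `get_or_render(("user", user_id), ...)` limits how often one user reaches the backend, while `get_or_render(("product", product_id), ...)` collapses simultaneous queries for the same product into a single backend call per window.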
SOA Service Layer Optimization
If the flash‑sale attracts many participants, backend traffic can still overwhelm the system. Solutions include message queues, asynchronous handling, and fail‑fast rejection of requests exceeding a defined threshold.
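Fail‑fast rejection can be sketched with a bounded queue: once the number of pending requests reaches the threshold, new requests are refused immediately instead of waiting. The function name `try_enqueue` and the threshold value are assumptions for the example, and the in‑process `queue.Queue` stands in for a real message queue.

```python
import queue

MAX_PENDING = 1000  # hypothetical threshold, tuned to what the backend can absorb


def try_enqueue(pending, request):
    """Return True if the request was queued, False if it was rejected."""
    try:
        pending.put_nowait(request)   # never blocks
        return True
    except queue.Full:
        # Over the threshold: reject now rather than make the user wait.
        return False


pending_orders = queue.Queue(maxsize=MAX_PENDING)
```

A handler that gets `False` back can respond with a "sold out" or "try again" message right away, which keeps response times predictable under overload.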
Flash‑Sale Overall Flowchart
The core of the flash‑sale system is layered filtering, gradually reducing instantaneous pressure and protecting the database. The most stressed component is the message queue; as long as it holds, the rate of downstream order creation and inventory deduction can be controlled by adjusting the number of consumers.
The inventory service locks stock in advance to prevent overselling and runs timeout tasks to reclaim stock from unpaid orders after a defined period.
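The lock‑then‑reclaim behaviour can be sketched as follows. This is an in‑memory illustration, not the article's actual inventory service: the class `StockReservations`, its method names, and the hold duration are all assumptions.

```python
import threading
import time


class StockReservations:
    """Hold one unit per order and release it if payment never arrives."""

    def __init__(self, stock, hold_seconds, clock=time.monotonic):
        self.available = stock
        self.hold_seconds = hold_seconds
        self.clock = clock            # injectable so timeouts can be tested
        self._holds = {}              # order_id -> payment deadline
        self._lock = threading.Lock()

    def reserve(self, order_id):
        """Lock one unit for the order; fails when nothing is left."""
        with self._lock:
            if self.available <= 0 or order_id in self._holds:
                return False
            self.available -= 1
            self._holds[order_id] = self.clock() + self.hold_seconds
            return True

    def confirm_payment(self, order_id):
        """Payment succeeded: the unit is sold, so drop the hold for good."""
        with self._lock:
            return self._holds.pop(order_id, None) is not None

    def reclaim_expired(self):
        """Timeout task: return stock held by orders that were never paid."""
        now = self.clock()
        with self._lock:
            expired = [o for o, deadline in self._holds.items() if deadline <= now]
            for order_id in expired:
                del self._holds[order_id]
                self.available += 1
            return expired
```

A scheduler would call `reclaim_expired()` periodically. Because stock is decremented at reservation time, the number of outstanding holds plus paid orders can never exceed the initial stock.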
Conclusion
Intercept as many requests as possible upstream to lessen downstream load.
Leverage caching and message queues to accelerate processing and achieve peak‑shaving.
Source: http://blog.51cto.com/13527416/2085258