How to Build a High‑Performance Flash‑Sale System: Architecture, CDN, and Rate‑Limiting Strategies

This article explains how to design a flash‑sale (秒杀) system that handles massive concurrent traffic by rendering product pages as static HTML, pre‑warming the CDN, applying gateway rate limiting with segmented request release, using Redis and asynchronous queues for inventory and order processing, and isolating flash‑sale services so that normal business is protected.

ITPUB

Background and Challenges

In e‑commerce flash‑sale events, a huge number of users flood the product detail page minutes before the sale starts and then compete for a very limited inventory, causing traffic spikes of dozens to hundreds of times the normal load and risking system crashes.

Business Flow Considerations

Flash‑sale items are heavily discounted, so nearly every order ends up being paid. The system therefore deducts inventory when the order is created rather than when it is paid; the few orders that go unpaid have minimal impact. Defensive measures against bots and scripts are also required.

Page Staticization

Product detail pages are generated as static HTML before the sale, uploaded to a CDN, and pre‑warmed. This moves the majority of read traffic to the CDN cache, dramatically reducing origin server load and bandwidth consumption.
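
As a minimal sketch of the static generation step, the snippet below renders one product record into an HTML file that could then be uploaded to a CDN. The template, the field names (`sku`, `name`, `price`), and the function `render_product_page` are illustrative assumptions, not part of the original design; the upload and pre‑warm calls depend on the CDN provider and are left out.

```python
from html import escape
from pathlib import Path
from string import Template

# Hypothetical page template; a real detail page would be far richer.
PAGE = Template(
    "<!doctype html>\n"
    "<html><head><title>$name</title></head>\n"
    "<body><h1>$name</h1><p>Flash-sale price: $price</p>\n"
    '<button id="buy" data-sku="$sku">Buy now</button></body></html>\n'
)

def render_product_page(product, out_dir):
    """Render one product to a static HTML file and return its path."""
    html = PAGE.substitute(
        sku=escape(str(product["sku"]), quote=True),
        name=escape(str(product["name"])),
        price=escape(str(product["price"])),
    )
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{product['sku']}.html"
    path.write_text(html, encoding="utf-8")
    return path
```

Only data that is fixed before the sale belongs in the static file; anything that changes during the sale, such as remaining stock, would still have to be fetched dynamically.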

Static page and CDN architecture

Request Interception

Front‑end buttons are disabled after a click to prevent duplicate submissions. At the gateway layer (Zuul or Nginx), requests are limited per user ID (e.g., one request every few seconds). Most purchase requests are rejected here and returned as failures, which greatly eases backend pressure.
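
The per‑user limit can be sketched as follows. This is an in‑memory illustration under stated assumptions: the class name `PerUserLimiter` and its parameters are hypothetical, and a clustered gateway would need to keep the last‑seen timestamps in a shared store rather than in a local dictionary.

```python
import time

class PerUserLimiter:
    """Allow at most one request per user within `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the logic can be tested
        self._last_seen = {}        # user_id -> time of last accepted request

    def allow(self, user_id):
        now = self.clock()
        last = self._last_seen.get(user_id)
        if last is not None and now - last < self.interval:
            return False            # rejected at the gateway, never reaches the backend
        self._last_seen[user_id] = now
        return True
```

Rejected requests do not refresh the timestamp here, so a user who keeps clicking is allowed through again once the interval has passed since the last accepted request.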

For a stock of 200 items, only 200 requests are allowed to reach the backend, released in small batches (e.g., 10 requests per 100 ms, so the full quota is released over about two seconds) to blunt bursts from bots. The release interval must balance bot mitigation against user experience; too long a delay degrades perceived fairness.
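
A minimal sketch of this segmented release, assuming a single gateway process: the gate admits at most `total` requests overall and at most `batch` of them per `window`. The class name `BatchReleaseGate` and its parameters are hypothetical, and the sketch is not thread‑safe; a real gateway would need a lock or a shared atomic counter.

```python
import time

class BatchReleaseGate:
    """Admit at most `total` requests overall, `batch` per `window` seconds."""

    def __init__(self, total, batch, window, clock=time.monotonic):
        self.remaining = total
        self.batch = batch
        self.window = window
        self.clock = clock
        self._window_start = clock()
        self._used_in_window = 0

    def try_admit(self):
        if self.remaining <= 0:
            return False                  # quota exhausted: fail fast
        now = self.clock()
        if now - self._window_start >= self.window:
            self._window_start = now      # start a fresh release window
            self._used_in_window = 0
        if self._used_in_window >= self.batch:
            return False                  # this window's batch is already released
        self._used_in_window += 1
        self.remaining -= 1
        return True
```

With the numbers from the text, `BatchReleaseGate(total=200, batch=10, window=0.1)` would let 10 requests through in each 100 ms window until 200 have been admitted.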

Gateway nodes can be clustered to distribute load.

Gateway rate limiting

Backend Service Design

When the allowed request volume is small, the backend experiences little pressure. For larger inventories (tens of thousands), the bottleneck shifts to the database. Inventory is cached in Redis (sharded for hot items) to achieve high‑throughput decrements.
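
The idea of sharding a hot item's stock can be illustrated with an in‑memory stand‑in, where each list slot plays the role of one Redis key. The class `ShardedStock`, the hash‑based shard choice, and the fallback to other shards are assumptions made for this sketch, not details from the original system.

```python
import threading

class ShardedStock:
    """In-memory stand-in for one item's stock split across several keys."""

    def __init__(self, total, shards):
        base, extra = divmod(total, shards)
        # Spread the stock as evenly as possible across the shards.
        self._counts = [base + (1 if i < extra else 0) for i in range(shards)]
        self._locks = [threading.Lock() for _ in range(shards)]

    def try_take(self, user_id):
        n = len(self._counts)
        start = hash(user_id) % n
        # Try the user's home shard first, then the others, so that stock
        # left on another shard is not stranded.
        for offset in range(n):
            i = (start + offset) % n
            with self._locks[i]:
                if self._counts[i] > 0:
                    self._counts[i] -= 1
                    return True
        return False

    def remaining(self):
        return sum(self._counts)
```

Each shard is decremented under its own lock, so concurrent buyers contend on different shards instead of a single counter; that is the same effect sharding aims for in Redis.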

Order creation is offloaded to an asynchronous message queue. The service pushes order messages to the queue; a consumer writes order data to Redis immediately and batches database inserts (e.g., every 100 ms or 100 orders). The front‑end polls for order status and redirects to payment once the order is persisted.
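
The consumer side can be sketched with the standard library, using `queue.Queue` as a stand‑in for the message queue and a dictionary as a stand‑in for Redis. The function `consume_once`, the status strings, and the `bulk_insert` callback are hypothetical names chosen for illustration.

```python
import queue
import time

def consume_once(q, cache, bulk_insert, max_items=100, max_wait=0.1):
    """Collect one batch (up to max_items orders or max_wait seconds),
    mark each order in the cache as it arrives, then insert the batch."""
    batch = []
    deadline = time.monotonic() + max_wait
    while len(batch) < max_items:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            order = q.get(timeout=remaining)
        except queue.Empty:
            break
        cache[order["order_id"]] = "PENDING"    # visible to status polling right away
        batch.append(order)
    if batch:
        bulk_insert(batch)                      # one database round trip per batch
        for order in batch:
            cache[order["order_id"]] = "PERSISTED"
    return len(batch)
```

A front‑end polling the cache would see `PENDING` until the batch is written and could redirect to payment once the status becomes `PERSISTED`.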

Backend order processing flow

Isolation Strategies

Business isolation: Flash‑sale items are pre‑registered, static pages generated, and inventory pre‑loaded into Redis before the event.

Deployment isolation: Flash‑sale services run on separate hosts and use a dedicated domain and gateway, preventing failures from affecting normal sales.

Data isolation: Separate Redis clusters and, if needed, separate databases store flash‑sale data. After the sale, remaining stock is merged back into regular inventory. Order data can be synchronized to the normal order system via message queues.

Isolation diagram

Network Preparation

Before the sale, bandwidth is reserved with ISPs and CDN providers to handle the anticipated traffic surge.

Additional Details

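Prevent overselling by using Redis DECR, which is atomic (a negative result means the item is sold out and the request must be rejected), or a database conditional update that only succeeds while inventory > 0.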

Interface anti‑scraping: rate‑limit per user ID at the gateway.

Global gateway throttling protects the system when traffic exceeds estimates.

Duplicate order prevention via per‑user rate limits (e.g., one order per 10 minutes).

Integrate risk‑control to block suspicious users at the gateway.

Deploy additional DDoS protection (a firewall or a dedicated DDoS‑mitigation service) at the gateway layer.
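
The database conditional update mentioned in the overselling item above can be sketched with SQLite from the standard library. The table name `stock`, its columns, and the function `try_deduct` are assumptions for illustration; the same `UPDATE ... WHERE qty > 0` pattern applies to other relational databases.

```python
import sqlite3

def try_deduct(conn, sku):
    """Deduct one unit only if stock remains; return whether it succeeded."""
    cur = conn.execute(
        "UPDATE stock SET qty = qty - 1 WHERE sku = ? AND qty > 0",
        (sku,),
    )
    conn.commit()
    # rowcount is 1 when a row was updated, 0 when stock was already zero
    # (or the SKU does not exist).
    return cur.rowcount == 1

# Hypothetical setup: one item with two units in stock.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (sku TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("INSERT INTO stock VALUES ('sku-1', 2)")
conn.commit()
```

Because the stock check and the decrement happen in one statement, two concurrent buyers cannot both take the last unit; the loser simply sees zero affected rows.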

By combining static page delivery, CDN caching, fine‑grained request gating, Redis‑backed inventory, asynchronous order processing, and thorough isolation, a flash‑sale system can achieve high performance and stability without disrupting regular e‑commerce operations.
