
Designing a High‑Concurrency Flash Sale System: Architecture, Caching, Rate Limiting, and Isolation Strategies

This article explains how to design a flash‑sale (秒杀) system that can absorb massive traffic spikes. It covers CDN caching of static pages, request interception at the gateway, Redis inventory control, asynchronous order processing, and isolation at the business, deployment, and data levels, so that the flash sale stays fast and stable without affecting regular services.

Architect

In e‑commerce flash‑sale scenarios, a sudden surge of users puts heavy pressure on page views and order creation, with traffic often rising to dozens or hundreds of times its normal level. The design considerations below follow from these characteristics.

Business flow considerations: Flash‑sale items are sold at very low prices, so most successful orders are paid. The system therefore adopts an "order‑deduct‑inventory" approach: stock is deducted when the order is placed rather than when it is paid. Stock tied up in orders that end up unpaid can be sold later as regular inventory.

Page staticization: The product detail page is pre‑generated as static HTML and pushed to a CDN for pre‑warming. The bulk of page requests is then served by the CDN instead of the origin servers, which reduces origin bandwidth and backend load.
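As a rough illustration of the pre‑generation step, the sketch below renders one detail page to a static file ahead of the event. The template, the field names, and the `publish` helper are hypothetical; the upload to the CDN and the pre‑warm call depend on the provider and are left out.

```python
import html
from pathlib import Path
from string import Template

# Hypothetical page template; a real detail page would be far richer.
PAGE = Template(
    "<!DOCTYPE html>\n"
    "<html><head><title>$name</title></head>\n"
    "<body><h1>$name</h1><p>Flash-sale price: $price</p>\n"
    '<button id="buy" data-item="$item_id">Buy now</button>\n'
    "</body></html>\n"
)


def render_detail_page(item_id: str, name: str, price: str) -> str:
    """Render the product detail page once, before the event starts."""
    # Escape every value so product data cannot inject markup.
    return PAGE.substitute(
        item_id=html.escape(item_id, quote=True),
        name=html.escape(name),
        price=html.escape(price),
    )


def publish(item_id: str, name: str, price: str, out_dir: Path) -> Path:
    """Write the rendered page to a directory that is later synced to the CDN."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{item_id}.html"
    path.write_text(render_detail_page(item_id, name, price), encoding="utf-8")
    return path
```

Anything that changes during the sale, such as remaining stock or the countdown, would have to come from a separate small request, since the HTML itself is frozen at generation time.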

Request interception: Front‑end buttons are disabled after a click to prevent duplicate submissions. At the gateway layer (Zuul/Nginx), per‑user rate limiting is applied and most purchase requests are rejected early as failures, so that only a limited number of requests (e.g., 200) reach the backend. The admitted requests are released in small batches (e.g., 5 requests per 100 ms), which blunts bots and smooths backend load.
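The gateway behaviour described above can be sketched as a small admission gate. This is an in‑memory illustration, not Zuul or Nginx configuration: the `AdmissionGate` class and its parameters are hypothetical, and the per‑user rate limit is simplified to a per‑user request count for the whole event.

```python
import time
from collections import deque


class AdmissionGate:
    """Gateway-side filter: per-user limit, total cap, batched release."""

    def __init__(self, total_cap=200, batch_size=5, batch_interval=0.1,
                 per_user_limit=1, clock=time.monotonic):
        self.total_cap = total_cap            # e.g. 200 requests reach the backend
        self.batch_size = batch_size          # e.g. 5 requests ...
        self.batch_interval = batch_interval  # ... per 100 ms
        self.per_user_limit = per_user_limit
        self.clock = clock
        self.accepted = 0       # requests admitted so far
        self.per_user = {}      # user_id -> admitted request count
        self.waiting = deque()  # admitted but not yet released
        self.next_release = clock()

    def submit(self, user_id, request) -> bool:
        """Return True if the request is queued, False if rejected early."""
        seen = self.per_user.get(user_id, 0)
        if seen >= self.per_user_limit:
            return False  # per-user limit hit
        if self.accepted >= self.total_cap:
            return False  # cap reached: fail fast without touching the backend
        self.per_user[user_id] = seen + 1
        self.accepted += 1
        self.waiting.append(request)
        return True

    def release(self) -> list:
        """Return the next batch if the interval has elapsed, else an empty list."""
        now = self.clock()
        if now < self.next_release or not self.waiting:
            return []
        self.next_release = now + self.batch_interval
        count = min(self.batch_size, len(self.waiting))
        return [self.waiting.popleft() for _ in range(count)]
```

A gateway worker would call `submit` for each incoming purchase request and call `release` on a timer, forwarding each returned batch to the order service. With several gateway instances, the counters would need to live in shared storage rather than in process memory.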

Backend service design: When inventory is small, the few requests that pass the gateway put little pressure on backend services. When inventory is large, the bottleneck shifts to the database. The solution is to keep stock in Redis (sharded for hot items) and to process orders through an asynchronous message queue, writing them to the database in batches every 100 ms or every 100 orders.
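A minimal sketch of the batching consumer follows. It assumes the two thresholds are combined as "whichever comes first", which the text does not spell out. The `OrderBatcher` class and the `write_batch` callback are hypothetical names; in a real deployment `add` would be driven by the message‑queue consumer and `write_batch` would perform one bulk insert.

```python
import time


class OrderBatcher:
    """Buffer orders and flush them to the database in batches."""

    def __init__(self, write_batch, max_orders=100, max_wait=0.1,
                 clock=time.monotonic):
        self.write_batch = write_batch  # callable that does one bulk DB write
        self.max_orders = max_orders    # flush after this many orders ...
        self.max_wait = max_wait        # ... or after this many seconds
        self.clock = clock
        self.buffer = []
        self.first_at = None            # arrival time of the oldest buffered order

    def add(self, order):
        """Called for each order message taken off the queue."""
        if not self.buffer:
            self.first_at = self.clock()
        self.buffer.append(order)
        if len(self.buffer) >= self.max_orders:
            self.flush()

    def tick(self):
        """Call periodically; flushes once the oldest order has waited max_wait."""
        if self.buffer and self.clock() - self.first_at >= self.max_wait:
            self.flush()

    def flush(self):
        """Write everything buffered so far as a single batch."""
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        self.first_at = None
        self.write_batch(batch)
```

The size threshold bounds the cost of each database write, while the time threshold bounds how long an order can sit in the buffer when traffic is light.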

Isolation strategies:

Business isolation: Separate flash‑sale items from regular sales, and generate static pages and pre‑warm Redis stock before the event.

Deployment isolation: Deploy flash‑sale services and gateways on separate clusters and domains to prevent impact on normal services.

Data isolation: Use dedicated Redis and possibly separate databases for flash‑sale data; after the event, remaining stock is merged back into regular inventory.
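The stock lifecycle implied by these isolation steps (pre‑warm before the event, sell from sharded counters, merge the remainder back afterwards) can be sketched as follows. A plain dictionary stands in for the dedicated Redis instance, and the key layout `stock:<item>:<shard>` and all function names are assumptions made for illustration.

```python
import random


def split_stock(total: int, shards: int) -> list:
    """Divide total stock across shards as evenly as possible."""
    base, extra = divmod(total, shards)
    return [base + (1 if i < extra else 0) for i in range(shards)]


class FlashSaleStock:
    """In-memory stand-in for a dedicated flash-sale Redis instance."""

    def __init__(self):
        self.counters = {}  # key -> remaining units

    def prewarm(self, item_id: str, total: int, shards: int) -> None:
        """Load the item's stock into per-shard counters before the event."""
        for i, qty in enumerate(split_stock(total, shards)):
            self.counters[f"stock:{item_id}:{i}"] = qty

    def take_one(self, item_id: str, shards: int, rng=random) -> bool:
        """Try a random shard first, then the others; False means sold out."""
        start = rng.randrange(shards)
        for offset in range(shards):
            key = f"stock:{item_id}:{(start + offset) % shards}"
            # Against real Redis this check-and-decrement must be atomic.
            if self.counters.get(key, 0) > 0:
                self.counters[key] -= 1
                return True
        return False

    def drain_remaining(self, item_id: str, shards: int) -> int:
        """After the event, remove the shard counters and return what is left."""
        remaining = 0
        for i in range(shards):
            remaining += self.counters.pop(f"stock:{item_id}:{i}", 0)
        return remaining
```

The value returned by `drain_remaining` is what would be added back to the regular inventory record once the event ends.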

Network considerations: Coordinate with ISPs and CDN providers to reserve sufficient bandwidth before the flash sale.

Additional details to consider:

Avoid overselling by using atomic Redis DECR or conditional SQL updates.

Implement anti‑scraping limits per user at the gateway.

Apply overall gateway rate limiting to protect the system when traffic exceeds estimates.

Prevent duplicate orders through per‑user limits (e.g., one order per 10 minutes).

Integrate risk‑control to block malicious users.

Deploy firewalls or DDoS protection services at the gateway layer.


Tags: backend design, system architecture, Redis, CDN, high concurrency, rate limiting, flash sale