Design Strategies for High‑Concurrency Flash Sale (秒杀) Systems

The article examines the challenges of implementing flash‑sale (秒杀) functionality—such as time synchronization, bot prevention, and massive backend load—and presents two CDN‑centric architectural solutions, including probabilistic request routing and layered filtering with authentication, load balancing, rate limiting, and caching.

IT Architects Alliance

Flash Sale Scenario

Flash sales (秒杀) involve extremely high traffic spikes, often reaching millions of requests in a short time, while requiring strict inventory consistency to prevent overselling.

Key challenges include synchronizing countdown timers across clients, preventing automated bot purchases, and ensuring backend servers can handle the surge.
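One common way to keep countdown timers aligned is to have each client estimate how far its clock is from the server's and count down against the server's start time. The sketch below is an illustration only, not the article's method: the function names are hypothetical, and the estimate assumes the network delay is roughly the same in both directions.

```python
def estimate_offset(t_send: float, t_server: float, t_recv: float) -> float:
    """Estimate (server clock - client clock) in seconds.

    t_send / t_recv are client timestamps taken just before the request
    and just after the response; t_server is the time the server reported.
    Assumption: the request and the response take equally long.
    """
    midpoint = (t_send + t_recv) / 2.0
    return t_server - midpoint


def seconds_until_start(sale_start_server: float,
                        client_now: float,
                        offset: float) -> float:
    """Remaining countdown, measured against the server's clock."""
    return max(0.0, sale_start_server - (client_now + offset))
```

Whatever the client displays, the server still has to reject purchase requests that arrive before its own start time, since the client clock cannot be trusted.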

Design Considerations

Scaling hardware alone is impractical: the cost is high, and the capacity needed to absorb the peak is out of proportion to normal traffic. Instead, a distributed architecture that offloads work to edge nodes is necessary.

Technical Solutions

Solution 1 – CDN Edge Service with Probabilistic Routing

Deploy lightweight services on CDN edge nodes to serve static assets and act as an entry point for purchase requests. These services report the number of online users to a central data center.

When the sale starts, the data center sends each edge node a probability value (based on node traffic weight and a factor e). Each edge node then forwards a user's purchase request to the central system with a probability proportional to this value; all other requests receive a “sale ended” response directly from the edge.
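The article does not give the formula for the probability value, so the following is a minimal sketch under stated assumptions: the data center splits a backend capacity across nodes by traffic weight, scales it by the factor e, and divides by the number of online users each node reported. The names `admission_probabilities`, `handle_purchase`, `capacity`, and `weights` are hypothetical.

```python
import random


def admission_probabilities(online, weights, capacity, e=1.0):
    """Data-center side: compute one admission probability per edge node.

    online   -- {node: number of online users reported by that node}
    weights  -- {node: traffic weight of that node}
    capacity -- requests the central system can accept (assumed input)
    e        -- tuning factor; its exact role is an assumption here
    """
    total_weight = sum(weights.values())
    probs = {}
    for node, users in online.items():
        quota = capacity * e * weights[node] / total_weight
        probs[node] = 0.0 if users == 0 else min(1.0, quota / users)
    return probs


def handle_purchase(p, rng=random.random):
    """Edge side: forward the request with probability p, else reject."""
    return "forward" if rng() < p else "sale_ended"
```

With 1,000 users on one node and 3,000 on another, weights of 1 and 3, and a capacity of 400, both nodes receive a probability of 0.1, so roughly 400 requests in total reach the central system.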

Solution 2 – Layered Filtering and Rate Limiting

Use a CDN to serve static content, then authenticate incoming requests to filter out bots and unauthenticated users.
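The article does not specify an authentication scheme. One possible approach, sketched here as an assumption, is an HMAC-signed token issued at login that the filtering layer can verify without a database lookup. `SECRET_KEY`, `issue_token`, and `verify_token` are hypothetical names.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-secret"  # hypothetical; load from secure config in practice


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str, expires_at: int) -> str:
    """Issue a token of the form 'user_id:expires_at:signature'."""
    payload = f"{user_id}:{expires_at}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token: str, now: int) -> bool:
    """Accept only tokens with a valid signature that have not expired."""
    payload, sep, sig = token.rpartition(":")
    if not sep:
        return False
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(sig.encode(), _sign(payload).encode()):
        return False
    # The signature is valid, so the payload was produced by issue_token.
    _, _, expires_raw = payload.rpartition(":")
    return now < int(expires_raw)
```

A signed token only proves that the request comes from a logged-in account; filtering bots usually needs further signals, such as per-user rate limits or a challenge step.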

Authenticated requests are passed through a load balancer (e.g., LVS + Keepalived) to a cluster of Nginx servers, followed by a gateway layer that applies additional rate‑limiting and traffic‑shaping policies.
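The article does not name a rate-limiting algorithm for the gateway layer. A token bucket is one widely used choice; the sketch below assumes it, and the class name and parameters are hypothetical. It is not thread-safe as written.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if the request may pass, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A gateway would typically keep one bucket per route or per user and return an error or a "busy" page whenever `allow()` returns False.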

If traffic still threatens the database, employ service‑level throttling, graceful degradation, and cache pre‑warming. Orders are placed into a task queue and processed asynchronously, which keeps database writes consistent and gives the system a place to handle payment timeouts.
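The queueing step can be sketched as follows. This is a simplified in-process illustration, assuming a single worker, an in-memory stock counter, and a `None` sentinel to stop the worker; a real deployment would use a message broker and a database. All names are hypothetical.

```python
import queue
import threading


class Inventory:
    """Stock counter whose check-and-decrement is atomic, so it cannot oversell."""

    def __init__(self, stock):
        self._stock = stock
        self._lock = threading.Lock()

    def try_reserve(self):
        with self._lock:
            if self._stock > 0:
                self._stock -= 1
                return True
            return False

    def release(self):
        with self._lock:
            self._stock += 1

    @property
    def remaining(self):
        with self._lock:
            return self._stock


def process_orders(orders, inventory, results):
    """Worker loop: drain the queue until the None sentinel arrives."""
    while True:
        user_id = orders.get()
        if user_id is None:
            orders.task_done()
            break
        results[user_id] = "reserved" if inventory.try_reserve() else "sold_out"
        orders.task_done()


def expire_unpaid(user_id, results, inventory):
    """Return stock to the pool when a reserved order was not paid in time."""
    if results.get(user_id) == "reserved":
        results[user_id] = "expired"
        inventory.release()
```

Because orders are handled one at a time from the queue, the stock check and the decrement never race with each other, and a payment timeout simply puts the unit back for the next buyer.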

Related Scenarios

The same principles apply to other high‑traffic services such as train ticket booking (12306), where releasing tickets in batches or taking pre‑sale reservations can spread out the peak load.

Conclusion

Effective flash‑sale systems combine edge computing (CDN), multi‑layer request filtering, load balancing, rate limiting, and caching to protect backend resources while maintaining a consistent user experience.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
