Flash Sale System Architecture: From QPS Basics to Performance Optimization
This article explains core performance metrics such as QPS, TPS, and concurrency, introduces related indicators like PV, UV, DAU/MAU, describes how to calculate system throughput, outlines testing perspectives, and provides practical design patterns and tips for building robust flash‑sale systems.
0x1. The Three Core Performance Metrics: QPS, TPS, and Concurrency
QPS – How many requests can the server handle per second?
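QPS (Queries Per Second) measures the server's raw request-handling capacity. The actual figure depends heavily on hardware and on how much work each request does: as a rough illustration, a high-capacity server may process on the order of 100,000 simple requests per second, while a typical server handles about 10,000.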
TPS – How many complete transactions can be finished per second?
TPS (Transactions Per Second) counts complete end-to-end operations, from the user's request to the final response. Unlike a query, a single transaction may involve several queries (e.g., fetching a menu, calculating the price, checking inventory).
Take ordering food as one transaction: place order → kitchen prepares → delivery → receipt.
Along the way the system may issue several queries: fetch the menu, compute the price, check stock, and so on.
One transaction can therefore account for several queries, just as one food order may generate multiple messages.
Concurrency – How many requests can be processed simultaneously?
Concurrency represents the number of simultaneous requests the system can handle, analogous to the number of customers a popular tea shop can serve at once without chaos.
0x2. Other Important Indicators: PV, UV, DAU/MAU
PV – Page Views
PV counts every page load, including refreshes, so heavy "refreshers" can inflate this number.
UV – Unique Visitors
UV counts distinct visitors. Visiting the same page ten times yields PV=10 but UV=1.
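The difference between the two counts can be sketched in a few lines of Python. The access log below is made up for illustration, and identifying a visitor by a simple ID string is an assumption; real systems typically rely on cookies, device IDs, or login accounts.

```python
# Hypothetical access log: one (visitor_id, page) pair per page load.
access_log = [
    ("alice", "/item/42"),
    ("alice", "/item/42"),  # a refresh counts as another page view
    ("bob", "/item/42"),
    ("alice", "/cart"),
]


def page_views(log):
    """PV: every page load counts, including repeats by the same visitor."""
    return len(log)


def unique_visitors(log):
    """UV: each distinct visitor counts once, however many pages they load."""
    return len({visitor for visitor, _page in log})


pv = page_views(access_log)       # 4 page loads in total
uv = unique_visitors(access_log)  # 2 distinct visitors (alice, bob)
```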
DAU/MAU – Daily/Monthly Active Users
DAU: number of distinct users who are active (for example, log in or open the app) on a given day.
MAU: number of distinct users who are active at least once during a month.
A high DAU/MAU ratio indicates strong user stickiness and genuine activity.
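As a rough sketch, the ratio can be computed from activity records like this. The records are invented, and treating "month" as a calendar month is an assumption; some teams use a trailing 30-day window instead.

```python
from datetime import date

# Hypothetical activity records: (user_id, day on which the user was active).
activity = [
    ("alice", date(2024, 5, 1)),
    ("bob", date(2024, 5, 1)),
    ("alice", date(2024, 5, 2)),
    ("carol", date(2024, 5, 20)),
]


def dau(records, day):
    """Distinct users active on the given day."""
    return len({user for user, d in records if d == day})


def mau(records, year, month):
    """Distinct users active at least once in the given calendar month."""
    return len({user for user, d in records if (d.year, d.month) == (year, month)})


def stickiness(records, day):
    """DAU/MAU ratio for the day's calendar month; 0.0 if nobody was active."""
    monthly = mau(records, day.year, day.month)
    return dau(records, day) / monthly if monthly else 0.0


# On 2024-05-01 two of the month's three active users showed up.
ratio = stickiness(activity, date(2024, 5, 1))
```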
0x3. System Throughput: How Resilient Is Your Service?
Throughput is the number of requests processed per unit time, determined by three parameters:
QPS/TPS – requests per second.
Concurrency – simultaneous capacity.
Response time – how long a single request takes.
The relationship can be expressed as:
QPS = concurrency / average_response_time
Example: a tea shop with 5 staff (concurrency) takes 30 seconds to make a cup (response time). Its throughput is 5 / 30 ≈ 0.17 cups per second, which is 5 / (30/60) = 10 cups per minute.
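The tea shop arithmetic can be checked with a short sketch. The function name and the web-service numbers (200 workers, 50 ms) are illustrative assumptions. The formula is a rearrangement of Little's Law and only holds while every worker is kept busy.

```python
def throughput_per_second(concurrency, avg_response_time_s):
    """Steady-state throughput = concurrency / average response time (seconds)."""
    if avg_response_time_s <= 0:
        raise ValueError("average response time must be positive")
    return concurrency / avg_response_time_s


# Tea shop: 5 staff, 30 seconds per cup.
cups_per_second = throughput_per_second(5, 30)
cups_per_minute = cups_per_second * 60  # about 10 cups per minute

# Hypothetical web service: 200 concurrent workers, 50 ms average response time.
service_qps = throughput_per_second(200, 0.05)  # about 4,000 requests per second
```

Read the other way, the same formula shows why latency matters: if the average response time doubles under load, throughput at the same concurrency halves.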
When load exceeds capacity, response time grows and the system may become unstable, similar to a crowded shop where staff start to falter.
0x4. Performance Testing: Giving Your System a Health Check
User Perspective – Speed Matters
Users care only about the time from clicking a button to seeing the result; delays of more than a few seconds already lead to negative feedback.
Administrator Perspective – Avoid Resource Exhaustion
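CPU and memory usage should stay below saturation, with headroom for traffic spikes.
Databases must not be "overworked", for example by slow queries or exhausted connection pools.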
The system must survive peak traffic (e.g., Double‑11 shopping festival) without crashing.
Developer Perspective – No Hidden Pitfalls
Architecture should be clean, not a “spaghetti” codebase.
Database queries need proper indexing.
Memory leaks must be detected and eliminated.
0x5. Key Design Points for Flash‑Sale Systems
1. Layered Throttling – Queue Requests Like a Popular Store
Frontend: use captchas or quizzes to filter bots.
Middle layer: employ message queues to buffer and order requests.
Backend: apply rate‑limiting to prevent sudden overload.
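One common way to implement the backend rate limit is a token bucket. The sketch below is a minimal single-process version under stated assumptions: the class name and parameters are hypothetical, it is not thread-safe, and a real deployment would usually enforce the limit at a gateway or in a shared store.

```python
import time


class TokenBucket:
    """Allow up to `rate` requests per second, with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum tokens the bucket can hold
        self.tokens = float(capacity)
        self.clock = clock            # injectable so the bucket can be tested
        self.last = clock()

    def allow(self):
        """Return True and consume one token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should reject or queue the request
```

A request handler would call `allow()` first and return a "busy, try again" response when it yields `False`, so that excess traffic is shed before it reaches inventory or order logic.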
2. Cache Warm‑up – Prepare Hot Items in Advance
Load popular product data into cache ahead of the sale, similar to pre‑making pearls for tea.
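A warm-up step might look like the following sketch. A plain dict stands in for the cache (Redis or Memcached in practice), and `load_item` is a hypothetical function that reads one product from the database; the key format is also an assumption.

```python
def warm_up_cache(cache, hot_item_ids, load_item):
    """Before the sale starts, copy each hot item from the database into the cache."""
    for item_id in hot_item_ids:
        key = f"item:{item_id}"
        if key not in cache:
            cache[key] = load_item(item_id)
    return cache


def get_item(cache, item_id, load_item):
    """Cache-aside read: serve from the cache, fall back to the loader on a miss."""
    key = f"item:{item_id}"
    if key not in cache:
        cache[key] = load_item(item_id)
    return cache[key]
```

With the hot items loaded in advance, the first wave of sale traffic hits the cache instead of all missing at once and falling through to the database.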
3. Stock Deduction – Ensure Atomicity with Redis + Lua
Use a Redis Lua script so that checking the remaining stock and decrementing it happen as one atomic step. This prevents overselling, where the same unit is sold to multiple users simultaneously.
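A minimal sketch of such a script, wrapped in Python, is shown below. It assumes a redis-py style client whose `eval(script, numkeys, *keys_and_args)` method runs the Lua script on the server; the `stock:<item_id>` key format and the return codes are illustrative choices, not a fixed convention.

```python
# Redis runs a Lua script as a single atomic unit, so no other command
# can slip in between the stock check and the decrement.
DEDUCT_STOCK_LUA = """
local stock = tonumber(redis.call('GET', KEYS[1]))
if stock == nil then
    return -1  -- stock key missing: item not loaded
end
local qty = tonumber(ARGV[1])
if stock < qty then
    return 0   -- not enough stock left
end
redis.call('DECRBY', KEYS[1], qty)
return 1       -- deduction succeeded
"""


def deduct_stock(client, item_id, quantity=1):
    """Try to reserve `quantity` units; return True on success, False if sold out."""
    key = f"stock:{item_id}"  # hypothetical key naming scheme
    result = client.eval(DEDUCT_STOCK_LUA, 1, key, quantity)
    if result == -1:
        raise KeyError(f"no stock record for item {item_id}")
    return result == 1
```

Calling `deduct_stock(client, 42)` from many request handlers at once is then safe: at most as many calls succeed as there are units in `stock:42`.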
4. Static Separation – Pre‑render Unchanging Content
Generate static parts of the product detail page in advance; fetch dynamic data (stock, price) at request time.
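The split can be sketched as two functions: one that renders the unchanging part of the page once, and one that returns only the live values per request. The template, field names, and lookup callables are all hypothetical.

```python
from html import escape
from string import Template

# Static shell: rendered once before the sale and served by a CDN or file server.
PAGE_TEMPLATE = Template(
    "<h1>$name</h1><p>$description</p>"
    '<div id="live" data-item="$item_id"></div>'
)


def prerender_static_page(item):
    """Build the unchanging part of the product page ahead of time."""
    return PAGE_TEMPLATE.substitute(
        item_id=escape(str(item["id"])),
        name=escape(item["name"]),
        description=escape(item["description"]),
    )


def dynamic_fragment(item_id, stock_lookup, price_lookup):
    """Small per-request payload; the page's script fetches it to fill the live div."""
    return {
        "item_id": item_id,
        "stock": stock_lookup(item_id),
        "price": price_lookup(item_id),
    }
```

Only `dynamic_fragment` runs on the hot path during the sale, so each request costs a couple of lookups rather than a full page render.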
0x6. Practical Tips
Stress testing must be aggressive: simulate real traffic to avoid discovering weaknesses after launch.
Monitoring must be comprehensive: maintain dashboards for all critical metrics.
Rate limiting must be stable: like subway crowd control, it's better to slow down than to let the system explode.
Graceful degradation must be fast: disable non-essential features first to keep the core service alive.
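The degradation tip can be sketched as a simple load-based feature switch. The feature names and thresholds below are invented for illustration; in practice the switches usually live in a configuration service so they can be flipped without a redeploy.

```python
# Hypothetical non-essential features and the load ratio at which each is shed.
# Load ratio = current load / rated capacity.
SHED_THRESHOLDS = {
    "recommendations": 0.7,
    "reviews": 0.8,
    "browsing_history": 0.9,
}


def enabled_features(load_ratio):
    """Return the set of features to keep on at the given load ratio."""
    features = {"checkout"}  # the core flow stays on at every load level
    for name, limit in SHED_THRESHOLDS.items():
        if load_ratio < limit:
            features.add(name)
    return features
```

At a load ratio of 0.75, for example, recommendations are switched off while reviews, browsing history, and checkout remain available.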
0x7. Final Recommendations
Ensure sufficient QPS (capacity).
Maintain efficient TPS (service flow).
Identify genuine users (UV).
Never let the system crash under load.
Remember, there is no universally “best” architecture—only the one that fits your specific scenario.
Architect's Journey
E‑commerce, SaaS, AI architect; DDD enthusiast; SKILL enthusiast
