
Flash Sale System Architecture: From QPS Basics to Performance Optimization

This article explains core performance metrics such as QPS, TPS, and concurrency, introduces related indicators like PV, UV, DAU/MAU, describes how to calculate system throughput, outlines testing perspectives, and provides practical design patterns and tips for building robust flash‑sale systems.

Architect's Journey

Sep 11, 2025


0x1. The Three Core Performance Metrics: QPS, TPS, and Concurrency

QPS – How many requests can the server handle per second?

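QPS (Queries Per Second) measures how many requests a server can handle each second. As a rough illustration, a high-capacity server may process around 100,000 requests per second, while a typical server handles about 10,000; actual figures depend heavily on the hardware and on how much work each request requires.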

TPS – How many complete transactions can be finished per second?

TPS (Transactions Per Second) counts complete end-to-end operations, from the user's request to the final response. Unlike a query, a single transaction may involve several queries (e.g., fetching a menu, calculating the price, checking inventory).

Ordering food (TPS): place order → kitchen prepares → delivery → receipt.

During the process the system may issue several queries: look up the menu, compute the price, check stock, and so on.

Thus one transaction can contain several queries, just as one food order may generate multiple messages.
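To make the relationship concrete, here is a small sketch that estimates the query load produced by a given transaction rate. The function name and the figure of four queries per order are illustrative assumptions, not measurements from a real system.

```python
def required_qps(tps, queries_per_transaction):
    """Estimate the query load generated by a given transaction rate."""
    return tps * queries_per_transaction


# Assumed example: each order triggers 4 queries
# (menu lookup, price calculation, stock check, order write).
print(required_qps(tps=500, queries_per_transaction=4))  # 2000
```

The point of the calculation is that backend components must be sized for the query rate, not only for the transaction rate.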

Concurrency – How many requests can be processed simultaneously?

Concurrency represents the number of simultaneous requests the system can handle, analogous to the number of customers a popular tea shop can serve at once without chaos.

0x2. Other Important Indicators: PV, UV, DAU/MAU

PV – Page Views

PV counts every page load, including refreshes, so heavy “refreshers” can inflate this number.

UV – Unique Visitors

UV counts distinct visitors. Visiting the same page ten times yields PV=10 but UV=1.
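The difference between the two counters can be sketched from an access log. The log format below (one visitor ID and page per page load) is a hypothetical simplification; real systems usually identify visitors by cookie, device ID, or account.

```python
from collections import Counter

# Hypothetical access log: one (visitor_id, page) pair per page load.
access_log = [
    ("alice", "/item/1"),
    ("alice", "/item/1"),
    ("bob", "/item/1"),
    ("alice", "/cart"),
]

pv = len(access_log)                              # every page load counts
uv = len({visitor for visitor, _ in access_log})  # distinct visitors only
loads_per_visitor = Counter(visitor for visitor, _ in access_log)

print(pv, uv)                      # 4 2
print(loads_per_visitor["alice"])  # 3
```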

DAU/MAU – Daily/Monthly Active Users

DAU: number of distinct users who are active on a given day.

MAU: number of distinct users who are active at least once during a month.

A high DAU/MAU ratio indicates strong user stickiness and genuine activity.
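One common way to compute the ratio is to divide the average DAU over the month by the MAU. The sketch below follows that convention; the event format and the function name are assumptions made for illustration, and other teams may define the ratio differently (for example, using a single day's DAU).

```python
import calendar
from collections import defaultdict
from datetime import date


def dau_mau_ratio(events, year, month):
    """Average daily active users in a month divided by monthly active users."""
    daily = defaultdict(set)
    for user_id, day in events:
        if day.year == year and day.month == month:
            daily[day].add(user_id)
    monthly_users = set().union(*daily.values()) if daily else set()
    if not monthly_users:
        return 0.0
    days_in_month = calendar.monthrange(year, month)[1]
    average_dau = sum(len(users) for users in daily.values()) / days_in_month
    return average_dau / len(monthly_users)


# Hypothetical activity records: (user_id, day on which the user was active).
events = [
    ("u1", date(2025, 9, 1)), ("u1", date(2025, 9, 2)), ("u1", date(2025, 9, 3)),
    ("u2", date(2025, 9, 1)),
    ("u3", date(2025, 9, 3)),
    ("u4", date(2025, 9, 20)),
]
ratio = dau_mau_ratio(events, 2025, 9)
print(f"{ratio:.2f}")
```

A ratio close to 1 would mean that almost every monthly user shows up every day; a ratio close to 0 means most users appear only occasionally.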

0x3. System Throughput: How Resilient Is Your Service?

Throughput is the number of requests processed per unit time, determined by three parameters:

QPS/TPS – requests per second.

Concurrency – simultaneous capacity.

Response time – how long a single request takes.

The relationship can be expressed as:

QPS = concurrency / average_response_time

Example: a tea shop with 5 staff (concurrency) takes 30 seconds to make a cup (response time). Its throughput is 5 / 30 ≈ 0.17 cups per second, or equivalently 5 / (30/60) = 10 cups per minute.
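The same arithmetic can be written as a small helper. This is a minimal sketch of the steady-state relationship (a form of Little's Law); the function name and the web-service numbers are hypothetical.

```python
def throughput_per_second(concurrency, avg_response_time_s):
    """Steady-state throughput: requests in flight divided by time per request."""
    if avg_response_time_s <= 0:
        raise ValueError("average response time must be positive")
    return concurrency / avg_response_time_s


# Tea shop from the text: 5 staff, 30 seconds per cup.
cups_per_second = throughput_per_second(concurrency=5, avg_response_time_s=30)
print(round(cups_per_second * 60))  # 10 cups per minute

# Hypothetical web service: 200 workers, 250 ms average response time.
print(throughput_per_second(200, 0.25))  # 800.0
```

The formula also shows the two levers available: add concurrency (more workers) or cut response time (faster requests).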

When load exceeds capacity, response time grows and the system may become unstable, similar to a crowded shop where staff start to falter.

0x4. Performance Testing: Giving Your System a Health Check

User Perspective – Speed Matters

Users only care about the time from clicking a button to seeing the result; delays beyond ~30 seconds lead to negative feedback.

Administrator Perspective – Avoid Resource Exhaustion

CPU and memory should not be overloaded.

Databases must not be pushed past their limits (exhausted connections, slow queries, lock contention).

The system must survive peak traffic (e.g., Double‑11 shopping festival) without crashing.

Developer Perspective – No Hidden Pitfalls

Architecture should be clean, not a “spaghetti” codebase.

Database queries need proper indexing.

Memory leaks must be detected and eliminated.
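The indexing point is easy to check in practice by asking the database for its query plan. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the `orders` table and index name are made up for illustration, and the exact plan wording varies between SQLite versions and between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT)"
)
conn.executemany(
    "INSERT INTO orders (user_id, item) VALUES (?, ?)",
    [(i % 100, f"item-{i}") for i in range(1000)],
)


def query_plan(sql):
    """Return SQLite's textual plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


lookup = "SELECT * FROM orders WHERE user_id = 42"
plan_without_index = query_plan(lookup)  # expected: a full table scan
conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
plan_with_index = query_plan(lookup)     # expected: a search using the index

print(plan_without_index)
print(plan_with_index)
```

Running a check like this against the slowest queries before a sale is cheaper than discovering a full table scan under peak load.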

0x5. Key Design Points for Flash‑Sale Systems

1. Layered Throttling – Queue Requests Like a Popular Store

Frontend: use captchas or quizzes to filter bots.

Middle layer: employ message queues to buffer and order requests.

Backend: apply rate‑limiting to prevent sudden overload.
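For the backend layer, one widely used rate-limiting technique is the token bucket. The article does not say which algorithm to use, so the following is only a minimal single-process sketch; the class name, parameters, and injectable clock are assumptions, and it is not thread-safe as written.

```python
import time


class TokenBucket:
    """Admit at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum tokens that can accumulate
        self.clock = clock          # injectable so tests can control time
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill according to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=100, capacity=10)
accepted = sum(bucket.allow() for _ in range(15))
print(accepted)  # roughly the burst capacity; the remaining requests are rejected
```

Rejected requests can be answered immediately with a "try again" response, which keeps the overload at the edge instead of letting it reach the database.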

2. Cache Warm‑up – Prepare Hot Items in Advance

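Load popular product data into the cache before the sale starts, so that the first wave of requests does not fall through to the database. It is like preparing tapioca pearls before the tea shop opens.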

3. Stock Deduction – Ensure Atomicity with Redis + Lua

Redis executes a Lua script atomically, so the stock check and the decrement happen as a single step. This prevents overselling: two users cannot both buy the last remaining item.
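A check-and-decrement script might look like the sketch below. The Lua source, the return codes, and the key name are assumptions made for illustration; `client` is assumed to expose the `eval(script, numkeys, *keys_and_args)` method that redis-py clients provide, and a real deployment would also need a running Redis server.

```python
# Lua source that Redis runs atomically.
# KEYS[1] is the stock key, ARGV[1] the quantity to deduct.
DEDUCT_STOCK_LUA = """
local raw = redis.call('GET', KEYS[1])
if not raw then return -1 end          -- stock key does not exist
local stock = tonumber(raw)
local qty = tonumber(ARGV[1])
if stock < qty then return 0 end       -- not enough stock left
redis.call('DECRBY', KEYS[1], qty)
return 1
"""


def deduct_stock(client, stock_key, quantity=1):
    """Return True only if the stock was decremented.

    `client` is assumed to behave like a redis-py client, i.e. to expose
    eval(script, numkeys, *keys_and_args).
    """
    result = client.eval(DEDUCT_STOCK_LUA, 1, stock_key, quantity)
    return result == 1
```

With redis-py installed, a call could look like `deduct_stock(redis.Redis(), "stock:sku-1")`, where the key name is hypothetical. Because the check and the decrement run inside one script, no other command can slip in between them.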

4. Static Separation – Pre‑render Unchanging Content

Generate static parts of the product detail page in advance; fetch dynamic data (stock, price) at request time.
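The split can be sketched as two functions: one that runs once ahead of the sale and one that runs per request. The template, field names, and product data are invented for illustration; in practice the static page would typically be pushed to a CDN and the dynamic part exposed as a small JSON endpoint.

```python
import html
import json
from string import Template

PAGE_TEMPLATE = Template(
    "<html><body><h1>$name</h1><p>$description</p>"
    '<div id="live-data" data-product-id="$product_id"></div>'
    "</body></html>"
)


def prerender_static_page(product):
    """Run ahead of the sale; the output contains no stock or price."""
    return PAGE_TEMPLATE.substitute(
        name=html.escape(product["name"]),
        description=html.escape(product["description"]),
        product_id=html.escape(str(product["id"])),
    )


def dynamic_fragment(product_id, live_state):
    """Run per request; returns only the fields that change during the sale."""
    state = live_state[product_id]
    return json.dumps(
        {
            "product_id": product_id,
            "stock": state["stock"],
            "price_cents": state["price_cents"],
        }
    )


product = {"id": 101, "name": "Milk Tea", "description": "Limited flash-sale batch"}
static_page = prerender_static_page(product)
live_state = {101: {"stock": 42, "price_cents": 990}}
print(dynamic_fragment(101, live_state))
```

Only the small JSON fragment has to be generated under load; the heavy HTML is served as a static file.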

0x6. Practical Tips

Stress testing must be aggressive: simulate real traffic to avoid discovering weaknesses after launch.

Monitoring must be comprehensive: maintain dashboards for all critical metrics.

Rate limiting must be stable: like subway control, it’s better to slow down than to let the system explode.

Graceful degradation must be fast: disable non‑essential features first to keep the core service alive.
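As one possible shape for the degradation tip, features can be switched off in a fixed order as load approaches capacity. The feature names and the load thresholds below are illustrative assumptions; real thresholds should come from stress-test results.

```python
# Non-essential features, listed in the order they are shed (first to go first).
DEGRADATION_ORDER = ["recommendations", "reviews", "order_history"]
CORE_FEATURES = {"checkout"}  # never disabled


def enabled_features(load_ratio):
    """Return the features to serve at a given load (current QPS / capacity)."""
    if load_ratio < 0.7:
        shed = 0                       # normal operation
    elif load_ratio < 0.85:
        shed = 1                       # drop the least essential feature
    elif load_ratio < 1.0:
        shed = 2
    else:
        shed = len(DEGRADATION_ORDER)  # at or over capacity: core only
    return CORE_FEATURES | set(DEGRADATION_ORDER[shed:])


print(sorted(enabled_features(0.5)))
print(sorted(enabled_features(1.2)))  # ['checkout']
```

Deciding the shedding order in advance is what makes degradation fast: under pressure, the system only has to read a threshold, not make a judgment call.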

0x7. Final Recommendations

Ensure sufficient QPS (capacity).

Maintain efficient TPS (service flow).

Identify genuine users (UV).

Never let the system crash under load.

Remember, there is no universally “best” architecture—only the one that fits your specific scenario.


