Understanding QPS, TPS, RT, Concurrency, and Throughput in Backend Systems
This article defines the key performance metrics of backend systems (QPS, TPS, response time, concurrency, and throughput), explains how they relate to one another, works through practical calculations, and gives guidance on choosing a thread count that balances throughput against response time.
1. QPS (Queries Per Second)
QPS stands for Queries Per Second, which measures how many queries a server can handle each second and is commonly used to evaluate the performance of DNS and other query‑driven services.
2. TPS (Transactions Per Second)
TPS is the abbreviation of Transactions Per Second, representing the number of complete request‑response transactions processed per second; a transaction begins when a client sends a request and ends when the server replies.
QPS vs TPS: a single page visit counts as one transaction (1 toward TPS), but serving it may trigger multiple queries to the server, each of which counts toward QPS.
3. RT (Response Time)
Response time (RT) is the total time from when a client initiates a request to when it receives the server’s response, and it is a crucial indicator of system speed.
4. Concurrency
Concurrency denotes the number of requests a system can handle simultaneously, reflecting its load‑handling capacity.
5. Throughput
Throughput (system capacity) is closely related to CPU consumption per request, external interfaces, I/O speed, etc.; the main parameters influencing it are QPS/TPS, concurrency, and response time.
QPS (TPS): number of requests/transactions per second
Concurrency: number of simultaneous requests/transactions
Response Time: usually the average response time
Understanding these three elements allows you to derive their inter‑relationships:
QPS (TPS) = Concurrency / Average Response Time (in seconds)
Concurrency = QPS * Average Response Time (in seconds)
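The two formulas above can be checked with a few lines of Python. This is a minimal sketch: the function names and the sample figures (20 in-flight requests, 50 ms average RT) are hypothetical and chosen only for illustration. Response time is passed in milliseconds and converted to seconds inside the functions.

```python
def qps_from_concurrency(concurrency: float, avg_rt_ms: float) -> float:
    """QPS = Concurrency / Average Response Time (RT given in ms)."""
    if avg_rt_ms <= 0:
        raise ValueError("average response time must be positive")
    return concurrency * 1000 / avg_rt_ms


def concurrency_from_qps(qps: float, avg_rt_ms: float) -> float:
    """Concurrency = QPS * Average Response Time (RT given in ms)."""
    return qps * avg_rt_ms / 1000


# Hypothetical sample: 20 requests in flight, 50 ms average response time.
qps = qps_from_concurrency(20, 50)
print(qps)                            # 400.0
print(concurrency_from_qps(qps, 50))  # 20.0
```

Feeding the computed QPS back into the second function returns the original concurrency, which shows that the two formulas are rearrangements of the same relationship.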
6. Practical Example
Assuming 80% of daily traffic occurs during 20% of the day (peak period), the peak QPS can be calculated as:
Formula: (Total PV * 80%) / (Seconds per day * 20%) = Peak QPS
Machines needed: Peak QPS / QPS per machine = Number of machines
Example calculations:
(3000000 * 0.8) / (86400 * 0.2) ≈ 139 (QPS)
If a single machine can handle 58 QPS, the required number of machines is:
139 / 58 ≈ 2.4, rounded up to 3 machines
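The same capacity estimate can be sketched in Python. The function names and default parameters are illustrative assumptions; the figures (3,000,000 PV per day, 58 QPS per machine) come from the example above. The machine count is rounded up, since a fractional machine cannot be deployed.

```python
import math

SECONDS_PER_DAY = 86_400


def peak_qps(total_pv: float, traffic_share: float = 0.8,
             time_share: float = 0.2) -> float:
    """Peak QPS = (Total PV * traffic share) / (seconds per day * time share)."""
    return (total_pv * traffic_share) / (SECONDS_PER_DAY * time_share)


def machines_needed(peak: float, qps_per_machine: float) -> int:
    """Number of machines, rounded up to the next whole machine."""
    return math.ceil(peak / qps_per_machine)


peak = peak_qps(3_000_000)
print(round(peak))                # 139
print(machines_needed(peak, 58))  # 3
```

This estimate leaves no headroom for failures or traffic spikes beyond the assumed peak, so a real deployment would usually add spare capacity on top of it.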
7. Optimal Thread Count, QPS, and RT
1. Single‑thread QPS formula
QPS = 1000ms / RT
For an RT of 80 ms, QPS = 1000 / 80 = 12.5. With two threads, QPS doubles to 2 * (1000 / 80) = 25. In this idealized model QPS grows linearly with thread count; in practice, growth flattens once a shared resource such as CPU, locks, or I/O becomes the bottleneck.
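A small Python sketch of this idealized model follows. The function name `ideal_qps` is hypothetical, and the model assumes that RT stays constant as threads are added, which only holds until a resource saturates.

```python
def ideal_qps(rt_ms: float, threads: int = 1) -> float:
    """Idealized QPS: each thread completes 1000 / RT requests per second."""
    if rt_ms <= 0 or threads < 1:
        raise ValueError("rt_ms must be positive and threads at least 1")
    return threads * 1000 / rt_ms


print(ideal_qps(80))     # 12.5
print(ideal_qps(80, 2))  # 25.0
```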
2. Real relationship between QPS and RT
[Figure: conceptual diagram of QPS vs. RT (illustrative)]
[Figure: measured relationship between QPS and RT]
3. Optimal thread count
Optimal thread count is the point where server resources are fully utilized without contention, calculated as:
Optimal Threads = ((Thread wait time + Thread CPU time) / Thread CPU time) * CPU core count
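As a rough illustration, the formula can be evaluated in Python. The function name and the sample figures (60 ms waiting, 20 ms on CPU, 4 cores) are hypothetical, and rounding to the nearest whole thread is a choice made here for simplicity. The result is a starting point for load testing rather than a final setting.

```python
def optimal_threads(wait_ms: float, cpu_ms: float, cores: int) -> int:
    """((wait time + CPU time) / CPU time) * cores, rounded to a whole thread."""
    if cpu_ms <= 0 or cores < 1:
        raise ValueError("cpu_ms must be positive and cores at least 1")
    raw = (wait_ms + cpu_ms) / cpu_ms * cores
    return max(1, round(raw))


# Hypothetical request: 60 ms waiting on I/O, 20 ms on CPU, 4 cores.
print(optimal_threads(60, 20, 4))  # 16
```

With no wait time the formula reduces to one thread per core, which matches the intuition that a purely CPU-bound workload gains nothing from extra threads.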
Characteristics:
Beyond the optimal thread count, QPS plateaus while response time increases; further increase eventually reduces QPS.
Each system has its own optimal thread count, which may vary under different conditions.
Bottleneck resources can be CPU, memory, locks, or I/O; exceeding the optimal thread count leads to resource contention and longer response times.
Selected Java Interview Questions
A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!