
Understanding QPS, TPS, RT, Concurrency, and Throughput in Backend Systems

This article explains the key performance metrics of backend systems—including QPS, TPS, response time, concurrency, and throughput—provides their definitions, relationships, practical calculation examples, and guidance on optimal thread counts for achieving balanced performance.

Selected Java Interview Questions

1. QPS (Queries Per Second)

QPS stands for Queries Per Second, which measures how many queries a server can handle each second and is commonly used to evaluate the performance of DNS and other query‑driven services.

2. TPS (Transactions Per Second)

TPS stands for Transactions Per Second: the number of complete request‑response transactions processed per second. A transaction begins when a client sends a request and ends when the server replies.

QPS vs. TPS: a single page visit counts as one transaction (and so contributes to TPS), but the same visit may trigger several queries to the server, each of which counts toward QPS.

3. RT (Response Time)

Response time (RT) is the total time from when a client initiates a request to when it receives the server’s response, and it is a crucial indicator of system speed.

4. Concurrency

Concurrency denotes the number of requests a system can handle simultaneously, reflecting its load‑handling capacity.

5. Throughput

Throughput (system capacity) is closely related to CPU consumption per request, external interfaces, I/O speed, etc.; the main parameters influencing it are QPS/TPS, concurrency, and response time.

QPS (TPS): number of requests/transactions per second

Concurrency: number of simultaneous requests/transactions

Response Time: usually the average response time

Understanding these three elements allows you to derive their inter‑relationships:

QPS (TPS) = Concurrency / Average Response Time

Concurrency = QPS * Average Response Time
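As a minimal sketch of the two formulas above, the following Python helpers convert between the three quantities. The function names and the sample numbers (100 concurrent requests, 0.25 s average response time) are illustrative assumptions, not values from the article; response time is taken in seconds so that the result is per second.

```python
def qps_from_concurrency(concurrency: float, avg_rt_seconds: float) -> float:
    """QPS (TPS) = Concurrency / Average Response Time (RT in seconds)."""
    if avg_rt_seconds <= 0:
        raise ValueError("average response time must be positive")
    return concurrency / avg_rt_seconds


def concurrency_from_qps(qps: float, avg_rt_seconds: float) -> float:
    """Concurrency = QPS * Average Response Time (RT in seconds)."""
    return qps * avg_rt_seconds


# Illustrative numbers only: 100 in-flight requests, 0.25 s average RT.
print(qps_from_concurrency(100, 0.25))  # 400.0
print(concurrency_from_qps(400, 0.25))  # 100.0
```

The two functions are inverses of each other for a fixed response time, which is a quick sanity check when estimating one metric from the other two.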

6. Practical Example

Assuming 80% of daily traffic occurs during 20% of the day (peak period), the peak QPS can be calculated as:

Formula: (Total PV * 80%) / (Seconds per day * 20%) = Peak QPS

Machines needed: Peak QPS / QPS per machine = Number of machines

Example calculations:

(3000000 * 0.8) / (86400 * 0.2) ≈ 139 (QPS)

If a single machine can handle 58 QPS, the required number of machines is:

139 / 58 ≈ 2.4, which rounds up to 3 machines
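The capacity estimate above can be reproduced with a short Python sketch. The function names and default parameters are assumptions made for illustration; the inputs (3,000,000 PV per day, the 80/20 rule, 58 QPS per machine) come from the example. The machine count is rounded up, since a fractional machine cannot be deployed.

```python
import math

SECONDS_PER_DAY = 86_400


def peak_qps(total_pv: float, traffic_share: float = 0.8,
             time_share: float = 0.2) -> float:
    """Peak QPS = (Total PV * traffic share) / (seconds per day * time share)."""
    return (total_pv * traffic_share) / (SECONDS_PER_DAY * time_share)


def machines_needed(peak: float, qps_per_machine: float) -> int:
    """Number of machines, rounded up to the next whole machine."""
    return math.ceil(peak / qps_per_machine)


peak = peak_qps(3_000_000)
print(round(peak))                # 139
print(machines_needed(peak, 58))  # 3
```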

7. Optimal Thread Count, QPS, and RT

1. Single‑thread QPS formula

QPS = 1000 ms / RT (with RT in milliseconds)

For an RT of 80 ms, QPS = 1000 / 80 = 12.5. With two threads, QPS doubles to 2 * (1000 / 80) = 25. In this ideal model QPS grows linearly with thread count, though in practice resource limits cap that growth.
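The idealized linear model can be written as a one-line function. This is only a sketch of the formula in this section: the name `ideal_qps` is hypothetical, and the model deliberately ignores contention, so it gives an upper bound rather than a prediction.

```python
def ideal_qps(rt_ms: float, threads: int = 1) -> float:
    """Idealized QPS: each thread completes 1000 / RT requests per second."""
    if rt_ms <= 0:
        raise ValueError("response time must be positive")
    return threads * (1000.0 / rt_ms)


print(ideal_qps(80))     # 12.5
print(ideal_qps(80, 2))  # 25.0
```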

2. Real relationship between QPS and RT

[Figure: conceptual diagram (illustrative) of QPS vs. RT]

[Figure: actual measured relationship between QPS and RT]

3. Optimal thread count

Optimal thread count is the point where server resources are fully utilized without contention, calculated as:

Optimal Threads = ((Thread wait time + Thread CPU time) / Thread CPU time) * CPU core count
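A minimal Python sketch of this formula follows. The function name, the choice to round to the nearest whole thread, and the sample timings (60 ms waiting, 20 ms on CPU, 4 cores) are assumptions for illustration; real wait and CPU times would have to be measured on the system in question.

```python
import os
from typing import Optional


def optimal_threads(wait_ms: float, cpu_ms: float,
                    cores: Optional[int] = None) -> int:
    """((wait time + CPU time) / CPU time) * CPU core count."""
    if cpu_ms <= 0:
        raise ValueError("CPU time must be positive")
    if cores is None:
        # Fall back to the local machine's core count (assumption).
        cores = os.cpu_count() or 1
    # Rounding to a whole thread is a choice made here, not part of the formula.
    return max(1, round((wait_ms + cpu_ms) / cpu_ms * cores))


# Illustrative numbers only: 60 ms waiting, 20 ms computing, 4 cores.
print(optimal_threads(60, 20, cores=4))  # 16
```

With no waiting at all (a purely CPU-bound task), the formula reduces to one thread per core.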

Characteristics:

Beyond the optimal thread count, QPS plateaus while response time increases; further increase eventually reduces QPS.

Each system has its own optimal thread count, which may vary under different conditions.

Bottleneck resources can be CPU, memory, locks, or I/O; exceeding the optimal thread count leads to resource contention and longer response times.
