Determining the Optimal Thread Pool Size Based on CPU Utilization
This article explains how to size a thread pool by understanding CPU core limits, I/O wait effects, empirical testing on a multi‑core machine, and a practical formula, while emphasizing that real‑world workloads require iterative performance testing to find the best thread count.
Many people have seen a simple rule of thumb for thread pool size: number of cores + 1 for CPU‑bound programs and number of cores × 2 for I/O‑bound programs. This only holds in ideal cases; real applications must account for many additional factors.
Conclusion
There is no fixed answer for thread‑pool size; it must be determined through testing that reflects the actual workload.
Theoretical Basis
At any moment a CPU core can execute instructions from only one thread.
An "extreme" compute‑bound thread can fully utilize a core; a multi‑core CPU can run at most as many extreme threads as it has cores.
If the number of extreme threads exceeds the core count, the extra context switches add load and slow execution.
When a thread is waiting on I/O it does not occupy a core, so the OS can schedule other threads there, increasing overall utilization.
The higher the I/O frequency or the longer the wait, the more idle time the CPU has, allowing more threads to run concurrently.
Experimental Verification
Using a simple infinite loop on an AMD Ryzen 5 3600 (6 physical cores, 12 hardware threads, so 12 logical cores), we observed that a single thread can saturate one logical core. Adding threads up to the logical core count fully occupies every core, while going beyond that count does not increase CPU usage but raises the load average.
public class CPUUtilizationTest {
    public static void main(String[] args) {
        // Infinite loop doing nothing: keeps one core fully busy
        while (true) {
        }
    }
}
When we introduced a sleep to simulate I/O, per‑core utilization dropped to about 50 % with a single thread and rose to around 60 % with 12 threads. Increasing the thread count to 18 pushed each core close to 100 % utilization, showing that I/O‑bound work needs more threads than cores to keep the CPU busy.
public class CPUUtilizationTest {
    public static void main(String[] args) {
        // Starts one worker; start more of them to test higher thread counts.
        new Thread(() -> {
            while (true) {
                // Simulated computation: busy loop of 100,000,000 iterations
                for (int i = 0; i < 100_000_000; i++) {}
                // Simulated I/O wait: the thread gives up the CPU for 50 ms
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }).start();
    }
}
Thread Count vs. CPU Utilization
An extreme compute‑bound thread can fully utilize a core; a multi‑core CPU can run at most as many extreme threads as cores.
Running more extreme threads than cores causes excessive context switching and higher load without performance gain.
I/O‑wait periods free the CPU, allowing the OS to schedule additional threads and improve overall utilization.
The more frequent or longer the I/O waits, the greater the idle time, which again permits more concurrent threads.
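To repeat the experiment with different thread counts, a parameterized version of the test can help. The following is only a sketch: the class name LoadGenerator and its parameters are made up for illustration, and unlike the original test it spins for a fixed wall‑clock time rather than a fixed number of iterations, so the compute phase is an approximation once threads outnumber cores.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: runs N threads that alternate busy work and sleep. */
final class LoadGenerator {

    /**
     * Runs threadCount workers for roughly durationMs and returns the number
     * of compute+sleep cycles completed across all workers.
     */
    static long run(int threadCount, long busyMs, long sleepMs, long durationMs) {
        AtomicLong cycles = new AtomicLong();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMs);
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(() -> {
                while (System.nanoTime() < deadline) {
                    // Compute phase: spin until busyMs of wall-clock time has passed.
                    long busyUntil = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(busyMs);
                    while (System.nanoTime() < busyUntil) {
                        // intentionally empty
                    }
                    // Wait phase: sleeping stands in for blocking I/O.
                    try {
                        Thread.sleep(sleepMs);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    cycles.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return cycles.get();
    }

    public static void main(String[] args) {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 12;
        // Watch per-core usage in a system monitor while this runs for one minute.
        long done = run(threads, 50, 50, 60_000);
        System.out.println(threads + " threads completed " + done + " cycles");
    }
}
```

Running it with 1, 12, and 18 as the argument mirrors the thread counts used in the experiment; the CPU figures themselves still have to be read from the operating system's monitor.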
Formula for Planning Thread Count
According to "Java Concurrency in Practice", the number of threads needed to reach a target CPU utilization can be approximated by:
threads = number of cores × target CPU utilization × (1 + wait time / compute time)
For a 12‑core CPU aiming for 90 % utilization with a 50 ms sleep (representing I/O) and a 50 ms compute loop, the formula gives 12 × 0.9 × (1 + 50/50) = 21.6, or roughly 22 threads.
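The calculation is easy to wrap in a small helper. This is a minimal sketch: the class and method names are illustrative, and rounding to the nearest whole thread is an assumption made here, not something the book prescribes.

```java
/** Hypothetical helper for the sizing formula from "Java Concurrency in Practice". */
final class ThreadCountEstimator {

    /**
     * threads = cores * targetUtilization * (1 + waitMs / computeMs)
     *
     * @param cores             number of cores available to the application
     * @param targetUtilization desired CPU utilization, between 0 (exclusive) and 1
     * @param waitMs            average time a task spends waiting (I/O, sleep)
     * @param computeMs         average time a task spends computing
     */
    static int estimate(int cores, double targetUtilization, double waitMs, double computeMs) {
        if (cores <= 0 || computeMs <= 0 || waitMs < 0
                || targetUtilization <= 0 || targetUtilization > 1) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        double raw = cores * targetUtilization * (1 + waitMs / computeMs);
        // Rounding to the nearest integer is a choice made for this sketch.
        return (int) Math.max(1, Math.round(raw));
    }

    public static void main(String[] args) {
        System.out.println(estimate(12, 0.9, 50, 50)); // 21.6 rounds to 22
        System.out.println(estimate(12, 1.0, 0, 50));  // no waiting: 12
    }
}
```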
Thread Count in Real Applications
In practice, exact wait and compute times are hard to measure, and many other threads (Tomcat, HikariCP, JVM compiler, GC) already consume CPU resources. Therefore, the formula provides only a rough estimate; iterative testing is essential.
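One way to see which threads already exist in a running JVM is the standard ThreadMXBean. The sketch below groups live threads by name prefix; the class name and the grouping rule (stripping a trailing number) are assumptions made for illustration. It reports Java‑level threads only, so VM‑internal threads such as GC workers may not appear and are better inspected with a thread dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: counts live JVM threads grouped by name prefix. */
final class JvmThreadSurvey {

    static Map<String, Integer> countByNamePrefix() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> counts = new TreeMap<>();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue; // thread ended between the two calls
            }
            // "http-nio-8080-exec-7" and "http-nio-8080-exec-8" fall into one group.
            String group = info.getThreadName().replaceAll("[-#\\s]*\\d+$", "");
            counts.merge(group, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByNamePrefix().forEach((name, n) -> System.out.println(n + "\t" + name));
    }
}
```

Calling this from inside the application (for example from a diagnostic endpoint) shows how many web server and connection pool threads are already competing for the same cores.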
Typical Process
Identify other processes or services that may interfere on the host.
Inspect existing JVM threads (web server, connection pools, GC, etc.).
Define target metrics: desired CPU utilization, acceptable GC pause frequency, throughput requirements.
Analyze potential bottlenecks (e.g., third‑party service limits, connection pool sizes).
Gradually increase or decrease thread count, run performance tests, and settle on a value that meets the targets.
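The last step can be partly automated. The following sketch sweeps a few pool sizes over a synthetic task and reports throughput; the class name, the task shape (a spin followed by a sleep), and the candidate sizes are all assumptions. A real test should drive the actual workload and also record the other target metrics, such as CPU utilization and GC pauses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical harness: measures throughput of a synthetic task per pool size. */
final class PoolSizeSweep {

    /** Stand-in task: about computeMs of spinning, then waitMs of sleep. */
    static void simulatedTask(long computeMs, long waitMs) {
        long until = System.nanoTime() + computeMs * 1_000_000L;
        while (System.nanoTime() < until) {
            // intentionally empty
        }
        try {
            Thread.sleep(waitMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns completed tasks per second for a fixed pool of the given size. */
    static double throughput(int poolSize, int taskCount, long computeMs, long waitMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch done = new CountDownLatch(taskCount);
        long start = System.nanoTime();
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    try {
                        simulatedTask(computeMs, waitMs);
                    } finally {
                        done.countDown();
                    }
                });
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return taskCount / seconds;
    }

    /** Runs the measurement once per candidate size, in the given order. */
    static Map<Integer, Double> sweep(int[] sizes, int taskCount, long computeMs, long waitMs) {
        Map<Integer, Double> results = new LinkedHashMap<>();
        for (int size : sizes) {
            results.put(size, throughput(size, taskCount, computeMs, waitMs));
        }
        return results;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int[] sizes = {cores, cores * 2, cores * 3};
        sweep(sizes, 500, 50, 50).forEach((size, tps) ->
                System.out.printf("pool=%d -> %.1f tasks/s%n", size, tps));
    }
}
```

A single run per size is noisy; repeating each measurement and discarding warm‑up runs gives more trustworthy numbers.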
Different Scenarios
The appropriate Tomcat maxThreads value differs between blocking and non‑blocking I/O.
Dubbo separates I/O threads from business threads; business threads are usually the bottleneck.
Redis 6+ introduces I/O threads, but command processing remains single‑threaded.
In summary, there is no universal thread‑pool size; you must set goals, test, and adjust based on the specific workload and environment.
For simple asynchronous tasks where performance is not critical, using a thread count equal to the number of CPU cores is a reasonable default.
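As a sketch of that default, the pool below is sized from Runtime.getRuntime().availableProcessors(), which returns the number of processors available to the JVM (logical cores, and possibly fewer inside a container with CPU limits). The class name DefaultPool is illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: a fixed pool with one thread per available processor. */
final class DefaultPool {

    static int defaultSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    static ExecutorService create() {
        return Executors.newFixedThreadPool(defaultSize());
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = create();
        try {
            Future<String> result = pool.submit(() -> "ran on " + Thread.currentThread().getName());
            System.out.println(result.get());
        } finally {
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }
}
```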
Top Architecture Tech Stack
Sharing Java and Python tech insights, with occasional practical development tool tips.