Why My 500/s Throughput Goal Fell Short: Debugging Spring Bean Creation and Thread Bottlenecks

A Java-based B2B (ToB) service fell well short of its target of 500 requests per second. Investigation uncovered several hidden bottlenecks: slow SQL, excessive prototype bean creation, logging overhead, and misconfigured thread pools. Each was identified and resolved in turn, raising throughput and reducing latency.

Programmer DD

Our B2B (ToB) system had never been load-tested, and a large new client required at least 500 requests per second per node. A rough initial calculation suggested that Tomcat with 100 threads could handle the load, but a load test with 100 concurrent users reached only 50 rps, with CPU usage near 80%.
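The gap between the estimate and the measurement can be made concrete with Little's Law (throughput = concurrency / average latency). The sketch below is only a back-of-the-envelope check: it assumes a closed-loop test with no think time, and the 200 ms figure is derived from the target rather than stated in the source. `CapacityEstimate` is a hypothetical helper name.

```java
// Back-of-the-envelope capacity check based on Little's Law.
// Assumption: every one of the concurrent users always has a request in flight.
final class CapacityEstimate {

    /** Requests per second sustained by `concurrency` in-flight requests. */
    static double throughputPerSecond(int concurrency, double avgLatencyMs) {
        return concurrency * 1000.0 / avgLatencyMs;
    }

    /** Average latency (ms) implied by an observed throughput at a given concurrency. */
    static double impliedLatencyMs(int concurrency, double observedRps) {
        return concurrency * 1000.0 / observedRps;
    }

    public static void main(String[] args) {
        // 100 busy threads reach 500 rps only if requests average 200 ms or less.
        System.out.println(throughputPerSecond(100, 200)); // 500.0
        // 50 rps at 100 concurrent users implies roughly 2000 ms per request.
        System.out.println(impliedLatencyMs(100, 50));     // 2000.0
    }
}
```

Read this way, the load test result says each request was taking about two seconds on average, which points at latency inside the request path rather than at the thread count.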

Analysis identified two major causes of latency: slow SQL statements that caused lock contention, and high-frequency logging. Monitoring rules were added to log a warning whenever an interface response took longer than 500 ms, a remote call longer than 200 ms, a Redis access longer than 10 ms, or a SQL statement longer than 100 ms.
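The warning thresholds above could be wired up roughly as follows. This is a minimal sketch, not the article's actual monitoring code: the class `SlowCallMonitor`, the `Category` names, and the use of `java.util.logging` are assumptions made for illustration; only the four millisecond limits come from the text.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical helper: wraps a call, measures it, and warns when it exceeds its limit.
final class SlowCallMonitor {

    // Thresholds taken from the text; the category names are illustrative.
    enum Category {
        INTERFACE(500), REMOTE_CALL(200), REDIS(10), SQL(100);

        final long thresholdMs;

        Category(long thresholdMs) {
            this.thresholdMs = thresholdMs;
        }
    }

    private static final Logger LOG = Logger.getLogger(SlowCallMonitor.class.getName());

    /** True when the elapsed time is strictly above the category's limit. */
    static boolean isSlow(Category category, long elapsedMs) {
        return elapsedMs > category.thresholdMs;
    }

    /** Runs the call, then logs a warning if it was slower than allowed. */
    static <T> T timed(Category category, String label, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            if (isSlow(category, elapsedMs)) {
                LOG.warning(category + " " + label + " took " + elapsedMs
                        + " ms (limit " + category.thresholdMs + " ms)");
            }
        }
    }
}
```

A call site would look like `SlowCallMonitor.timed(SlowCallMonitor.Category.SQL, "deductStock", () -> dao.deduct())`, where `dao.deduct()` stands in for whatever operation is being measured.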

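<!-- Essentially an inventory deduction that subtracts 1 each time. There are only a few type values, and the table holds just a handful of rows (one row per type). -->
<!-- During the load test, type = 1 can be treated as hard-coded. -->
update table set field = field - 1 where type = 1 and field > 1;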

After converting the slow SQL to asynchronous execution, the maximum response time dropped from 5 s to 2 s, but throughput remained far below the target.
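One way to take a hot-row update off the request thread is sketched below. The source does not say how the asynchronous execution was implemented, so everything here is an assumption: `AsyncDeduction` is a hypothetical class, the `Runnable` stands in for the real update statement, and the single writer thread is a design choice made for this sketch so that updates to the same row queue up in one place instead of contending for the row lock from many request threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: hand the deduction to a background writer and return at once.
final class AsyncDeduction {

    // Assumption: one writer thread, so deductions are applied one after another.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Placeholder for the code that actually runs the update statement.
    private final Runnable deductStatement;

    AsyncDeduction(Runnable deductStatement) {
        this.deductStatement = deductStatement;
    }

    /** Called on the request thread; does not wait for the database. */
    void deductLater() {
        writer.execute(deductStatement);
    }

    /** Stops accepting work and waits for queued deductions to finish. */
    boolean shutdownAndWait(long timeoutMs) {
        writer.shutdown();
        try {
            return writer.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

The trade-off is that the response is sent before the deduction is committed, so a failed update has to be detected and handled separately.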

Further log inspection revealed periodic pauses of several hundred milliseconds with no obvious blocking operation, likely caused by thread context switching, excessive logging, or stop-the-world (GC) pauses. Three changes followed: the high-frequency log statements were moved to DEBUG level to cut log output, the @Async thread pool was shrunk to a core size of 50, and the JVM heap was increased from 512 MB to 4 GB. Together these raised throughput to around 200 rps.
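A bounded pool of the kind described can be sketched with the plain JDK `ThreadPoolExecutor`, which is what Spring's `ThreadPoolTaskExecutor` wraps. Only the core size of 50 comes from the text; the maximum size, keep-alive time, queue capacity, and rejection policy below are assumptions chosen for illustration, and `AsyncPoolSketch` is a hypothetical name.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch of a bounded executor for asynchronous work.
final class AsyncPoolSketch {

    static ThreadPoolExecutor boundedPool() {
        return new ThreadPoolExecutor(
                50,                                   // core size, from the text
                50,                                   // max size (assumption: fixed-size pool)
                60, TimeUnit.SECONDS,                 // keep-alive (assumption)
                new ArrayBlockingQueue<>(1000),       // bounded queue (assumption)
                // Assumption: when the queue is full, the submitting thread runs the
                // task itself, which slows producers instead of dropping work.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

Keeping the pool small and the queue bounded limits the number of runnable threads competing for CPU, which is the effect the thread pool change was aiming for.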

CPU usage stayed high even after reducing thread count. Stack traces showed repeated calls to BeanUtils.getBean(RedisMaster.class), creating prototype‑scoped Redis beans for each request. This caused many invocations of Spring’s createBean method, introducing lock contention.

RedisTool redisTool = BeanUtils.getBean(RedisMaster.class);

Replacing prototype beans with direct new instances eliminated the costly bean creation path, and removing redundant timing wrappers (e.g., System.currentTimeMillis and Hutool StopWatch) reduced overhead. After these changes, throughput approached the 500 rps goal and the 95th percentile latency fell from 4 s to 1 s.
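The replacement described above amounts to the following change at the call site. This is a sketch under stated assumptions: `RedisTool` and `RedisMaster` are named in the article, but their bodies here are stand-ins, `StockService` is a hypothetical caller, and direct construction only works if the class needs nothing injected by the container.

```java
// Stand-in for the article's interface; the real one is not shown in the source.
interface RedisTool {
    String get(String key);
}

// Stand-in implementation: returns a canned value instead of calling Redis.
final class RedisMaster implements RedisTool {
    @Override
    public String get(String key) {
        return "stub:" + key;
    }
}

// Hypothetical caller showing the before/after at the call site.
final class StockService {

    String read(String key) {
        // Before (from the article), executed on every request:
        //   RedisTool redisTool = BeanUtils.getBean(RedisMaster.class);
        // After: a plain allocation that bypasses the container's createBean path.
        RedisTool redisTool = new RedisMaster();
        return redisTool.get(key);
    }
}
```

If the object holds no per-request state, a further option is to create it once and reuse it, which avoids even the allocation.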

Conclusion: the performance problems stemmed from slow database queries, excessive prototype bean creation, heavy logging, and misconfigured thread pools. Future work includes deeper profiling of System.currentTimeMillis usage, further JVM tuning, and systematic performance‑optimization practices.
