How I Boosted a Java Backend’s Throughput from 50/s to 500/s: A Real‑World Performance Debugging Journey

In this detailed case study, the author walks through diagnosing and fixing severe throughput bottlenecks in a Java Spring‑based B2B service, covering lock contention, slow SQL, excessive logging, thread‑pool tuning, JVM memory adjustments, and the impact of bean creation on performance, ultimately achieving nearly a ten‑fold increase in requests per second.

Su San Talks Tech

Background

The company’s B2B system had never been load‑tested because its usage was low, but a new key client demanded a minimum throughput of 500 requests per second for core interfaces. Initial estimates suggested that, with Tomcat’s 100 threads, each request only had to complete within about 200 ms (100 threads / 0.2 s = 500 req/s), which seemed easy.

However, when the load test started with 100 concurrent users, the actual throughput was only 50 req/s and CPU usage hovered around 80%.
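The gap between the estimate and the measurement can be checked with a quick calculation. The sketch below assumes a closed-loop load test with no think time, so throughput is roughly concurrency divided by average latency; the class and method names are illustrative and not from the original article.

```java
// Back-of-the-envelope capacity check: throughput ≈ concurrency / average latency.
final class ThroughputEstimate {

    /** Requests per second sustained by `concurrency` busy workers at the given average latency. */
    static double requestsPerSecond(int concurrency, double avgLatencySeconds) {
        return concurrency / avgLatencySeconds;
    }

    /** Average latency (seconds) implied by an observed throughput at the given concurrency. */
    static double impliedLatencySeconds(int concurrency, double observedRequestsPerSecond) {
        return concurrency / observedRequestsPerSecond;
    }

    public static void main(String[] args) {
        // Planned: 100 threads at 200 ms per request.
        System.out.println("planned req/s: " + requestsPerSecond(100, 0.2));
        // Measured: 100 concurrent users producing 50 req/s.
        System.out.println("implied latency (s): " + impliedLatencySeconds(100, 50.0));
    }
}
```

Under that assumption, 50 req/s at 100 concurrent users implies an average latency of about 2 s per request, roughly ten times the 200 ms budget.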

Analysis Process

Identify “slow” causes

The main suspects were locks (synchronized blocks, distributed locks, database locks) and time‑consuming operations (network calls, SQL). The team added instrumentation to log warnings when response times exceeded thresholds:

Interface response > 500 ms → log warning

Remote call > 200 ms → log warning

Redis access > 10 ms → log warning

SQL execution > 100 ms → log warning
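The article does not show how the thresholds above were wired in. One possible shape is a small timing wrapper around each call, sketched here in plain Java; `SlowCallMonitor`, `timed`, and the warning sink are hypothetical names, and a real service would more likely pass warnings to its logging framework.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Times a call and reports a warning when it exceeds the threshold for its category. */
final class SlowCallMonitor {
    // Where warnings go; assumed here to be any String consumer (for example, logger::warn).
    private final Consumer<String> warnSink;

    SlowCallMonitor(Consumer<String> warnSink) {
        this.warnSink = warnSink;
    }

    <T> T timed(String label, long thresholdMillis, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            // Measured in finally so that failing calls are timed as well.
            long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            if (elapsedMillis > thresholdMillis) {
                warnSink.accept(label + " took " + elapsedMillis
                        + " ms (threshold " + thresholdMillis + " ms)");
            }
        }
    }
}
```

A call site would then look like `monitor.timed("SQL execution", 100, () -> runQuery())`, where `runQuery` stands for whatever data-access call is being measured.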

Log analysis revealed a slow SQL statement that caused lock waiting and accounted for over 80% of the request latency.

update table set field = field - 1 where type = 1 and field > 1;

Switching this SQL to asynchronous execution reduced the maximum response time from 5 s to 2 s and the 95th‑percentile from 4 s to 1 s, roughly doubling throughput.
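The article does not say how the statement was made asynchronous. A minimal plain-JDK sketch, assuming the update can be deferred and that failures are reported through a callback, might look like this; `AsyncUpdate`, `submit`, and the failure handler are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Consumer;

/** Hands a slow write to a separate executor so the request thread does not wait on it. */
final class AsyncUpdate {
    // Assumed to be a small dedicated pool, so deferred writes cannot occupy request threads.
    private final ExecutorService writeExecutor;

    AsyncUpdate(ExecutorService writeExecutor) {
        this.writeExecutor = writeExecutor;
    }

    CompletableFuture<Void> submit(Runnable slowUpdate, Consumer<Throwable> onFailure) {
        return CompletableFuture.runAsync(slowUpdate, writeExecutor)
                .exceptionally(error -> {
                    // The caller has already moved on, so failures must be surfaced here.
                    // The throwable may arrive wrapped in a CompletionException.
                    onFailure.accept(error);
                    return null;
                });
    }
}
```

The trade-off is that the request returns before the row is updated, so this only fits writes that tolerate a short delay and whose failures can be handled out of band.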

Continue locating “slow” causes

Further log inspection showed long gaps between INFO lines, suggesting thread switches, excessive logging, or stop‑the‑world pauses.

2023-01-04 15:17:05: INFO ...
2023-01-04 15:17:05: INFO ...
2023-01-04 15:17:05: INFO ...

Investigation identified three actions:

Demote noisy INFO statements to DEBUG so they are no longer written at the production log level (small gain)

Replace @Async with a bounded thread pool (reduced active threads from 100 to ~50, improving throughput to ~200 req/s)

Increase JVM heap from 512 MB to 4 GB (GC frequency dropped, but throughput did not improve further)
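The second action can be sketched with the JDK's `ThreadPoolExecutor`. The queue capacity and rejection policy below are assumptions, since the article only mentions about 50 threads; in a Spring application the equivalent would typically be a `ThreadPoolTaskExecutor` bean used by `@Async`, which is not shown here.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPools {

    /**
     * Builds a pool with a fixed number of threads and a bounded queue.
     * When both are full, CallerRunsPolicy makes the submitting thread run the task,
     * which slows producers down instead of creating more threads.
     */
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,                 // core == max: the thread count never grows past this
                60L, TimeUnit.SECONDS,            // keep-alive applies only to non-core threads (none here)
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

For example, `BoundedPools.newBoundedPool(50, 200)` caps asynchronous work at 50 threads; the value 200 is purely illustrative and would need to be sized against real traffic.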

The team also discovered that many beans were defined with prototype scope and fetched via BeanUtils.getBean, causing repeated createBean calls and lock contention during high concurrency.

RedisTool redisTool = BeanUtils.getBean(RedisMaster.class);

Replacing prototype beans with direct new instances eliminated the costly bean creation path.
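The difference between the two paths can be illustrated with a simplified plain-Java model. This is not Spring's implementation: `LockedPrototypeLookup` is a hypothetical stand-in for a container lookup that builds a new instance per call behind a shared lock, and `CacheHelper` stands in for a lightweight helper like the article's `RedisTool`.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical lightweight helper with no container-managed dependencies. */
final class CacheHelper {
    String keyFor(String id) {
        return "cache:" + id;
    }
}

/** Simplified model of a per-call lookup that creates instances under a shared lock. */
final class LockedPrototypeLookup<T> {
    private final Object sharedLock = new Object();
    private final Supplier<T> constructor;
    final AtomicLong lockedCreations = new AtomicLong();

    LockedPrototypeLookup(Supplier<T> constructor) {
        this.constructor = constructor;
    }

    T getBean() {
        synchronized (sharedLock) { // every concurrent caller serializes here
            lockedCreations.incrementAndGet();
            return constructor.get();
        }
    }
}

final class HotPath {
    /** Before: each request goes through the shared lookup. */
    static String viaLookup(LockedPrototypeLookup<CacheHelper> lookup, String id) {
        return lookup.getBean().keyFor(id);
    }

    /** After: each request constructs the helper directly, with no shared lock. */
    static String viaNew(String id) {
        return new CacheHelper().keyFor(id);
    }
}
```

Direct construction only works when the helper needs nothing injected by the container; a bean with managed dependencies would need a different approach.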

Conclusion

The systematic investigation—starting from slow SQL, moving to logging overhead, thread‑pool configuration, JVM memory, and finally prototype bean creation—raised the single‑node throughput from 50 req/s to nearly 500 req/s, highlighting the importance of precise instrumentation, understanding of Spring internals, and careful resource tuning.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
