Performance Tuning of a Java Backend Service: From 50/s to 500/s Through Profiling, Thread‑Pool, and SQL Optimization
The article details a step‑by‑step investigation and optimization of a Java backend service that initially delivered only 50 requests per second under load, covering profiling, slow‑SQL fixes, thread‑pool tuning, JVM memory adjustments, and Spring bean creation overhead to approach the target 500 req/s.
Background : The author describes a B2B system that originally had no load testing and was required by a major client to achieve at least 500 requests per second on a single node.
Initial expectations : With Tomcat configured for 100 threads, 500 req/s leaves a budget of about 200 ms per request, which seemed feasible given typical response times of around 100 ms.
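The arithmetic behind that expectation can be sketched as follows. This is only a back-of-the-envelope estimate based on Little's law (throughput = concurrency / latency); the class and method names are illustrative and do not come from the original service.

```java
// Back-of-the-envelope capacity estimate using Little's law.
final class CapacityEstimate {
    /** Latency budget per request (ms) if every worker thread stays busy. */
    static double latencyBudgetMillis(int workerThreads, double targetRequestsPerSecond) {
        return workerThreads * 1000.0 / targetRequestsPerSecond;
    }

    /** Theoretical ceiling (req/s) for a given thread count and average latency. */
    static double maxThroughput(int workerThreads, double avgLatencyMillis) {
        return workerThreads * 1000.0 / avgLatencyMillis;
    }

    public static void main(String[] args) {
        // 100 Tomcat threads and a 500 req/s target -> 200 ms per request.
        System.out.println(latencyBudgetMillis(100, 500));
        // 100 threads at a typical 100 ms per request -> 1000 req/s ceiling.
        System.out.println(maxThroughput(100, 100));
    }
}
```

The estimate assumes requests never queue or block on shared resources, which is the assumption the load test went on to disprove.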
Load test results : At 100 concurrent users the throughput was only 50 req/s and CPU usage reached ~80 %.
Observations from the report : Minimum latency <100 ms, maximum latency up to 5 s, and most requests around 4 s, indicating severe bottlenecks.
First investigation : Identify blocking points such as locks, distributed locks, database locks, and time‑consuming operations (network, SQL). Add instrumentation to log warnings when response time exceeds thresholds (500 ms for API, 200 ms for remote calls, 10 ms for Redis, 100 ms for SQL).
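A minimal sketch of this kind of instrumentation is shown here, assuming a plain wrapper around each call. The thresholds are the ones listed above; the class name, the method names, and the use of java.util.logging are assumptions, since the article does not show the author's code.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

final class SlowCallMonitor {
    private static final Logger LOG = Logger.getLogger(SlowCallMonitor.class.getName());

    // Warning thresholds in milliseconds, as listed in the text.
    static final long API_MS = 500;
    static final long REMOTE_MS = 200;
    static final long REDIS_MS = 10;
    static final long SQL_MS = 100;

    /** True when the elapsed time is strictly above the threshold. */
    static boolean isSlow(long elapsedNanos, long thresholdMillis) {
        return elapsedNanos > thresholdMillis * 1_000_000L;
    }

    /** Runs the call, logs a warning if it was slow, and returns the call's result. */
    static <T> T timed(String label, long thresholdMillis, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            if (isSlow(elapsed, thresholdMillis)) {
                LOG.warning(label + " took " + elapsed / 1_000_000L
                        + " ms (threshold " + thresholdMillis + " ms)");
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a SQL call.
        String row = timed("sql:loadOrder", SQL_MS, () -> "order-1");
        System.out.println(row);
    }
}
```

Logging only when a threshold is exceeded keeps the instrumentation quiet on the fast path, so the measurement itself adds little noise to the load test.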
Result : Slow SQL was discovered. Example SQL:
update table set field = field - 1 where type = 1 and field > 1;
This statement caused lock contention and accounted for more than 80 % of total latency. The author moved the update to asynchronous execution, which roughly doubled performance.
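One way to take such an update off the request thread is sketched below. The article only says the operation was made asynchronous; the single-threaded writer, the class name, and the Runnable standing in for the JDBC update are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class AsyncCounterUpdate {
    // Assumption: one writer thread, so updates to the hot row queue up here
    // instead of many request threads competing for the same row lock.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final Runnable update; // hypothetical stand-in for the SQL update

    AsyncCounterUpdate(Runnable update) {
        this.update = update;
    }

    /** Queues the update and returns immediately; failures are reported, not swallowed. */
    CompletableFuture<Void> submit() {
        return CompletableFuture.runAsync(update, writer)
                .whenComplete((ignored, err) -> {
                    if (err != null) {
                        System.err.println("async update failed: " + err);
                    }
                });
    }

    void shutdown() {
        writer.shutdown();
    }

    public static void main(String[] args) {
        AtomicInteger field = new AtomicInteger(10); // in-memory stand-in for the column
        AsyncCounterUpdate updater = new AsyncCounterUpdate(field::decrementAndGet);
        updater.submit().join(); // join only so the demo can print the result
        System.out.println(field.get());
        updater.shutdown();
    }
}
```

The trade-off is that the caller no longer sees the update's outcome within the request, so this only fits operations whose result the response does not depend on.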
Second investigation : Log analysis revealed intermittent pauses of about 100 ms during which no obvious work was being done, possibly caused by thread context switching, excessive logging, or stop‑the‑world (STW) pauses. Changing the log level to DEBUG gave only a small gain.
Thread‑pool tuning : The code used the @Async annotation with three thread pools, each with a core size of 100. Capping the total number of core threads across the pools at 50 increased throughput to about 200 req/s.
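A bounded pool of the kind described can be sketched with plain java.util.concurrent. The 20/20/10 split of the 50-thread budget, the queue sizes, and the rejection policy are hypothetical, since the article does not say how the threads were divided; in a Spring application the equivalent settings would normally be applied to the executor backing @Async.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPools {
    /** Fixed-size pool with a bounded queue; when full, the submitting thread runs the task. */
    static ThreadPoolExecutor newPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) {
        // Hypothetical split of a 50-thread budget across three pools.
        ThreadPoolExecutor orders = newPool(20, 200);
        ThreadPoolExecutor notifications = newPool(20, 200);
        ThreadPoolExecutor audit = newPool(10, 100);

        int total = orders.getCorePoolSize()
                + notifications.getCorePoolSize()
                + audit.getCorePoolSize();
        System.out.println("total core threads: " + total);

        orders.shutdown();
        notifications.shutdown();
        audit.shutdown();
    }
}
```

The bounded queue together with CallerRunsPolicy applies back-pressure instead of letting work pile up without limit, which is one reasonable choice rather than the only one.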
JVM tuning : GC logs showed young GC (YGC) running about 4 times per second with no Full GC; increasing the heap from 512 MB to 4 GB halved the YGC rate to about 2 per second but did not significantly improve throughput.
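The collection rate can also be sampled from inside the JVM rather than read from GC logs. The sketch below uses the standard GarbageCollectorMXBean API; the class name is illustrative, and the count covers all collectors, not only the young generation, so it is an approximation of the YGC figure quoted above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcRate {
    /** Total collections reported so far by all collectors in this JVM. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Collections per second between two samples. */
    static double collectionsPerSecond(long before, long after, long elapsedMillis) {
        return (after - before) * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long before = totalCollections();
        long start = System.nanoTime();
        Thread.sleep(1000);
        long elapsedMillis = Math.max(1, (System.nanoTime() - start) / 1_000_000L);
        System.out.println(collectionsPerSecond(before, totalCollections(), elapsedMillis) + " GC/s");
    }
}
```

Running it before and after a heap change (for example with -Xms4g -Xmx4g) gives a quick comparison of collection frequency under the same load.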
CPU usage remained high after cutting thread count, prompting deeper analysis.
Further investigation : Stack traces showed frequent calls to BeanUtils.getBean, which went through Spring’s createBean for prototype‑scoped beans (e.g., RedisMaster). Each request created many bean instances, adding considerable overhead.
Fix : Replace prototype bean retrieval with direct new Redis() instantiation, eliminating the costly bean creation path.
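The difference between the two paths can be illustrated with a toy container. This is not Spring's implementation: ToyContainer, RedisStub, and the post-processing hooks are hypothetical stand-ins, meant only to show that a prototype lookup does reflective construction plus per-instance processing on every call, while a direct constructor call skips all of it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the Redis client class mentioned in the text.
final class RedisStub {
    RedisStub() {
    }
}

// Toy container: each prototype lookup builds a fresh instance reflectively
// and passes it through every post-processing hook.
final class ToyContainer {
    final AtomicLong created = new AtomicLong();
    private final List<UnaryOperator<Object>> postProcessors;

    ToyContainer(List<UnaryOperator<Object>> postProcessors) {
        this.postProcessors = postProcessors;
    }

    <T> T getPrototype(Class<T> type) {
        try {
            Object bean = type.getDeclaredConstructor().newInstance();
            for (UnaryOperator<Object> processor : postProcessors) {
                bean = processor.apply(bean);
            }
            created.incrementAndGet();
            return type.cast(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create " + type.getName(), e);
        }
    }
}

final class PrototypeDemo {
    public static void main(String[] args) {
        UnaryOperator<Object> noop = bean -> bean;
        ToyContainer container = new ToyContainer(Arrays.asList(noop, noop));

        RedisStub viaContainer = container.getPrototype(RedisStub.class); // container path
        RedisStub direct = new RedisStub();                               // direct path

        System.out.println("container-created instances: " + container.created.get());
        System.out.println(viaContainer != direct);
    }
}
```

In a real container the per-lookup work is considerably heavier than in this toy, which is consistent with the overhead the stack traces pointed to.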
Additional notes : Using System.currentTimeMillis for timing and Hutool’s StopWatch adds measurable overhead under high concurrency.
Summary of actions : Adjust MySQL buffer pool, redo log, change buffer; refactor code to async execution, tune thread‑pool and Tomcat settings; reconfigure Druid connection pool; increase JVM memory and consider different GC algorithms.
Final outcome : Through a series of trial‑and‑error steps over four days, throughput improved from 50 req/s to nearly 200 req/s, and response time dropped from 5 s to about 1 s.
Open questions : Why does createBean have such a large performance impact? When is prototype scope appropriate?
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Java Architect Essentials
