Performance Tuning of a Spring Boot Backend: Identifying and Resolving Throughput Bottlenecks
The article details a step‑by‑step investigation of a Spring Boot service that failed to meet a 500 requests/second target, analyzes slow SQL, thread‑pool misconfiguration, excessive logging, and high CPU usage, and presents concrete optimizations that roughly doubled throughput.
The author, a senior architect, describes a ToB system that originally had no load testing but suddenly needed to handle at least 500 requests/s for a key interface. Initial calculations suggested 100 threads would be enough, yet a 100‑concurrency test only achieved 50 req/s with CPU usage near 80%.
Background
The service exhibited a minimum response time under 100 ms in single‑thread scenarios, but under load the maximum latency reached 5 seconds and most requests clustered around 4 seconds, far from the desired performance.
Analysis Process – Locating the "slow" causes
Key investigation points included locks (synchronization, distributed, DB), and time‑consuming operations (network calls, SQL). The team added instrumentation to log warnings when response times exceeded thresholds:
Interface response > 500 ms
Remote call > 200 ms
Redis access > 10 ms
SQL execution > 100 ms
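The threshold logging above can be sketched as a small wrapper around each operation; the helper name, the single hard-coded threshold, and the use of `System.out` for the warning are illustrative assumptions, not the article's actual code:

```java
import java.util.function.Supplier;

public class SlowCallLogger {
    // One of the thresholds listed above (SQL execution > 100 ms).
    static final long SQL_THRESHOLD_MS = 100;

    // Runs the operation and logs a warning when it exceeds the given threshold.
    static <T> T timed(String label, long thresholdMs, Supplier<T> op) {
        long start = System.currentTimeMillis();
        try {
            return op.get();
        } finally {
            long elapsed = System.currentTimeMillis() - start;
            if (elapsed > thresholdMs) {
                System.out.printf("WARN: %s took %d ms (threshold %d ms)%n",
                        label, elapsed, thresholdMs);
            }
        }
    }

    public static void main(String[] args) {
        // Wrap a call site; slow executions produce a WARN line in the log.
        String result = timed("sql", SQL_THRESHOLD_MS, () -> "ok");
        System.out.println(result);
    }
}
```

In the real service this kind of wrapper would typically sit in an AOP interceptor or filter so call sites stay untouched.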
Log analysis revealed a slow SQL statement that updated a single‑row inventory table, causing lock contention and accounting for over 80% of the request latency.
Example slow SQL:
update table set field = field - 1 where type = 1 and field > 1;
After converting the operation to asynchronous execution, latency improved but throughput still fell short of the goal.
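The article moved this contended update off the request path (via Spring's @Async). A minimal plain-Java analogue using an executor, where the in-memory counter is a hypothetical stand-in for the single-row inventory update:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncInventoryUpdate {
    // Hypothetical stand-in for the contended single-row inventory counter.
    static final AtomicInteger stock = new AtomicInteger(100);

    // Daemon threads so this sketch exits without an explicit shutdown.
    static final ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    // The request thread submits the decrement and returns immediately;
    // the contended write happens off the request path.
    static Future<Integer> decrementAsync() {
        return pool.submit(stock::decrementAndGet);
    }

    // Blocking helper, used here only to observe the result.
    static int awaitDecrement() {
        try {
            return decrementAsync().get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("remaining: " + awaitDecrement());
    }
}
```

Note that going asynchronous hides the latency from the caller but does not remove the row-lock contention itself; the updates still serialize on the hot row.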
Further Investigation – Thread switching, logging overhead, and STW
Additional logs showed intermittent 100 ms gaps with no obvious work in between, suggesting thread context switches, excessive logging, or stop-the-world pauses. The team adjusted the log level to cut logging output (a minor gain) and re-configured the @Async thread pools, capping core threads at 50, which raised throughput to around 200 req/s.
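A sketch of the bounded executor configuration this tuning implies. Only the core-thread count of 50 comes from the article; the max size, queue capacity, keep-alive, and rejection policy below are illustrative assumptions:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AsyncPoolConfig {
    static ThreadPoolExecutor buildPool() {
        return new ThreadPoolExecutor(
                50,                                  // core threads, per the tuning above
                50,                                  // cap max at core to bound context switching
                60L, TimeUnit.SECONDS,               // idle keep-alive (assumed)
                new LinkedBlockingQueue<>(1000),     // bounded queue (assumed)
                new ThreadPoolExecutor.CallerRunsPolicy()); // backpressure when saturated (assumed)
    }
}
```

In a Spring Boot service this pool would normally be exposed as the executor bean backing @Async rather than constructed directly.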
JVM GC analysis showed frequent Young GC (≈4 times/s) with a 512 MB heap; increasing heap to 4 GB reduced GC frequency but did not further improve throughput.
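The heap change can be expressed as launch flags. Only the 4 GB size comes from the article; setting -Xms equal to -Xmx and the GC logging flag (JDK 9+ unified logging syntax) are illustrative additions:

```shell
# Heap raised from 512 MB to 4 GB, as described above; GC logging enabled for observation.
java -Xms4g -Xmx4g -Xlog:gc -jar app.jar
```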
CPU Utilization Diagnosis
Despite reducing thread count, CPU usage remained high. Investigation focused on two possibilities: hidden extra threads and CPU‑intensive code. Monitoring revealed many threads each using ~10% CPU, but no single hotspot.
Stack traces exposed frequent calls to BeanUtils.getBean(...), which internally invoked createBean, a heavyweight operation involving bean initialization, dependency injection, and proxy creation. Approximately 200 such calls occurred during request processing because the Redis utility beans were defined with prototype scope, so every call created a new bean (and thus a new Redis client).
RedisTool redisTool = BeanUtils.getBean(RedisMaster.class);
Switching to direct new instantiation eliminated the prototype overhead.
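A minimal sketch of the fix: hold one long-lived instance instead of resolving a prototype-scoped bean on every call. RedisTool here is a hypothetical stand-in for the article's Redis utility class:

```java
public class RedisToolHolder {
    // Hypothetical stand-in for the article's prototype-scoped Redis utility.
    static class RedisTool {
        String get(String key) { return "value:" + key; }
    }

    // Created once and reused, avoiding createBean() work on every request.
    private static final RedisTool REDIS_TOOL = new RedisTool();

    public static RedisTool instance() { return REDIS_TOOL; }

    public static void main(String[] args) {
        System.out.println(instance().get("stock"));
    }
}
```

The idiomatic Spring alternative is simply to make the utility a singleton-scoped bean (the default scope) and inject it once, rather than bypassing the container.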
Timing Measurement Overhead
The code also used manual System.currentTimeMillis() and Hutool StopWatch for timing, which added measurable overhead under high concurrency.
long start = System.currentTimeMillis();
// ...
long end = System.currentTimeMillis();
long runTime = end - start;
Final Results
After addressing slow SQL, reducing prototype bean usage, tuning thread pools, and increasing JVM memory, the maximum latency dropped from 5 s to 2 s, the 95th percentile from 4 s to 1 s, and overall throughput roughly doubled, though still short of the original 500 req/s target.
Summary of Optimizations
MySQL: tuned buffer pool, change buffer, redo log.
Code: async execution, thread‑pool size control, Tomcat thread configuration, Druid pool tuning.
JVM: increased heap size, adjusted GC settings.
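The Tomcat and Druid tuning mentioned above would typically live in application.properties; the property names assume Spring Boot 2.3+ with the Druid starter, and the values are illustrative assumptions since the article does not give the final numbers:

```properties
# Illustrative values; the article does not specify the final settings.
# Tomcat worker threads and accept backlog:
server.tomcat.threads.max=200
server.tomcat.accept-count=100
# Druid connection pool sizing:
spring.datasource.druid.initial-size=10
spring.datasource.druid.max-active=50
```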
The author emphasizes the need for systematic performance knowledge and a clear troubleshooting methodology rather than ad‑hoc “try everything” approaches.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large-scale distributed, and high-availability architectures, as well as evolving existing architectures with internet-scale technologies. Idea-driven, sharing-oriented architects are welcome to exchange and learn together.