High‑Concurrency Performance Tuning of a Java SSM E‑commerce Project: Diagnosis, Optimization, and Results
This article details a complete end‑to‑end high‑concurrency tuning process for a Java SSM monolithic e‑commerce system, covering problem identification, root‑cause analysis, a series of JVM, Tomcat, Redis and JDBC optimizations, horizontal scaling, code refactoring, and the resulting stability improvements.
Tuning a production system demands hands-on skill: pinpointing the problem precisely, analyzing it thoroughly, and applying a fix that actually works.
The case study focuses on a single‑module SSM‑based e‑commerce project that includes a flash‑sale (秒杀) module and runs on a deployment consisting of an F5 load balancer, three application servers, a timer server, a Redis server, an image server, and a MySQL master‑slave cluster on Microsoft Cloud.
Project Overview
The project follows a traditional monolithic SSM architecture with no separation between front end and back end. This is the earliest stage in the usual progression from monolithic through vertical and SOA to micro-service architectures.
Identified Issues
1. CPU spikes during flash‑sale
Three daily flash‑sale windows cause CPU and memory to surge, with a single application server handling over 3,000 requests.
2. High CPU on a single application server
Monitoring shows abnormal CPU usage on one server.
3. Excessive request counts
Request numbers on the application server are unusually high.
4. Redis connections ~600
Redis client count reaches around 600, leading to connection pool exhaustion.
5. MySQL request overload
MySQL experiences heavy load during the flash‑sale period.
Investigation Process and Analysis
(1) Investigation Scope
Application servers: memory, CPU, request count.
Image server: memory, CPU, request count.
Timer server: memory, CPU, request count.
Redis server: memory, CPU, connection count.
Database server: memory, CPU, connection count.
(2) Findings
Within the first 30 minutes of a flash sale, CPU and memory usage on the application servers spike because of the request volume (over 3,000 requests per server).
Redis requests time out, JDBC connections time out, and Full GC occurs 152 times in 24 hours, indicating memory pressure from large objects.
Thread dumps reveal blocked threads, deadlocks, and more than 2,000 threads requesting invalid resources.
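The article does not say which tools produced these numbers. As a rough illustration, the same kinds of signals (GC counts, deadlocked threads, live thread count) can be read from inside a running JVM through the standard java.lang.management API. The class name JvmHealthSnapshot and its method names are hypothetical and chosen only for this sketch.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class JvmHealthSnapshot {

    /** Sums the collection counts reported by every collector in this JVM. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
            System.out.println(gc.getName() + ": count=" + count
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        return total;
    }

    /** Number of threads currently deadlocked on monitors or ownable synchronizers. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock exists
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("total GC count: " + totalGcCount());
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
        System.out.println("live threads: "
                + ManagementFactory.getThreadMXBean().getThreadCount());
    }
}
```

For scale, 152 Full GCs in 24 hours averages out to roughly one every nine and a half minutes. Sampling counters like these at intervals makes it easier to see whether the collections cluster around the flash-sale windows.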
(3) Root Causes
Request surge during flash‑sale overloads application servers.
Redis connection pool exhaustion.
JDBC connection pool exhaustion.
Large object allocations causing frequent Full GC.
Improper Tomcat, JVM, Jedis, and JDBC parameters.
No traffic shaping or rate limiting.
Resources (Redis, JDBC) not released promptly.
Final Solutions
1. Horizontal scaling and traffic shaping
Added more application servers to spread the load. Because no message queue (MQ) was available to buffer requests, traffic was distributed by the hardware load balancer alone.
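The root-cause list names missing rate limiting, and the fix described here relies on hardware rather than application code. Purely as an illustration of what application-level rate limiting could look like, here is a minimal token-bucket sketch. It is not the article's implementation; the class TokenBucket, its parameters, and the numbers in main are all assumptions.

```java
import java.util.function.LongSupplier;

public class TokenBucket {
    private final long capacity;          // maximum burst size
    private final double refillPerNano;   // tokens added per nanosecond
    private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime
    private double tokens;
    private long lastRefill;

    public TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity;           // start full so an initial burst is admitted
        this.lastRefill = nanoClock.getAsLong();
    }

    /** Admits the request if a token is available; otherwise rejects it immediately. */
    public synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Refill in proportion to elapsed time, never exceeding capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical limits: bursts of up to 5, refilled at 5 tokens per second.
        TokenBucket bucket = new TokenBucket(5, 5.0, System::nanoTime);
        int admitted = 0;
        for (int i = 0; i < 20; i++) {
            if (bucket.tryAcquire()) {
                admitted++;
            }
        }
        System.out.println("admitted " + admitted + " of 20 burst requests");
    }
}
```

A limiter like this would typically sit in a servlet filter or interceptor in front of the flash-sale endpoints, so rejected requests fail fast instead of occupying a Tomcat thread, a Redis connection, and a JDBC connection.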
2. JVM parameter optimization
JAVA_OPTS="-server -Xmx9g -Xms9g -Xmn3g -Xss500k -XX:+DisableExplicitGC -XX:MetaspaceSize=2048m -XX:MaxMetaspaceSize=2048m -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:LargePageSizeInBytes=128m -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -Dfile.encoding=UTF8 -Duser.timezone=GMT+08"
The next article will discuss the theoretical basis of these JVM settings.
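To make the heap settings above concrete, the arithmetic they imply can be worked through in a few lines. With -Xmx9g and -Xmn3g the old generation is the remaining 6 GiB, and with -XX:CMSInitiatingOccupancyFraction=70 plus -XX:+UseCMSInitiatingOccupancyOnly a CMS cycle begins once the old generation is about 70% occupied. The class HeapLayout and its method names are hypothetical helpers for this sketch.

```java
public class HeapLayout {
    static final long GIB = 1024L * 1024 * 1024;

    /** Old generation = total heap minus the young generation fixed by -Xmn. */
    static long oldGenBytes(long maxHeapBytes, long youngGenBytes) {
        return maxHeapBytes - youngGenBytes;
    }

    /** Old-gen occupancy at which CMS starts a concurrent cycle. */
    static long cmsTriggerBytes(long oldGenBytes, int occupancyPercent) {
        return oldGenBytes * occupancyPercent / 100;
    }

    public static void main(String[] args) {
        long heap = 9 * GIB;   // -Xmx9g / -Xms9g
        long young = 3 * GIB;  // -Xmn3g
        long old = oldGenBytes(heap, young);
        long trigger = cmsTriggerBytes(old, 70); // -XX:CMSInitiatingOccupancyFraction=70

        System.out.println("old generation (GiB): " + old / (double) GIB);
        System.out.println("CMS trigger (GiB):    " + trigger / (double) GIB);
        // What the JVM running this sketch was actually given, for comparison.
        System.out.println("this JVM max heap (MiB): "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

In other words, CMS starts collecting at roughly 4.2 GiB of old-generation occupancy, leaving about 1.8 GiB of headroom for objects promoted while the concurrent cycle runs. Metaspace (2048m) and thread stacks (-Xss500k per thread) are allocated outside this 9 GiB heap.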
3. Tomcat concurrency tuning
Switched from BIO to NIO2 protocol and adjusted thread pool and connection parameters according to server capacity and traffic patterns.
4. Redis and JDBC tuning
Configuration details omitted for security reasons.
5. Code refactoring
Removed large object allocations.
Ensured timely release of objects and connection resources.
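The article does not show the refactored code. A common way to guarantee prompt release in Java is try-with-resources, which returns the resource even when the work inside throws. The sketch below models a bounded pool with a semaphore to show the pattern; BoundedPool, Lease, and the timeout value are hypothetical stand-ins, not Jedis or JDBC classes.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    private final Semaphore permits;

    public BoundedPool(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** A borrowed connection slot; close() hands the slot back to the pool. */
    public final class Lease implements AutoCloseable {
        private boolean released; // a lease is assumed to be used by a single thread

        @Override
        public void close() {
            if (!released) {      // idempotent: a second close() is a no-op
                released = true;
                permits.release();
            }
        }
    }

    /** Waits up to timeoutMillis for a free slot, then fails instead of blocking forever. */
    public Lease borrow(long timeoutMillis) {
        try {
            if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("pool exhausted");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
        return new Lease();
    }

    public int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        BoundedPool pool = new BoundedPool(2);
        for (int i = 0; i < 5; i++) {
            try (Lease lease = pool.borrow(100)) {
                // Use the connection here; the slot is returned even if this body throws.
            }
        }
        System.out.println("available after 5 uses: " + pool.available());
    }
}
```

The same shape applies to real clients: java.sql.Connection implements AutoCloseable and Jedis implements Closeable, so a pooled instance declared in a try-with-resources header is handed back to its pool when the block exits.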
6. Fixing invalid resource requests
Increased the static-resource cache size in conf/context.xml (cacheMaxSize is specified in kilobytes, so 102400 is 100 MB):
<Resources cachingAllowed="true" cacheMaxSize="102400" />
Optimization Results
After several days of observation, the system remained stable with normal CPU, memory, and GC metrics.
1. Basic monitoring (screenshot not reproduced)
2. GC behavior (screenshot not reproduced)
3. CPU and memory sampling (CPU and memory charts not reproduced)
Conclusion
The article demonstrates a full‑cycle high‑concurrency tuning process, from problem identification to solution implementation and post‑tuning observation. It also notes remaining architectural risks such as tight front‑back coupling, lack of service isolation for flash‑sale features, absence of traffic shaping mechanisms, and missing Redis high‑availability clustering.