How to Diagnose and Optimize Business System Performance After Launch
This article outlines a comprehensive process for analyzing, diagnosing, and optimizing performance issues in production business systems, covering hardware, OS, database, middleware, JVM tuning, and APM monitoring to identify root causes and implement effective solutions.
System Performance Issue Analysis Process
When a business system that performed well before launch suddenly experiences severe performance problems, the likely causes include high concurrent traffic, growing database size, or changes in critical environment factors such as network bandwidth.
First, determine whether the issue occurs under single‑user conditions or only under concurrency. Single‑user problems are usually easier to test and often stem from code or SQL inefficiencies, while concurrent issues require stress testing to pinpoint bottlenecks in the database or middleware.
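One way to make this first check concrete is to time the same request with one worker and then with several. The sketch below is a minimal illustration, not a substitute for a real load-testing tool. `fake_request` and its lock are hypothetical stand-ins for a call that contends on a shared resource.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def measure_latencies(request_fn, total_requests, workers):
    """Call request_fn total_requests times on `workers` threads.

    Returns the latency of each call in seconds.
    """
    def timed_call(_):
        start = time.perf_counter()
        request_fn()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(timed_call, range(total_requests)))


def compare_single_vs_concurrent(request_fn, total_requests=20, workers=10):
    """Average latency with one worker versus many workers."""
    single = measure_latencies(request_fn, total_requests, workers=1)
    concurrent = measure_latencies(request_fn, total_requests, workers=workers)
    return {"single_avg_s": mean(single), "concurrent_avg_s": mean(concurrent)}


# Hypothetical request: the lock models a shared resource (for example a
# single database connection) that lets only one caller through at a time.
_shared_resource = threading.Lock()


def fake_request():
    with _shared_resource:
        time.sleep(0.005)


if __name__ == "__main__":
    print(compare_single_vs_concurrent(fake_request))
```

If the single-worker average is already slow, the code or SQL behind the request is the first suspect. If only the concurrent average degrades, the bottleneck is more likely contention in the database or middleware.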
During load testing, monitor CPU, memory, and JVM metrics such as heap usage and garbage-collection activity, so that problems like memory leaks, which may manifest only under load, become visible.
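A leak usually shows up as heap usage that keeps rising even after garbage collection. As a rough sketch, assuming you have collected post-GC heap readings in MB at regular intervals during the test, a least-squares slope gives the growth per sample. The function names and the 1 MB threshold are illustrative assumptions, not established limits.

```python
def growth_per_sample(heap_after_gc_mb):
    """Least-squares slope of post-GC heap usage, in MB per sample."""
    n = len(heap_after_gc_mb)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(heap_after_gc_mb) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in enumerate(heap_after_gc_mb)
    )
    variance = sum((x - mean_x) ** 2 for x in range(n))
    return covariance / variance


def looks_like_leak(heap_after_gc_mb, min_growth_mb=1.0):
    """Flag a steady upward trend; the threshold is an assumed default."""
    return growth_per_sample(heap_after_gc_mb) >= min_growth_mb
```

A positive slope alone is not proof of a leak, since caches warming up produce the same pattern early in a test. It is a signal to take a heap dump and look closer.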
Factors Influencing Performance
Performance is affected by three main aspects: hardware environment, software runtime environment, and the application code itself.
Hardware Environment
Hardware includes compute, storage, and network resources. Server CPU capability is often expressed in tpmC (transactions per minute under the TPC-C benchmark), but real-world performance varies with the workload. Storage I/O is a common bottleneck: what looks like high CPU or memory pressure may actually be processes waiting on slow disk I/O.
Linux provides tools such as iostat, ps, sar, top, and vmstat for monitoring CPU, memory, and disk I/O. The JVM itself is better observed with JDK tools such as jstat, jmap, and jstack.
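To illustrate how I/O wait can be told apart from real CPU load, the sketch below parses one line of vmstat output by its column header and applies a simple rule. The 20% and 80% thresholds are assumptions for illustration only, and the sample values are made up.

```python
VMSTAT_HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa st"


def parse_vmstat_line(value_line, header_line=VMSTAT_HEADER):
    """Map vmstat column names to the integer values of one sample line."""
    names = header_line.split()
    values = [int(v) for v in value_line.split()]
    if len(names) != len(values):
        raise ValueError("header and values have different column counts")
    return dict(zip(names, values))


def likely_bottleneck(sample, iowait_threshold=20, busy_threshold=80):
    """Classify a sample; both thresholds are illustrative assumptions."""
    if sample["wa"] >= iowait_threshold:
        return "disk I/O"  # CPUs are mostly waiting on storage
    if sample["us"] + sample["sy"] >= busy_threshold:
        return "CPU"       # CPUs are busy with user or kernel work
    return "none obvious"


# Made-up sample: 45% of CPU time is spent waiting on I/O.
sample = parse_vmstat_line("2 1 0 812340 10240 402112 0 0 5120 300 900 1500 10 5 40 45 0")
print(likely_bottleneck(sample))
```

The column layout can differ between vmstat versions, so taking the header line from the actual output is safer than relying on the default used here.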
Runtime Environment – Database and Middleware
Database Performance Tuning
For Oracle, performance depends on the host system, the database configuration, and the network. Optimization targets disk I/O, rollback segments, redo logs, the SGA, and database objects. Continuous monitoring is essential, for example by setting TIMED_STATISTICS=TRUE and reviewing session statistics.
Application Middleware Tuning
Middleware such as WebLogic or Tomcat requires configuration tuning and JVM parameter optimization. Key JVM settings include -Xmx (maximum heap), -Xms (initial heap), -XX:NewSize and -XX:MaxNewSize (young generation), -XX:PermSize and -XX:MaxPermSize (permanent generation), and -Xss (thread stack size). Commonly recommended sizes, expressed relative to old-generation usage after a Full GC, are:
Heap size (Xmx/Xms) ≈ 3‑4 × old‑generation usage after Full GC
Metaspace ≈ 1.2‑1.5 × old‑generation usage
Young generation (Xmn) ≈ 1‑1.5 × old‑generation usage
Old generation ≈ 2‑3 × its usage
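Note that Java 8 and later replace the permanent generation with Metaspace, sized with -XX:MetaspaceSize and -XX:MaxMetaspaceSize instead of -XX:PermSize and -XX:MaxPermSize. The choice of garbage collector also affects how these sizes should be set.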
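The ratios above can be turned into a quick calculation. The sketch below assumes you have measured old-generation usage after a Full GC (for example from GC logs) and simply multiplies it by the ranges given in the text. The function name and dictionary keys are hypothetical.

```python
# (low, high) multipliers of old-generation usage after a Full GC,
# taken from the ratios listed in the article.
SIZING_RATIOS = {
    "heap": (3.0, 4.0),              # -Xms / -Xmx
    "permgen_or_metaspace": (1.2, 1.5),
    "young_generation": (1.0, 1.5),  # -Xmn
    "old_generation": (2.0, 3.0),
}


def suggest_jvm_sizes(live_old_gen_mb):
    """Return a (low_mb, high_mb) range for each memory area."""
    if live_old_gen_mb <= 0:
        raise ValueError("live_old_gen_mb must be positive")
    return {
        area: (round(low * live_old_gen_mb), round(high * live_old_gen_mb))
        for area, (low, high) in SIZING_RATIOS.items()
    }


if __name__ == "__main__":
    for area, (low, high) in suggest_jvm_sizes(500).items():
        print(f"{area}: {low}-{high} MB")
```

For 500 MB of live data this suggests a heap of 1500 to 2000 MB. Treat the result as a starting point to validate under load, not as a final setting.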
Software Code Performance Issues
Often, performance problems are not due to insufficient resources but to code defects such as excessive loops, unreleased resources, lack of caching, long‑running transactions, or suboptimal data structures and algorithms. Code reviews and static analysis are essential to detect these issues.
Extended Considerations for Business System Performance
Pre‑deployment performance testing may fail to replicate production conditions due to differences in hardware, data volume, and concurrency. Horizontal scaling of databases and middleware can help, but it does not guarantee resolution of underlying performance flaws.
Performance diagnosis can be categorized into:
Operating system and storage layer
Middleware layer (databases, application servers)
Software layer (SQL, business logic, front‑end)
Effective troubleshooting combines static analysis with dynamic monitoring of request flows to pinpoint slow SQL, inefficient code, or resource contention.
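As a rough sketch of the dynamic side, assume request tracing already yields (statement, duration in ms) pairs. Grouping them by statement and ranking by total time shows where the database spends most of its effort. The record format and the 500 ms threshold are assumptions for illustration.

```python
from collections import defaultdict


def rank_slow_statements(records, slow_ms=500):
    """Group (statement, duration_ms) records and rank the slow ones.

    Returns statements whose slowest execution reached slow_ms,
    ordered by total time spent, highest first.
    """
    stats = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "max_ms": 0.0})
    for statement, duration_ms in records:
        entry = stats[statement]
        entry["count"] += 1
        entry["total_ms"] += duration_ms
        entry["max_ms"] = max(entry["max_ms"], duration_ms)

    ranked = sorted(
        stats.items(), key=lambda item: item[1]["total_ms"], reverse=True
    )
    return [(stmt, s) for stmt, s in ranked if s["max_ms"] >= slow_ms]
```

Ranking by total time rather than by the single slowest call also surfaces statements that are individually tolerable but executed so often that they dominate the load.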
Using APM and IT Monitoring
Application Performance Management (APM) tools monitor key business applications, providing early alerts and linking resource usage to specific services, SQL statements, or user actions. Integrating APM with DevOps practices enables proactive detection and rapid root‑cause analysis.
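The core idea behind such tools can be sketched in a few lines: wrap each business operation, record how long it takes, and raise an alert when a threshold is exceeded. The `Monitor` class below is a hypothetical, simplified illustration and does not reflect the API of any particular APM product.

```python
import functools
import time


class Monitor:
    """Records call durations per operation and collects threshold alerts."""

    def __init__(self, threshold_s):
        self.threshold_s = threshold_s
        self.durations = {}  # operation name -> list of durations in seconds
        self.alerts = []     # (operation name, duration) for slow calls

    def track(self, name):
        """Decorator that times every call of the wrapped function."""
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return fn(*args, **kwargs)
                finally:
                    # Runs on success and on exceptions alike.
                    elapsed = time.perf_counter() - start
                    self.durations.setdefault(name, []).append(elapsed)
                    if elapsed > self.threshold_s:
                        self.alerts.append((name, elapsed))
            return wrapper
        return decorator


monitor = Monitor(threshold_s=0.02)


@monitor.track("place_order")
def place_order():
    time.sleep(0.05)  # simulated slow business operation
    return "ok"
```

Real APM agents typically instrument code automatically and correlate timings across services, but the same measure, record, and alert loop sits at their centre.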
ITFLY8 Architecture Home
