
How to Diagnose and Optimize Business System Performance After Launch

This article outlines a process for analyzing, diagnosing, and resolving performance issues in production business systems. It covers hardware, the OS, the database, middleware, JVM settings, and code inefficiencies, as well as the role of monitoring tools such as APM in pinpointing bottlenecks.

Java High-Performance Architecture

Dec 20, 2021

| System Performance Analysis Process

When a business system shows serious performance problems after going live, the root causes usually fall into three categories: high concurrent access causing bottlenecks, growing data volume in the database, and changes in critical environment factors such as network bandwidth.

First, determine whether the issue appears under single‑user (non‑concurrent) conditions or only under load. Single‑user problems often stem from code or SQL inefficiencies, while problems that appear only under concurrency require load testing to identify resource contention.
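The single-user versus concurrent comparison can be sketched as a small timing harness. This is a minimal illustration, not a replacement for a real load-testing tool: `LoadProbe`, `averageLatencyMillis`, and `contendedOperation` are hypothetical names, and the 5 ms sleep under a shared lock merely stands in for a business operation that contends for a resource.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness: compares the average latency of one operation when it
// is called by a single thread versus by several threads at once.
class LoadProbe {

    // Runs `task` `calls` times on `threads` worker threads and returns the
    // average latency per call in milliseconds (lock wait time included).
    static double averageLatencyMillis(Runnable task, int threads, int calls) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                Callable<Long> timed = () -> {
                    long start = System.nanoTime();
                    task.run();
                    return System.nanoTime() - start;
                };
                results.add(pool.submit(timed));
            }
            long totalNanos = 0;
            for (Future<Long> result : results) {
                totalNanos += result.get();
            }
            return totalNanos / 1_000_000.0 / calls;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    private static final Object LOCK = new Object();

    // Stand-in for a business operation that holds a shared lock, so callers
    // queue behind each other under load. Replace with a real request.
    static void contendedOperation() {
        synchronized (LOCK) {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        double single = averageLatencyMillis(LoadProbe::contendedOperation, 1, 20);
        double loaded = averageLatencyMillis(LoadProbe::contendedOperation, 8, 20);
        System.out.printf("single-user avg: %.1f ms, 8 threads avg: %.1f ms%n", single, loaded);
    }
}
```

If the single-threaded average is already high, the code or SQL path itself is the likely suspect. If latency only climbs as threads are added, contention for a shared resource is the more likely cause.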

During load testing, monitor CPU, memory, and JVM metrics (heap usage and garbage collection activity) to detect problems such as memory leaks, which also degrade performance.
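As a rough illustration of JVM-side monitoring during a test run, the standard `java.lang.management` beans expose heap usage, garbage collection counts, and the system load average from inside the process. The class name `JvmSnapshot` and the one-second sampling loop are assumptions made for this sketch. In practice an APM agent or the JDK command-line tools would collect the same figures.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper that samples a few JVM health figures in-process.
class JvmSnapshot {

    // Bytes of heap currently in use.
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Total collections across all collectors since JVM start.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // One-minute system load average, or a negative value if unavailable.
    static double systemLoadAverage() {
        return ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 3; i++) {
            System.out.printf("heap used: %d MB, GC count: %d, load avg: %.2f%n",
                    usedHeapBytes() / (1024 * 1024), totalGcCount(), systemLoadAverage());
            Thread.sleep(1000);
        }
    }
}
```

A heap figure that keeps rising across samples while the GC count also rises is one common hint of a leak, although confirming it normally requires a heap dump.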

| Factors Influencing Performance Issues

Performance is affected by three main layers: hardware environment, software runtime environment, and the application code itself.

Hardware Environment

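The hardware environment includes compute, storage, and network resources. Server CPU capability is often quoted as tpmC (transactions per minute under the TPC‑C benchmark), but real‑world performance can differ from the benchmark figure. Storage I/O is a common bottleneck: high CPU and memory usage may mask an underlying I/O limit, so check disk metrics as well.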

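Linux provides tools such as iostat, ps, sar, top, and vmstat for monitoring CPU, memory, and disk I/O. JVM internals are not visible to these tools; use JDK utilities such as jstat, jmap, and jstack for heap, garbage collection, and thread data.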

| Runtime Environment – Database and Application Middleware

Database and middleware tuning are frequent sources of performance problems.

Database Tuning

For Oracle, performance factors include system, database, and network. Optimization targets include disk I/O, rollback segments, redo logs, SGA, and database objects. Continuous monitoring and analysis of high‑memory alerts, excessive redo generation, and inefficient SQL are essential.

Application Middleware Tuning

Middleware containers such as WebLogic or Tomcat require configuration tuning (JVM parameters, thread pools, connection pool sizes) and, in clustered setups, cluster‑specific settings.
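To make the thread pool knobs concrete, here is a small sketch using the JDK's own `ThreadPoolExecutor` rather than any container's configuration format. `BoundedPool` and its methods are hypothetical names. The point is only to show how worker count and queue capacity decide when requests start being turned away, which is the trade-off that container thread pool settings also control.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a request pool with a fixed number of workers
// and a bounded waiting queue.
class BoundedPool {

    // Fixed-size pool; work beyond workers + queueCapacity is rejected.
    static ThreadPoolExecutor create(int workers, int queueCapacity) {
        return new ThreadPoolExecutor(
                workers, workers,
                60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits `requests` tasks that all block until released, and returns how
    // many of them the pool rejected.
    static int countRejected(int workers, int queueCapacity, int requests) {
        ThreadPoolExecutor pool = create(workers, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulate a slow request
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 workers + 1 queue slot accept 3 slow requests; the other 2 are rejected.
        System.out.println("rejected: " + countRejected(2, 1, 5));
    }
}
```

A pool that is too small rejects or queues work while the CPU sits idle. A pool that is too large shifts the contention to the database or to memory, so the sizes are best validated under load rather than guessed.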

Key JVM parameters:

-Xmx   # maximum heap size
-Xms   # initial heap size
-XX:MaxNewSize   # maximum young generation size
-XX:NewSize      # initial young generation size
-XX:MaxPermSize  # maximum permanent generation size (Java 7 and earlier; use -XX:MaxMetaspaceSize on Java 8+)
-XX:PermSize     # initial permanent generation size (Java 7 and earlier; closest Java 8+ counterpart is -XX:MetaspaceSize)
-Xss   # thread stack size

Recommended sizing, based on the live data size (old‑generation occupancy after a Full GC): set -Xms and -Xmx to 3‑4 times the live data size; size the young generation (-Xmn) at 1‑1.5 times and the old generation at 2‑3 times the live data size. Size the permanent generation or Metaspace at 1.2‑1.5 times its own occupancy after a Full GC.

Since Java 8, the permanent generation has been replaced by Metaspace, which is allocated from native memory rather than the heap, so heap size, Metaspace limits, and the chosen garbage collector must be considered together.
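The heap multipliers quoted above can be turned into concrete numbers with a few lines of arithmetic. The sketch below assumes a hypothetical live data size of 512 MB (old-generation occupancy measured after a Full GC). `HeapSizing` and `range` are illustrative names, and Metaspace is left out because it is sized separately.

```java
// Hypothetical calculator for the heap sizing multipliers.
class HeapSizing {

    // Returns {low, high} in MB for a multiplier range applied to the live data size.
    static long[] range(long liveDataMb, double lowFactor, double highFactor) {
        return new long[] {
            Math.round(liveDataMb * lowFactor),
            Math.round(liveDataMb * highFactor)
        };
    }

    public static void main(String[] args) {
        long liveDataMb = 512; // assumption: measured after a Full GC under load

        long[] heap = range(liveDataMb, 3.0, 4.0);   // total heap: 3-4x
        long[] young = range(liveDataMb, 1.0, 1.5);  // young generation: 1-1.5x
        long[] old = range(liveDataMb, 2.0, 3.0);    // old generation: 2-3x

        System.out.printf("heap:  %d-%d MB%n", heap[0], heap[1]);
        System.out.printf("young: %d-%d MB%n", young[0], young[1]);
        System.out.printf("old:   %d-%d MB%n", old[0], old[1]);

        // One consistent choice: top of the heap range, bottom of the young range,
        // which leaves heap - young for the old generation.
        System.out.printf("-Xms%dm -Xmx%dm -Xmn%dm%n", heap[1], heap[1], young[0]);
    }
}
```

For 512 MB of live data this gives a heap of 1536 to 2048 MB. Choosing `-Xmx2048m` with `-Xmn512m` leaves 1536 MB for the old generation, which is three times the live data and therefore inside the suggested range.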

| Further Considerations for Business System Performance

Beyond the standard analysis flow, consider whether pre‑deployment performance testing truly reflects production conditions. Simulating real hardware, data volume, and concurrency is difficult, leading to post‑launch surprises.

Horizontal scaling (clusters) can alleviate some bottlenecks, but database scaling is often limited (for example, an Oracle RAC cluster may deliver only about 2‑3× the performance of a single node). Application clusters scale more readily, yet performance issues may still arise from accumulated data or single‑node inefficiencies.

| Classification of Performance Diagnosis

Operating system and storage layer

Middleware layer (databases, application servers)

Software layer (SQL, business logic, front‑end)

Dynamic analysis traces a request through code and infrastructure to pinpoint slow components, such as inefficient SQL or front‑end rendering delays.
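A hand-rolled version of this kind of tracing is sketched below: each stage of a request is wrapped in a timer, and the slowest stage is reported at the end. `StageTimer`, the stage names, and the sleep-based `pause` helper are all hypothetical stand-ins. A real APM agent instruments these boundaries automatically.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical per-request timer that records how long each stage takes.
class StageTimer {
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    // Runs `work`, adds its elapsed time to the named stage, and returns its result.
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    // Name of the stage with the largest accumulated time, or null if none was recorded.
    String slowestStage() {
        String slowest = null;
        long max = -1;
        for (Map.Entry<String, Long> entry : nanosByStage.entrySet()) {
            if (entry.getValue() > max) {
                max = entry.getValue();
                slowest = entry.getKey();
            }
        }
        return slowest;
    }

    // Stand-in for real work: sleeps for `millis` and returns `result`.
    static String pause(String result, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        timer.time("controller", () -> pause("parsed", 2));
        timer.time("sql", () -> pause("rows", 30));
        timer.time("render", () -> pause("html", 5));
        System.out.println("slowest stage: " + timer.slowestStage());
    }
}
```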

| Common Software Code Performance Pitfalls

Creating large objects or opening DB connections inside loops

Memory leaks due to unreleased resources

Missing caching where appropriate

Long‑running transactions consuming resources

Choosing sub‑optimal data structures or algorithms for a given scenario

Code reviews and static analysis are essential to uncover these issues.
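The first two pitfalls in the list above can be shown side by side. In this sketch `FakeConnection` is a hypothetical stand-in for a database connection that only counts how often it is opened and closed, so the example runs without a database. The contrast is between opening a connection per item and never closing it, and opening one connection for the batch inside try-with-resources.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demonstration of connection handling inside a loop.
class ConnectionReuse {
    static final AtomicInteger OPENED = new AtomicInteger();
    static final AtomicInteger CLOSED = new AtomicInteger();

    // Stand-in for a database connection; it only counts open and close calls.
    static class FakeConnection implements AutoCloseable {
        FakeConnection() {
            OPENED.incrementAndGet();
        }

        void save(String row) {
            // a real implementation would write the row here
        }

        @Override
        public void close() {
            CLOSED.incrementAndGet();
        }
    }

    // Pitfall: a new connection for every row, and none of them is closed.
    static void saveAllLeaky(List<String> rows) {
        for (String row : rows) {
            FakeConnection connection = new FakeConnection();
            connection.save(row);
        }
    }

    // Better: one connection for the whole batch, closed by try-with-resources.
    static void saveAll(List<String> rows) {
        try (FakeConnection connection = new FakeConnection()) {
            for (String row : rows) {
                connection.save(row);
            }
        }
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c");

        saveAllLeaky(rows);
        System.out.println("leaky:  opened=" + OPENED.get() + " closed=" + CLOSED.get());

        OPENED.set(0);
        CLOSED.set(0);
        saveAll(rows);
        System.out.println("reused: opened=" + OPENED.get() + " closed=" + CLOSED.get());
    }
}
```

With three rows the leaky version opens three connections and closes none, while the reused version opens and closes exactly one.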

| Detecting Performance Issues via Monitoring and APM

Performance problems are discovered either through alerts from IT resource monitoring and application performance management (APM) tools or through user feedback. APM tracks key business applications, with the aim of improving reliability and reducing total cost of ownership.

Traditional monitoring focuses on CPU, memory, and network metrics, but linking these to specific services, processes, or SQL statements requires deeper analysis. Modern APM provides end‑to‑end tracing, allowing rapid identification of slow services or queries.

Integrating resource → application → business‑function analysis enables proactive detection and faster resolution of performance bottlenecks.
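One simple way an alerting rule might flag a slow service is to compare a high percentile of its recent latencies against a threshold. The sketch below uses the nearest-rank percentile. The percentile approach, the 200 ms threshold, and the names `LatencyStats`, `percentile`, and `isSlow` are assumptions for illustration and do not describe any particular APM product.

```java
import java.util.Arrays;

// Hypothetical latency check of the kind an alerting rule might apply.
class LatencyStats {

    // Nearest-rank percentile: the value at rank ceil(pct / 100 * n) in sorted order.
    static long percentile(long[] samplesMillis, double pct) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // True when the 95th percentile latency exceeds the threshold.
    static boolean isSlow(long[] samplesMillis, long thresholdMillis) {
        return percentile(samplesMillis, 95) > thresholdMillis;
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds for one service.
        long[] samples = {10, 12, 11, 13, 500, 12, 11, 10, 14, 12};
        System.out.println("p50 = " + percentile(samples, 50) + " ms");
        System.out.println("p95 = " + percentile(samples, 95) + " ms");
        System.out.println("slow at 200 ms threshold: " + isSlow(samples, 200));
    }
}
```

The median of these made-up samples looks healthy at 12 ms, while the 95th percentile exposes the 500 ms outlier. This is why averages or medians alone tend to hide the slow requests that users actually complain about.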
