
Analysis and Optimization of Business System Performance

This article outlines a comprehensive approach to diagnosing and optimizing performance problems in production business systems, covering analysis processes, hardware, OS, database, middleware, JVM tuning, code inefficiencies, and monitoring techniques to identify root causes and improve system reliability.

IT Architects Alliance

System Performance Issue Analysis Process

When a business system that had no performance issues before launch suddenly exhibits serious performance problems after going live, the potential causes usually fall into three main scenarios: high concurrent access leading to bottlenecks, data volume growth causing database bottlenecks, and other critical environment changes such as network bandwidth.

It is essential to first determine whether the problem exists under single‑user (non‑concurrent) conditions or only under concurrency. Single‑user issues are often easier to test and verify, while concurrent issues require stress testing in a test environment to confirm.

If a single‑user performance problem is found, most of the cause lies in program code and SQL that need further optimization. For concurrent performance problems, the focus shifts to analyzing the state of the database and middleware and possibly tuning the middleware itself.
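The split between single-user and concurrent problems can be checked with a small timing harness. The following is a minimal sketch, not a replacement for a real stress-testing tool: the function names, the `budget` value, and the `degradation` factor of 2.0 are all assumptions chosen for illustration, and `func` stands in for whatever call exercises the business function under test.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def timed_call(func):
    """Run func once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


def compare_latency(func, users=8, calls_per_user=5):
    """Return (single_user_median, concurrent_median) latency in seconds."""
    # Sequential calls approximate the single-user case.
    single = [timed_call(func) for _ in range(calls_per_user)]
    # A thread pool approximates several users calling at the same time.
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(timed_call, func)
                   for _ in range(users * calls_per_user)]
        concurrent = [f.result() for f in futures]
    return median(single), median(concurrent)


def classify(single, concurrent, budget, degradation=2.0):
    """Suggest where to look first; both thresholds are illustrative."""
    if single > budget:
        return "single-user: review code and SQL"
    if concurrent > single * degradation:
        return "concurrency: review database and middleware state"
    return "within budget"
```

A call such as `classify(*compare_latency(my_request), budget=0.2)` then gives a first hint about which branch of the analysis to follow; the result should still be confirmed with a proper stress test in a test environment.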

During stress testing, CPU, memory, and JVM should be monitored to detect issues such as memory leaks, because concurrency‑related performance problems can also stem from code defects.

Factors Influencing Performance Problems

Performance issues are influenced by three major aspects: hardware environment, software runtime environment, and the software program itself.

Hardware Environment

The hardware environment includes compute, storage, and network resources. Server compute capability is often expressed as a tpmC rating (transactions per minute under the TPC-C benchmark), but real-world performance can differ from the benchmark figure. Storage performance, especially I/O read/write speed, is a frequent bottleneck; high CPU and memory usage may actually mask an underlying I/O limitation.

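Linux provides performance monitoring tools such as iostat, ps, sar, top, and vmstat to monitor CPU, memory, and disk I/O at the operating system and process level. JVM-internal heap usage and garbage collection activity are better observed with JDK tools such as jstat.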

When a memory usage alert fires, determine whether it stems from a surge of concurrent calls, a JVM memory leak, or a disk I/O bottleneck that is causing requests to pile up.
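As a rough illustration of how an I/O limitation can hide behind other symptoms, the sketch below parses one `vmstat` sample and produces triage hints. The sample text is made-up data in the usual Linux `vmstat` column layout, and the thresholds (`wa_limit`, `busy_limit`) are assumptions, not recommended values.

```python
# Made-up sample in the standard Linux vmstat column layout.
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  6      0 812340  40212 913204    0    0  5120  9800 1200 2400 12  6 40 42  0
"""


def parse_vmstat(text):
    """Map column names to integers for the last sample in vmstat output."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    # The column header is the line that names the CPU fields.
    header = next(ln for ln in lines if "wa" in ln and "us" in ln)
    return dict(zip(header, map(int, lines[-1])))


def triage(sample, cpu_cores, wa_limit=20, busy_limit=80):
    """Return a list of hints; thresholds are illustrative assumptions."""
    hints = []
    if sample["wa"] >= wa_limit:
        hints.append("disk I/O wait is high: check iostat before blaming CPU or memory")
    if sample["si"] > 0 or sample["so"] > 0:
        hints.append("swapping detected: check memory pressure, JVM heap, and leaks")
    if sample["us"] + sample["sy"] >= busy_limit and sample["r"] > cpu_cores:
        hints.append("CPU saturated: run queue exceeds core count")
    return hints


if __name__ == "__main__":
    for hint in triage(parse_vmstat(SAMPLE), cpu_cores=4):
        print(hint)
```

In this sample the CPU is mostly idle or waiting (`wa` is 42), so the only hint points at disk I/O even though an alert might have been raised on a different metric.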

Runtime Environment – Database and Application Middleware

Database and application middleware performance tuning is another common source of performance problems.

Database Performance Tuning

For Oracle databases, performance factors include system, database, and network. Optimization areas cover disk I/O, rollback segments, redo logs, system global area, and database objects.

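To start tuning, enable timed statistics by setting TIMED_STATISTICS=TRUE in init.ora, or for the current session with ALTER SESSION SET TIMED_STATISTICS=TRUE. Connect with svrmgrl using CONNECT INTERNAL, run utlbstat.sql at the start of a period of normal activity to begin collection, and run utlestat.sql at the end of that period to stop it. Results are written to report.txt. Note that svrmgrl and these scripts belong to older Oracle releases; later releases provide Statspack and AWR for the same purpose.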

Database performance optimization is an ongoing task involving regular parameter inspection and DBA‑driven analysis of high‑memory‑consumption SQL statements, as well as monitoring KPI alerts such as excessive redo log generation.

Application Middleware Performance Analysis and Tuning

Middleware containers (e.g., WebLogic, Tomcat) require configuration parameter optimization and JVM memory tuning. Key parameters include thread pool sizes, connection pool limits, and, in clustered environments, cluster‑specific settings.

JVM tuning is critical; typical parameters are:

-Xmx    # set maximum heap size
-Xms    # set initial heap size
-XX:MaxNewSize    # set maximum young generation size
-XX:NewSize    # set initial young generation size
-XX:MaxPermSize    # set maximum permanent generation size (replaced by Metaspace in newer JVMs)
-XX:PermSize    # set initial permanent generation size (replaced by Metaspace)
-Xss    # set thread stack size

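From Java 8 onward, Metaspace replaces the permanent generation, so -XX:PermSize and -XX:MaxPermSize give way to -XX:MetaspaceSize and -XX:MaxMetaspaceSize. Metaspace is allocated from native memory rather than from the heap, so the split between heap and Metaspace must be weighed against the total memory available to the process, along with the chosen garbage collection algorithm.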
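To make the relationship between these flags concrete, here is a small sketch that assembles a JVM option list for a given Java version. The helper name `jvm_options` and all sizes are hypothetical; setting the initial and maximum values equal is one common choice, not a rule stated in this article.

```python
def jvm_options(java_major, heap_mb, young_mb, class_meta_mb, stack_kb=512):
    """Build a list of JVM memory flags; all sizes are illustrative."""
    if young_mb >= heap_mb:
        raise ValueError("young generation must be smaller than the heap")
    opts = [
        f"-Xms{heap_mb}m",                # initial heap size
        f"-Xmx{heap_mb}m",                # maximum heap size
        f"-XX:NewSize={young_mb}m",       # initial young generation size
        f"-XX:MaxNewSize={young_mb}m",    # maximum young generation size
        f"-Xss{stack_kb}k",               # thread stack size
    ]
    if java_major >= 8:
        # Java 8+ stores class metadata in Metaspace (native memory).
        opts += [f"-XX:MetaspaceSize={class_meta_mb}m",
                 f"-XX:MaxMetaspaceSize={class_meta_mb}m"]
    else:
        # Older JVMs use the permanent generation instead.
        opts += [f"-XX:PermSize={class_meta_mb}m",
                 f"-XX:MaxPermSize={class_meta_mb}m"]
    return opts


if __name__ == "__main__":
    print(" ".join(jvm_options(java_major=8, heap_mb=4096,
                               young_mb=1024, class_meta_mb=256)))
```

The resulting string would be passed to the `java` command or placed in the middleware's startup script; the right values depend on the application's measured heap and GC behavior.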

Business System Performance Expansion Considerations

Beyond the standard analysis flow, additional thoughts include the usefulness of pre‑release performance testing and the limits of horizontal scaling.

Is Pre‑Release Performance Testing Useful?

Performance tests often cannot fully replicate production hardware, data volume, or concurrency patterns, making it difficult to catch all issues before launch.

Does Horizontal Scaling Fully Solve Performance Problems?

While database clusters (e.g., Oracle RAC) can provide 2‑3× performance gains, they are not infinitely scalable. Application clusters scale more readily, but performance problems can still arise from accumulated data or inefficient single‑node code.

When a single‑node access is slow, priority should be given to optimizing that node before scaling out.
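A back-of-the-envelope model based on Little's Law (throughput is roughly concurrency divided by latency) shows why fixing the slow node first pays off. This is a simplified sketch: the function name, the worker counts, and the `shared_limit` parameter standing in for a shared database ceiling are all assumptions.

```python
def max_throughput(nodes, workers_per_node, latency_s, shared_limit=None):
    """Estimate requests per second from concurrency and per-request latency."""
    estimate = nodes * workers_per_node / latency_s
    # A shared resource such as the database caps what scaling out can add.
    if shared_limit is not None:
        estimate = min(estimate, shared_limit)
    return estimate


if __name__ == "__main__":
    print(max_throughput(2, 50, 2.0))   # baseline: 2 nodes, 2 s per request
    print(max_throughput(4, 50, 2.0))   # doubling nodes: each request still takes 2 s
    print(max_throughput(2, 50, 0.5))   # optimizing the node: faster and higher capacity
    print(max_throughput(8, 50, 2.0, shared_limit=120))  # capped by the shared tier
```

Under these assumed numbers, doubling the nodes doubles capacity but leaves every user waiting 2 seconds, while cutting latency to 0.5 seconds on the original two nodes quadruples capacity and improves each request.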

Classification of Business System Performance Diagnosis

Static classification can be divided into three layers: operating system/storage, middleware (database and application servers), and software (SQL, business logic, front‑end).

Dynamic diagnosis follows the request flow, pinpointing slow SQL, front‑end rendering, or cluster issues.
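Following the request flow amounts to timing each stage of a request and seeing which one dominates. The sketch below is a hand-rolled illustration of that idea, not an APM product; the class name `RequestTrace` and the stage names are hypothetical.

```python
import time
from contextlib import contextmanager


class RequestTrace:
    """Accumulate wall-clock time per named stage of one request."""

    def __init__(self):
        self.stages = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.stages[name] = self.stages.get(name, 0.0) + elapsed

    def slowest(self):
        """Return the name of the stage with the largest total time."""
        return max(self.stages, key=self.stages.get)


if __name__ == "__main__":
    trace = RequestTrace()
    with trace.stage("sql"):
        time.sleep(0.02)    # stands in for a slow query
    with trace.stage("render"):
        time.sleep(0.001)   # stands in for page rendering
    print(trace.slowest(), trace.stages)
```

Wrapping the database call, the service call, and the rendering step in separate stages gives a per-request breakdown that points the investigation at one layer.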

Software Code Issues Are Often the Root Cause

Common code‑level performance problems include large object initialization inside loops, unreleased resources causing memory leaks, lack of caching, long‑running transactions, and sub‑optimal data structures or algorithms.

These issues are typically uncovered through code reviews and static analysis tools.
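Two of the patterns above, building a large object inside a loop and missing caching, can be illustrated in a few lines. The function names and data are hypothetical, and `load_price` merely stands in for a slow lookup such as a database query.

```python
from functools import lru_cache


def filter_slow(orders, blocked_ids):
    """Anti-pattern: the lookup set is rebuilt on every iteration."""
    result = []
    for order in orders:
        blocked = set(blocked_ids)   # large object created inside the loop
        if order["customer"] not in blocked:
            result.append(order)
    return result


def filter_fast(orders, blocked_ids):
    """Same result, with the set built once outside the loop."""
    blocked = set(blocked_ids)
    return [order for order in orders if order["customer"] not in blocked]


CALLS = {"count": 0}   # counts how often the slow lookup really runs


@lru_cache(maxsize=1024)
def load_price(sku):
    """Cached lookup; the body runs only on a cache miss."""
    CALLS["count"] += 1
    return len(sku) * 10   # placeholder for an expensive query


if __name__ == "__main__":
    orders = [{"customer": 1}, {"customer": 2}, {"customer": 3}]
    print(filter_fast(orders, blocked_ids=[2]))
    load_price("A-100")
    load_price("A-100")
    print(CALLS["count"])
```

Both filters return the same orders; the difference only shows up as data volume grows, which is why such code often passes pre-release tests and fails in production.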

Monitoring and APM for Performance Detection

Performance problems are usually discovered via IT resource monitoring/APM alerts or user feedback. APM provides end‑to‑end visibility, linking resource usage to specific services, SQL statements, and business functions.

By integrating resource → application → business layer analysis, APM enables rapid identification of the exact component causing slowdown, greatly improving diagnosis efficiency.

Disclaimer: The material shared in this public account is collected from the internet, the copyright belongs to the original authors, and the views expressed are personal and not representative of the account. The article is for learning and exchange only.
