System Performance Issue Analysis and Optimization Process
This article outlines a practical process for diagnosing and optimizing performance problems in production business systems, covering the hardware environment, the OS, databases, middleware, JVM tuning, code-level inefficiencies, monitoring tools, and the limits of pre-release performance testing.
System Performance Issue Analysis Process
When a business system that performed well before launch suddenly experiences serious performance degradation after going live, the root causes usually fall into three categories.
High concurrent access leading to bottlenecks
Growing database volume causing slowdown
Changes in critical environment factors such as network bandwidth
First, determine whether the problem occurs under a single‑user (non‑concurrent) scenario or only under load. Single‑user issues often stem from code or SQL inefficiencies, while concurrent issues require analysis of the database and middleware.
During load testing, monitor CPU, memory, and the JVM to catch problems such as memory leaks, which may themselves originate in application code.
Performance Issue Influencing Factors
Performance factors can be grouped into three main areas: hardware environment, software runtime environment, and the software program itself.
Hardware Environment
Includes compute, storage, and network resources. CPU performance is often quoted as a tpmC figure (TPC-C transactions per minute), but a real-world x86 server may underperform a mainframe-class machine with the same nominal tpmC. Storage I/O is a frequent bottleneck; apparently high CPU or memory usage may in fact be caused by slow disk I/O.
Linux provides tools such as iostat, ps, sar, top, and vmstat for monitoring CPU, memory, and disk I/O; JVM internals are observed with JDK tools such as jstat.
Runtime Environment – Database and Application Middleware
Database and middleware performance tuning are common sources of issues.
Database Performance Tuning
For Oracle databases, performance is affected by system, database, and network factors. Optimization includes improving disk I/O, rollback segments, redo logs, the system global area (SGA), and database objects.
In init.ora set TIMED_STATISTICS=TRUE, or enable it per session with ALTER SESSION SET TIMED_STATISTICS = TRUE. On older releases, run svrmgrl and connect internal (on modern releases, sqlplus / as sysdba); then, during normal activity, execute utlbstat.sql to begin the statistics window and utlestat.sql to end it. The delta report is written to report.txt.
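As a sketch, the classic BSTAT/ESTAT snapshot workflow looks like the following (Oracle 8i-era scripts shipped under $ORACLE_HOME/rdbms/admin; on current releases, Statspack and AWR supersede them):

```sql
-- Enable timed statistics for this session (or set it in init.ora)
ALTER SESSION SET TIMED_STATISTICS = TRUE;

-- Begin the measurement window: snapshot the current V$ statistics
@?/rdbms/admin/utlbstat.sql

-- ... let normal business activity run for the period of interest ...

-- End the window: snapshot again, compute deltas, spool to report.txt
@?/rdbms/admin/utlestat.sql
```

Because the report is a delta between two snapshots, it only reflects activity inside the window, so the window should cover a representative busy period.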
Database performance monitoring is an ongoing task; DBAs regularly extract high‑cost SQL statements for developers to review and watch KPI alerts such as excessive redo generation.
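One hedged example of how a DBA might extract high-cost statements from the shared pool, using the V$SQL dynamic performance view (FETCH FIRST requires Oracle 12c or later):

```sql
-- Top 10 statements by total buffer gets, with per-execution cost
SELECT sql_id,
       executions,
       buffer_gets,
       buffer_gets / GREATEST(executions, 1) AS gets_per_exec
  FROM v$sql
 ORDER BY buffer_gets DESC
 FETCH FIRST 10 ROWS ONLY;
```

Statements that dominate this list are handed to developers for review before any hardware is added.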
Application Middleware Performance Analysis and Tuning
Middleware containers (WebLogic, Tomcat, etc.) require configuration parameter optimization and JVM tuning.
Key JVM parameters include:
-Xmx – maximum heap size
-Xms – initial heap size
-XX:MaxNewSize – maximum young generation size
-XX:NewSize – initial young generation size
-XX:MaxPermSize – maximum permanent generation size (removed in JDK 8; use -XX:MaxMetaspaceSize)
-XX:PermSize – initial permanent generation size (removed in JDK 8; use -XX:MetaspaceSize)
-Xss – thread stack size
Recommended sizing: set -Xmx/-Xms to 3–4 times the live data set (old-generation occupancy after a Full GC); Metaspace to 1.2–1.5 times its stable usage; the young generation (-Xmn) to 1–1.5 times the live data set, which leaves the old generation at 2–3 times the live data set.
Note: JDK 8 and later replaced the permanent generation with Metaspace, so Metaspace sizing and the choice of garbage collector must both be factored into tuning.
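For illustration only, suppose a Full GC leaves roughly 1 GB of live data in the old generation; the sizing rules above would yield a startup line like this (the service name and exact values are hypothetical):

```shell
# Heap = 3-4x live data; fixing Xms = Xmx avoids resize pauses.
# Young gen = 1-1.5x live data; Metaspace ~1.2-1.5x its stable usage.
java -Xms4g -Xmx4g \
     -Xmn1g \
     -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m \
     -Xss512k \
     -jar order-service.jar   # hypothetical application
```

These numbers are starting points; GC logs from a load test should confirm or adjust them.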
Software Program Performance Issue Analysis
Often the first instinct is to add hardware resources, but many performance problems are caused by code defects such as inefficient loops, unreleased resources, lack of caching, long‑running transactions, or sub‑optimal data structures and algorithms.
These issues are best discovered through static code analysis, code reviews, and establishing coding standards.
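As a small illustration of the "inefficient loop" category that a code review should flag, the sketch below contrasts repeated String concatenation (quadratic copying) with a StringBuilder (linear); the class and method names are invented for this example:

```java
// Sketch: a loop inefficiency that static analysis and reviews catch.
public class JoinExample {

    // Anti-pattern: each += allocates and copies a brand-new String,
    // so joining n items costs O(n^2) character copies.
    static String joinSlow(String[] items, String sep) {
        String out = "";
        for (int i = 0; i < items.length; i++) {
            if (i > 0) out += sep;
            out += items[i];
        }
        return out;
    }

    // Fix: StringBuilder appends into a growable buffer, amortized O(n).
    static String joinFast(String[] items, String sep) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.length; i++) {
            if (i > 0) sb.append(sep);
            sb.append(items[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] items = {"a", "b", "c"};
        System.out.println(joinSlow(items, ","));  // a,b,c
        System.out.println(joinFast(items, ","));  // a,b,c
    }
}
```

The slow version behaves fine in a unit test with three items, which is exactly why such defects tend to surface only under production data volumes.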
Business System Performance Issue Expansion Thoughts
Beyond the standard analysis flow, consider whether pre‑release performance testing truly reflects production conditions. Factors such as hardware fidelity, realistic data volume, and genuine concurrency are hard to replicate.
Horizontal scaling (clusters) can mitigate concurrency but does not solve inherent single‑node performance flaws.
Is Pre‑Release Performance Testing Useful?
Challenges include:
Can the test hardware fully emulate production?
Can the data volume reflect real‑world accumulation?
Can concurrency be simulated accurately with recorded scenarios and multiple load generators?
Because these are difficult, many issues surface only after go‑live.
Business System Performance Diagnosis Classification
Static classification can be divided into:
Operating system and storage layer
Middleware layer (databases, application servers)
Software layer (SQL, business logic, front‑end)
Dynamic analysis follows the request path to pinpoint the exact component (SQL, code, or infrastructure) causing slowdown.
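One minimal way to support this follow-the-request style of analysis is to time each stage of a request and log the breakdown; the StageTimer below is a hypothetical sketch, not a real APM API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: per-request stage timing to locate which layer is slow.
public class StageTimer {
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();
    private long last = System.nanoTime();

    // Record the time spent since the previous mark under this stage name.
    public void mark(String stage) {
        long now = System.nanoTime();
        elapsedNanos.put(stage, now - last);
        last = now;
    }

    public Map<String, Long> breakdown() {
        return elapsedNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        StageTimer t = new StageTimer();
        Thread.sleep(5);          // stand-in for SQL execution
        t.mark("sql");
        Thread.sleep(2);          // stand-in for business logic
        t.mark("logic");
        // The largest entry points at the layer to investigate first.
        t.breakdown().forEach((stage, ns) ->
                System.out.printf("%s: %.1f ms%n", stage, ns / 1e6));
    }
}
```

A real deployment would use distributed tracing instead, but even this crude breakdown distinguishes "the SQL is slow" from "the code around it is slow".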
Detecting Performance Problems
Two main detection paths:
Monitoring tools and APM alerts
User feedback during operation
APM (Application Performance Management) monitors critical business applications, improves reliability, reduces total cost of ownership, and links resources → applications → business functions.
Traditional monitoring often shows only resource saturation, making it hard to identify the offending service or SQL. Modern APM combined with service‑chain tracing can quickly locate the problematic call or query.
With DevOps and automated operations, proactive APM monitoring enables full‑stack performance analysis, dramatically improving diagnosis efficiency.
Architecture Digest
Focused on Java backend development: application architecture at top-tier internet companies (high availability, high performance, high stability), big data, machine learning, and other popular fields.