Unlock Powerful Java Performance Analysis with IntelliJ IDEA's Built-in Profiler
This guide explains why Java developers need profiling, introduces IntelliJ IDEA's built‑in Profiler (powered by Async Profiler and JFR), and provides step‑by‑step instructions for CPU, memory, and thread analysis to diagnose slow endpoints, high CPU usage, memory leaks, and concurrency bottlenecks.
Why Profile Java Applications?
Performance bottlenecks appear in Spring Boot applications, microservices, and large back‑end systems. Typical triggers are a slow API, high CPU usage, continuous memory growth, or a complex call chain that needs a timing breakdown.
IDEA Integrated Profiler
Since version 2023.2, IntelliJ IDEA bundles a profiler that combines Async Profiler (a sampling CPU analyzer with roughly 1‑2% overhead) and Java Flight Recorder (JFR), the JVM's built‑in event recorder, to deliver CPU, memory, and thread data.
Launching the Profiler
Right‑click in the main class → More Run/Debug → select IntelliJ Profiler, choose the entry method (e.g., XXX.main()), and start the application.
For a Spring Boot project, start the app with the profiler, invoke the target API, then click Stop and Show Results. The result window provides flame graphs, call trees, method lists, timelines, memory snapshots, thread dumps, and live charts.
Core Profiler Features
CPU Analysis
Flame Graph visualises call stacks horizontally and depth vertically; block width equals the proportion of CPU time. Yellow blocks represent user code, gray blocks represent library code. Hover shows exact CPU time, clicking drills down, and Ctrl+F searches methods.
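As a standalone illustration (the class and method names here are hypothetical, not from the article), a deliberately inefficient method like the naive string concatenation below would appear as a wide yellow block in the flame graph, because each += copies the entire string so far; the StringBuilder variant produces the same result with far less CPU time:

```java
public class HotspotDemo {
    // Inefficient: each += allocates and copies the whole string,
    // so this method dominates the flame graph as a wide block.
    static String concatNaive(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // Efficient alternative: amortised O(n) appends into one buffer.
    static String concatBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        concatNaive(20_000);
        long naiveNanos = System.nanoTime() - t0;

        t0 = System.nanoTime();
        concatBuilder(20_000);
        long builderNanos = System.nanoTime() - t0;

        System.out.printf("naive: %d ms, builder: %d ms%n",
                naiveNanos / 1_000_000, builderNanos / 1_000_000);
    }
}
```

Profiling main() with the IntelliJ Profiler and searching for concatNaive (Ctrl+F) makes the proportion of CPU time spent in string copying immediately visible.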
Call Tree shows a hierarchical view of method relationships with sorting by CPU time or total time (including wait time), helping locate the hottest methods and their sub‑calls.
Method List enumerates all sampled methods ordered by cumulative time. Each entry can be expanded to reveal callers or callees, useful for spotting high‑frequency short methods.
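A classic pattern the method list surfaces is a tiny call invoked millions of times, such as the implicit Long.valueOf boxing in the hypothetical sketch below; the boxed loop ranks high by cumulative time even though each individual call is trivial:

```java
public class BoxingDemo {
    // Boxed accumulator: every += unboxes and re-boxes via Long.valueOf,
    // a short, extremely high-frequency call the method list ranks highly.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: same result, no boxing calls at all.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1_000_000) == sumPrimitive(1_000_000));
    }
}
```

Expanding such an entry in the method list to show its callers reveals which loop is responsible for the call volume.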
Timeline visualises thread activity over time, aiding detection of abnormal GC events, live‑locks, or periodic stalls.
Memory Analysis
Memory Snapshot displays object types such as char[], String, Object[] with shallow size, retained size, and GC‑root information. The “Largest Objects” pane ranks objects by retained size, helping identify memory‑heavy buffers, string pools, or caches.
Typical uses are tracking memory‑leak roots, finding unexpected strong references (e.g., static collections, singletons), and comparing snapshots to spot growing object counts or unreleased old objects.
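To make the GC‑root idea concrete, here is a minimal, hypothetical leak (the LeakDemo class is an illustration, not code from the article): a static collection is itself a GC root, so every object it references can never be collected, and successive snapshots show its retained size growing:

```java
import java.util.ArrayList;
import java.util.List;

public class LeakDemo {
    // Static field = GC root: everything reachable from CACHE
    // can never be collected, so its retained size grows unbounded.
    static final List<byte[]> CACHE = new ArrayList<>();

    static void handleRequest(int payloadSize) {
        byte[] buffer = new byte[payloadSize];
        CACHE.add(buffer); // "cached" but never evicted -> leak
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            handleRequest(1024);
        }
        // A memory snapshot taken here would rank LeakDemo.CACHE near the
        // top of "Largest Objects", with the static field as the GC-root path.
        System.out.println("cached buffers: " + CACHE.size());
    }
}
```

Comparing a snapshot taken now with one taken after another batch of requests shows the byte[] count climbing, which is exactly the growth pattern the snapshot comparison is meant to expose.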
Thread Analysis
Thread Dump captures the state of all threads. The left pane lists thread states (RUNNABLE, WAITING, BLOCKED); the right pane shows each thread’s stack trace, enabling quick identification of blocked, runnable, or deadlocked threads.
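The same deadlock information a thread dump shows can be produced and detected programmatically with the standard ThreadMXBean API; the sketch below (class and thread names are hypothetical) forces two daemon threads into a classic lock-ordering deadlock and then asks the JVM which threads are stuck:

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;

public class DeadlockDemo {
    private static final Object LOCK_A = new Object();
    private static final Object LOCK_B = new Object();

    public static long[] createAndDetectDeadlock() throws InterruptedException {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread t1 = new Thread(() -> {
            synchronized (LOCK_A) {
                bothHoldFirstLock.countDown();
                await(bothHoldFirstLock);   // wait until t2 holds LOCK_B
                synchronized (LOCK_B) { }   // blocks forever: BLOCKED state
            }
        }, "deadlock-t1");
        Thread t2 = new Thread(() -> {
            synchronized (LOCK_B) {
                bothHoldFirstLock.countDown();
                await(bothHoldFirstLock);   // wait until t1 holds LOCK_A
                synchronized (LOCK_A) { }   // blocks forever: BLOCKED state
            }
        }, "deadlock-t2");
        t1.setDaemon(true);                 // let the JVM exit despite the deadlock
        t2.setDaemon(true);
        t1.start();
        t2.start();
        Thread.sleep(500);                  // give both threads time to block
        // Returns the ids of deadlocked threads, or null if none exist --
        // the same facts a thread dump's BLOCKED stacks reveal.
        return ManagementFactory.getThreadMXBean().findDeadlockedThreads();
    }

    private static void await(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException ignored) { }
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = createAndDetectDeadlock();
        System.out.println(ids == null ? "no deadlock" : ids.length + " deadlocked threads");
    }
}
```

In the profiler's thread-dump view, both threads would appear as BLOCKED, each waiting on the monitor the other holds.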
Real‑Time CPU & Memory Monitoring
The live chart shows CPU usage, heap memory, non‑heap memory, and thread count. In the example the CPU fluctuates between 2% and 6% and heap memory stays around 17 MB, indicating an idle or lightly loaded application.
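The metrics on the live chart are also exposed by the JVM itself through the java.lang.management API; a minimal sketch (the class name is hypothetical) that reads heap usage, non‑heap usage, and thread count:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class LiveStatsDemo {
    // Reads the same metrics the profiler's live chart plots:
    // heap usage, non-heap usage, and live thread count.
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();
        MemoryUsage nonHeap = mem.getNonHeapMemoryUsage();
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();

        System.out.printf("heap used: %d MB, non-heap used: %d MB, threads: %d%n",
                heap.getUsed() / (1024 * 1024),
                nonHeap.getUsed() / (1024 * 1024),
                threads);
    }
}
```

Polling these values periodically and logging them is a lightweight way to correlate the profiler's chart with application-side observations.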
Step‑by‑Step Example (Spring Boot API)
Select IntelliJ Profiler → XXX.main() to launch the project.
Invoke the target API (e.g., endpoint A).
In the profiler window click Stop and Show Results.
Inspect the flame graph and call tree to locate the hottest method.
If CPU is low but latency persists, capture a memory snapshot and examine retained sizes and GC‑root paths.
When suspecting deadlocks or thread stalls, capture a thread dump and analyse the stack traces.
Analysis Process
Start with a CPU flame graph or call tree for slow APIs; if CPU time is low, switch to memory analysis.
Compare snapshots before and after major code changes to quantify optimisation impact.
Focus on yellow blocks in flame graphs—they represent user code and are primary optimisation targets.
Correlate profiler data with application logs and system monitoring to understand root causes.
Key Observations
Flame‑graph width directly indicates the method consuming the most CPU.
Call‑tree sorting by CPU or total time reveals whether waiting contributes to latency.
Retained‑size ranking in memory snapshots uncovers memory hogs such as large byte buffers or unchecked caches.
GC‑root paths expose which objects prevent collection, useful for locating leaks.
Thread‑dump states and stack traces pinpoint deadlocks, thread‑pool stalls, or long I/O blocks.
Live charts provide immediate feedback on load; a stable low CPU and modest heap suggest no immediate performance issue.