Boost Java Performance: Optimize JFR Analysis with Flame Graphs and Async‑Profiler
This article explores the evolution of continuous performance profiling, explains why traditional tracing falls short, and details a series of optimizations—including batch processing, object‑reference serialization, aggregation insertion, and multi‑chunk handling—to dramatically reduce memory usage and speed up Java Flight Recorder analysis using async‑profiler and flame graphs.
Background
In 2010 Google published a seminal paper, Google‑Wide Profiling: A Continuous Profiling Infrastructure for Data Centers, which introduced a low‑overhead infrastructure for continuously profiling large‑scale services. Since then, many commercial and open‑source solutions have emerged, such as Google Cloud Profiler and Pyroscope, making continuous profiling a core pillar of observability.
Why Performance Profiling?
Traditional observability pillars—tracing, metrics, and logs—are limited by predefined instrumentation and often miss bottlenecks in uninstrumented code. Trace‑based profiling (e.g., SkyWalking Trace Profiling) and sampling‑based profilers (e.g., Elastic APM) address some gaps but still cannot capture object allocation or lock contention details.
Architecture Design
JFR: Java Flight Recorder
A JFR file is a collection of Event records; each event describes a snapshot of JVM or application state at a point in time, and a single file typically contains a very large number of them.
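For example, the JDK's jdk.jfr.consumer API can iterate over these events one at a time; a minimal sketch (the file name is a placeholder):

import java.nio.file.Path;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class JfrEventDump {
    public static void main(String[] args) throws Exception {
        // "profile.jfr" is a placeholder; point it at any recording produced by JFR or async-profiler.
        try (RecordingFile recording = new RecordingFile(Path.of("profile.jfr"))) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                // Each event is one snapshot, e.g. an execution sample or an allocation.
                System.out.println(event.getEventType().getName() + " @ " + event.getStartTime());
            }
        }
    }
}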
Java: async‑profiler
async‑profiler is a low‑overhead Java sampling profiler that uses HotSpot‑internal APIs (such as AsyncGetCallTrace) to collect stack traces and memory‑allocation data. It can emit JFR files or SVG flame graphs and can be integrated as a Java agent.
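A minimal sketch of driving async‑profiler from code, assuming its bundled one.profiler.AsyncProfiler API class and that the native library can be located on java.library.path; the option string is typical but may need adjusting for your async‑profiler version:

import one.profiler.AsyncProfiler;

public class ProfilingSession {
    public static void main(String[] args) throws Exception {
        // Alternative: attach at startup with
        //   java -agentpath:/path/to/libasyncProfiler.so=start,event=cpu,file=profile.jfr ...
        AsyncProfiler profiler = AsyncProfiler.getInstance();
        profiler.execute("start,event=cpu,file=/tmp/profile.jfr");
        runWorkload();
        profiler.execute("stop");
    }

    private static void runWorkload() {
        // stand-in for the code under investigation
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) sum += (long) i * i;
        System.out.println(sum);
    }
}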
Flame Graph
A flame graph visualizes aggregated stack‑trace samples: the width of each frame is proportional to how often that call stack was sampled. Wide frames near the top of the graph (the leaf calls actually executing) usually indicate performance hotspots.
Performance Profiling in Practice: Optimizing JFR File Analysis
Initial Implementation: Native JFR Reader
The original approach used the JDK’s built‑in JFR module to read all events at once. While acceptable for small files, processing a 60 MB JFR file with over two million events caused high memory usage and slow analysis due to GC pressure and repeated tree insertions.
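A sketch of that all‑at‑once style using the JDK's RecordingFile.readAllEvents, which materializes every event in memory before analysis begins (the file name and the tree insertion are placeholders):

import java.nio.file.Path;
import java.util.List;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class NaiveJfrAnalysis {
    public static void main(String[] args) throws Exception {
        // Every RecordedEvent is materialized up front; on a large file with
        // millions of events this is the main source of GC pressure.
        List<RecordedEvent> events = RecordingFile.readAllEvents(Path.of("big-recording.jfr"));
        for (RecordedEvent event : events) {
            if (event.getStackTrace() != null) {
                // one tree insertion per event (see the aggregated-insertion optimization below)
            }
        }
        System.out.println("events read: " + events.size());
    }
}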
Optimization Attempt: Batch Processing
Reading events in fixed‑size batches (e.g., 1,000 at a time) reduced peak memory but did not significantly improve speed, because the large file remained open throughout processing.
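A sketch of the batched variant, assuming a hypothetical process() step that folds each batch into the flame‑graph tree:

import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class BatchedJfrAnalysis {
    private static final int BATCH_SIZE = 1_000;

    public static void main(String[] args) throws Exception {
        try (RecordingFile recording = new RecordingFile(Path.of("big-recording.jfr"))) {
            List<RecordedEvent> batch = new ArrayList<>(BATCH_SIZE);
            while (recording.hasMoreEvents()) {
                batch.add(recording.readEvent());
                if (batch.size() == BATCH_SIZE) {
                    process(batch);   // only one batch is retained in memory at a time
                    batch.clear();
                }
            }
            process(batch);           // flush the final partial batch
        }
    }

    private static void process(List<RecordedEvent> batch) {
        // hypothetical step: aggregate the batch into the flame-graph tree
    }
}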
Optimization: Object References & Lazy Serialization
Async‑profiler stores stack traces in a dictionary and references them by stackTraceId, avoiding full serialization for each event. This reduces memory dramatically (e.g., 2.4 M events share only 30 k unique stack traces).
public class StackTrace {
    // method IDs
    public final long[] methods;
    // byte indicating method type (INTERPRETED, JIT_COMPILED, ...)
    public final byte[] types;
    // line number / bci for each method
    public final int[] locations;
    // ...
}

Optimization: Aggregated Insertion
Instead of inserting each event’s stack trace string into the tree, we cache stackTraceId with its cumulative count, then perform a single insertion per unique stack trace, reducing insert operations from millions to tens of thousands.
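A sketch of this two‑pass aggregation; FlameGraphTree and its insert method are hypothetical stand‑ins for the real tree structure:

import java.util.HashMap;
import java.util.Map;

public class AggregatedInsertion {
    // First pass: accumulate sample counts keyed by stackTraceId instead of
    // touching the tree for every event.
    public static Map<Integer, Long> aggregate(int[] stackTraceIdsOfEvents) {
        Map<Integer, Long> countByTraceId = new HashMap<>();
        for (int stackTraceId : stackTraceIdsOfEvents) {
            countByTraceId.merge(stackTraceId, 1L, Long::sum);
        }
        return countByTraceId;
    }

    // Second pass: one tree insertion per unique stack trace, weighted by its count.
    public static void insertAll(Map<Integer, Long> countByTraceId, FlameGraphTree tree) {
        countByTraceId.forEach((stackTraceId, count) -> tree.insert(stackTraceId, count));
    }

    // Hypothetical interface standing in for the real flame-graph tree.
    interface FlameGraphTree {
        void insert(int stackTraceId, long sampleCount);
    }
}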
Optimization: Multi‑Chunk Processing
For very large JFR files we split the file into independent chunks, each processed separately. Since async‑profiler’s JfrReader did not support chunked reading, we contributed a Pull Request to add this capability.
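The pattern looks roughly like the sketch below; ChunkedJfrReader, nextChunk, and aggregateCurrentChunk are hypothetical names, since the actual API is whatever the contributed chunked‑reading support exposes:

import java.util.HashMap;
import java.util.Map;

public class MultiChunkAnalysis {
    // Each JFR chunk is self-contained (its own metadata and constant pools),
    // so chunks can be parsed and aggregated independently, then merged.
    public static Map<String, Long> analyze(ChunkedJfrReader reader) {
        Map<String, Long> mergedCounts = new HashMap<>();
        while (reader.nextChunk()) {
            Map<String, Long> chunkCounts = reader.aggregateCurrentChunk();
            chunkCounts.forEach((stack, count) -> mergedCounts.merge(stack, count, Long::sum));
        }
        return mergedCounts;
    }

    // Hypothetical reader interface used only to illustrate the processing pattern.
    interface ChunkedJfrReader {
        boolean nextChunk();
        Map<String, Long> aggregateCurrentChunk();
    }
}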
Client vs Server Analysis
Two analysis modes are supported:
Client analysis (HTML): simple scenarios, results rendered directly in the browser.
Server analysis (JFR): complex scenarios, files are uploaded to the backend, stored in Elasticsearch, and visualized via a flame‑graph component.
Server‑side processing reads JFR files, builds a tree structure, stores it in Elasticsearch, and serves it to the frontend for rendering.
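An illustrative node structure for such a tree; the field names and the Elasticsearch document schema here are assumptions, not the project's actual format:

import java.util.ArrayList;
import java.util.List;

public class FrameNode {
    public String frame;                 // method signature for this stack frame
    public long value;                   // total sample count under this frame
    public List<FrameNode> children = new ArrayList<>();

    public FrameNode(String frame) {
        this.frame = frame;
    }

    // Insert one aggregated stack trace (root frame first) with its sample count.
    public void insert(List<String> frames, long count) {
        value += count;
        if (frames.isEmpty()) {
            return;
        }
        String head = frames.get(0);
        FrameNode child = children.stream()
                .filter(c -> c.frame.equals(head))
                .findFirst()
                .orElseGet(() -> {
                    FrameNode node = new FrameNode(head);
                    children.add(node);
                    return node;
                });
        child.insert(frames.subList(1, frames.size()), count);
    }
}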