Why Is Async Log4j2 Logging So Slow? A Deep Dive into Disruptor and JNI Overheads

The article investigates a severe performance bottleneck in a Java service caused by massive async Log4j2 logging, analyzes the Disruptor‑based async logger, explores JNI stack‑trace overhead, reproduces the issue with benchmarks, and provides practical recommendations to eliminate the slowdown.

Architect

Problem Background

The system imports a mapping of external quality‑inspection items to internal ones. When the mapping contains more than 15,000 entries the template conversion interface takes about 4 seconds; reducing the mapping to roughly 100 entries drops the latency to ~100 ms, indicating a strong correlation between mapping size and response time.

Problem Verification

In a controlled test environment the same latency pattern was reproduced: 15,000+ items → ~4 s, 100 items → ~100 ms.

Initial Diagnosis

Tracing the request with the Alibaba Arthas trace command showed that most of the time was spent in log printing. The mapping process emits a large number of log statements, and the volume grows with the number of mapping entries.

Preliminary Thoughts

Is the logger configured for synchronous printing?

Log4j2 configuration shows that asynchronous logging is enabled.

Could multithreaded resource contention be the cause?

A single‑thread test that logged 20,000 messages took over 600 ms, while a thread‑pool version of the same test completed in 2–30 ms. Adding threads made the test faster rather than slower, so contention is not the primary bottleneck.

Asynchronous Logging Mechanism

Disruptor Overview

Log4j2’s async logger is built on the LMAX Disruptor framework, which uses a ring buffer where producers publish events and consumers process them.

public synchronized void start() {
    // ...
    disruptor.handleEventsWith(handlers);
    // ...
}
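
To make the producer/consumer relationship concrete, here is a deliberately simplified single‑producer, single‑consumer ring buffer in plain Java. It is not the LMAX Disruptor API: the class TinyRingBuffer and its methods are hypothetical names chosen for illustration, and the sketch leaves out the Disruptor's batching, wait strategies, and cache‑line padding.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Simplified single-producer/single-consumer ring buffer (illustration only, not the Disruptor API). */
final class TinyRingBuffer<E> {
    private final Object[] slots;
    private final int mask;
    private final AtomicLong published = new AtomicLong(0); // number of events written so far
    private final AtomicLong consumed = new AtomicLong(0);  // number of events read so far

    TinyRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new Object[capacity];
        mask = capacity - 1; // lets us wrap around with a bit mask instead of a modulo
    }

    /** Producer side: returns false instead of blocking when every slot is still unread. */
    boolean tryPublish(E event) {
        long next = published.get();
        if (next - consumed.get() == slots.length) {
            return false; // buffer full
        }
        slots[(int) (next & mask)] = event;
        published.set(next + 1); // volatile write: makes the slot visible to the consumer
        return true;
    }

    /** Consumer side: returns null when there is nothing to read. */
    @SuppressWarnings("unchecked")
    E poll() {
        long next = consumed.get();
        if (next == published.get()) {
            return null; // buffer empty
        }
        int index = (int) (next & mask);
        E event = (E) slots[index];
        slots[index] = null;     // release the reference so it can be collected
        consumed.set(next + 1);  // volatile write: frees the slot for the producer
        return event;
    }
}
```

The point of the sketch is that publishing is only a slot write plus a counter update, so an enqueue that takes hundreds of milliseconds in aggregate suggests the cost lies in preparing the event rather than in the buffer itself.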

Enqueueing Log Events

The core method logToAsyncDelegate attempts to enqueue a LogEvent into the ring buffer. If the buffer is full, handleQueueFull applies one of three strategies: ENQUEUE (wait), SYNCHRONOUS (log on the caller thread), or DISCARD.

if (!delegate.tryEnqueue(event, this)) {
    // queue full handling
    handleQueueFull(event);
}
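
The three queue‑full strategies can be sketched as follows. This is an assumption‑laden illustration rather than Log4j2's code: QueueFullDemo, its Route enum, and its fields are hypothetical, an ArrayBlockingQueue stands in for the ring buffer, and the sketch assumes a single caller thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in showing what each queue-full strategy does to the caller. */
final class QueueFullDemo {
    enum Route { ENQUEUE, SYNCHRONOUS, DISCARD }

    private final BlockingQueue<String> queue;
    private final Route whenFull;
    final List<String> writtenOnCallerThread = new ArrayList<>(); // messages logged synchronously
    int discarded;                                                // messages dropped

    QueueFullDemo(int capacity, Route whenFull) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.whenFull = whenFull;
    }

    void log(String message) {
        if (queue.offer(message)) {
            return; // fast path: there was room in the queue
        }
        switch (whenFull) {
            case ENQUEUE:
                try {
                    queue.put(message); // caller blocks until the consumer frees a slot
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                break;
            case SYNCHRONOUS:
                writtenOnCallerThread.add(message); // caller pays the full cost of writing
                break;
            case DISCARD:
                discarded++; // message is lost, caller continues immediately
                break;
        }
    }

    int queued() {
        return queue.size();
    }
}
```

Only ENQUEUE and SYNCHRONOUS slow the calling thread down, and only once the buffer is actually full, which is worth ruling out before looking elsewhere for the bottleneck.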

Ring Buffer Publish Logic

public boolean tryEnqueue(final LogEvent event, final AsyncLoggerConfig asyncLoggerConfig) {
    final LogEvent logEvent = prepareEvent(event);
    return disruptor.getRingBuffer()
        .tryPublishEvent(translator, logEvent, asyncLoggerConfig);
}

public <A,B> boolean tryPublishEvent(EventTranslatorTwoArg<E,A,B> translator, A arg0, B arg1) {
    try {
        final long sequence = sequencer.tryNext();
        translateAndPublish(translator, sequence, arg0, arg1);
        return true;
    } catch (InsufficientCapacityException e) {
        return false;
    }
}

Root Cause: Location Lookup

Annotating the call path line by line revealed that the getLocation method is the performance hotspot. It resolves the class, method, file, and line number of the caller for each log entry.

private StackTraceElement getLocation(String fqcn) {
    return requiresLocation() ? StackLocatorUtil.calcLocation(fqcn) : null;
}

calcLocation builds a full stack trace via native calls (Throwable.getStackTrace), which involves JNI and array traversal, incurring significant overhead.
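
The general technique behind such a lookup can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Log4j2's implementation: LocationSketch, FakeLogger, and businessMethod are hypothetical names, and the frame‑matching rule is simplified to "first frame after the logger's own frames".

```java
/** Hypothetical sketch of locating the caller of a logger by walking a captured stack trace. */
final class LocationSketch {

    /** Returns the first frame that follows the frames of loggerClassName, or null if none is found. */
    static StackTraceElement calcLocation(String loggerClassName) {
        // Capturing the trace materializes every frame on the stack, not just the one we need.
        StackTraceElement[] stack = new Throwable().getStackTrace();
        boolean seenLogger = false;
        for (StackTraceElement frame : stack) {
            boolean isLogger = frame.getClassName().equals(loggerClassName);
            if (seenLogger && !isLogger) {
                return frame; // the code that called the logger
            }
            seenLogger |= isLogger;
        }
        return null;
    }

    /** Stand-in for a logging facade; it only computes the location of its caller. */
    static final class FakeLogger {
        static StackTraceElement info(String message) {
            return calcLocation(FakeLogger.class.getName());
        }
    }

    static StackTraceElement businessMethod() {
        return FakeLogger.info("hello");
    }

    public static void main(String[] args) {
        StackTraceElement location = businessMethod();
        System.out.println(location.getMethodName() + " at line " + location.getLineNumber());
    }
}
```

Note that the cost grows with stack depth: a log call made deep inside a framework pays for every frame above it, even though only one frame ends up in the log line.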

Performance Impact

A micro‑benchmark that replaced the stack‑trace extraction with a handcrafted array reduced logging time for 20,000 messages from ~600 ms to ~50 ms. The Log4j2 documentation likewise notes that, for asynchronous loggers, logging with location information is 30–100 times slower than logging without it.
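
A rough way to observe the difference without Log4j2 on the classpath is to time the stack capture in isolation. The sketch below is an assumption, not the benchmark used above: StackCaptureCost and its members are hypothetical, the timings depend on the JVM and stack depth, and a harness such as JMH would be needed for trustworthy numbers.

```java
/** Hypothetical timing sketch: capturing a stack trace versus reusing a prebuilt array. */
final class StackCaptureCost {
    // Prebuilt stand-in for the "handcrafted array" idea: no stack walk is needed to obtain it.
    private static final StackTraceElement[] HANDCRAFTED = {
        new StackTraceElement("com.example.Caller", "run", "Caller.java", 42)
    };

    static long sink; // accumulates results so the JIT cannot discard the loop bodies

    /** Runs the loop and returns the elapsed time in nanoseconds. */
    static long timeNanos(int iterations, boolean captureStack) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            StackTraceElement[] frames = captureStack
                    ? new Throwable().getStackTrace() // walks and copies the whole stack
                    : HANDCRAFTED;                    // constant-time lookup
            sink += frames.length;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 20_000;
        // Warm up both paths so the measured runs are not dominated by interpretation.
        timeNanos(n, true);
        timeNanos(n, false);
        long withCapture = timeNanos(n, true);
        long withoutCapture = timeNanos(n, false);
        System.out.printf("with capture:    %d us%n", withCapture / 1_000);
        System.out.printf("without capture: %d us%n", withoutCapture / 1_000);
    }
}
```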

JNI Details (Relevant for Stack Trace)

Stack trace generation relies on native methods (getStackTraceDepth, getStackTraceElement), so every lookup crosses from Java into the JVM's C++ code and back. These transitions contribute to the observed latency.

Mitigation

Remove unnecessary location pattern tokens (%C, %F, %l, %L, %M) from the log format if location information is not required.

Set production logging level to INFO and use DEBUG only for development diagnostics.

Avoid excessive logging in performance‑critical code paths.
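
To audit existing configurations for the first recommendation, a small check along these lines can flag patterns that request location information. It is a sketch under assumptions: LocationTokenCheck is a hypothetical helper, the regular expression only approximates how Log4j2 parses patterns, it covers the short tokens listed above plus their long forms (%class, %file, %location, %line, %method), and it does not handle escaped percent signs (%%).

```java
import java.util.regex.Pattern;

/** Hypothetical helper that flags layout patterns containing location tokens. */
final class LocationTokenCheck {
    // '%', optional format modifiers such as -5 or .30, then a location converter name
    // that is not followed by more letters (so %level and %logger are not mistaken for %l).
    private static final Pattern LOCATION_TOKEN = Pattern.compile(
            "%[-\\d.]*(?:C|class|F|file|l|location|L|line|M|method)(?![A-Za-z])");

    static boolean requiresLocation(String layoutPattern) {
        return LOCATION_TOKEN.matcher(layoutPattern).find();
    }

    public static void main(String[] args) {
        String cheap = "%d %-5level [%t] %c{1} - %msg%n";
        String costly = "%d %-5level %C.%M:%L - %msg%n";
        System.out.println(requiresLocation(cheap));
        System.out.println(requiresLocation(costly));
    }
}
```

Matching is case‑sensitive on purpose: %c (logger name) and %m (message) are cheap, while %C and %M trigger the stack walk.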

Further Exploration (C‑level Implementation)

Relevant OpenJDK source files for deeper investigation:

https://github.com/openjdk/jdk8u/blob/master/jdk/src/share/javavm/export/jvm.h

jdk/src/share/native/java/lang/Throwable.c
hotspot/src/share/vm/prims/jvm.cpp
hotspot/src/share/vm/classfile/javaClasses.cpp

References

Disruptor concurrency framework introduction and analysis (https://blog.csdn.net/q5926167/article/details/129798235)

Log4j2 async logging deep dive (https://blog.csdn.net/weixin_30861797/article/details/95569540)

