Why Does a Java App OOM Despite a 4 GB Heap? Uncovering Netty’s Hidden Native Memory Use
This article documents a step‑by‑step investigation of a Java 11 application that suffered OS‑level OOM despite a 4 GB heap, revealing that multiple Netty PooledByteBufAllocator instances loaded by different class loaders consumed far more native memory than expected.
It is written up as a reference for anyone who runs into a similar problem.
Background
A core application was occasionally OOM-killed by the OS. Monitoring showed that the container had 8 GB of memory and that the application ran on Java 11 with G1 GC, a 4 GB heap, and MaxDirectMemorySize set to 1 GB.
Java dumps showed heap usage under 3 GB, while the OS reported a much larger RSS, which pointed to excessive off-heap memory consumption.
Problem Conclusion
Several ClassLoaders in the middleware each loaded their own copy of io.netty.buffer.PooledByteBufAllocator. Every copy was allowed its own 1 GB quota, so the combined off-heap usage of the process could exceed the intended 1 GB limit.
Arthas revealed seven distinct instances of this class, with the rocketmq-client instance alone consuming nearly 1 GB.
Detailed Analysis
Each ClassLoader loads its own PooledByteBufAllocator, and each copy limits off-heap memory with its own counter, typically sized from MaxDirectMemorySize. Because the counters are independent, no single copy sees the total, and the sum of all copies can grow well past the 1 GB limit.
The ClassLoaders involved in this application were: sentinel's ModuleClassLoader, rocketmq-client's ModuleClassLoader, tair-plugin's ModuleClassLoader, hsf's ModuleClassLoader, XbootModuleClassLoader, pandora-qos-service's ModuleClassLoader, and ele-enhancer's ModuleClassLoader.
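The effect of independent counters can be illustrated with a toy model. The sketch below does not use Netty or real class loaders; the Quota class simply stands in for the per-copy counter, and the figure of 512 MB per copy is a hypothetical value chosen for illustration. It shows that every copy stays under its own 1 GB limit while the process-wide total does not.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model (not Netty): one quota counter per class loader, all enforcing the same limit. */
final class PerLoaderQuotaModel {
    /** 1 GB, mirroring the MaxDirectMemorySize value mentioned in the article. */
    static final long LIMIT_BYTES = 1L << 30;

    /** Stands in for the static counter owned by each separately loaded copy of a class. */
    static final class Quota {
        private final AtomicLong used = new AtomicLong();

        /** Reserves bytes only if this counter alone stays within the limit. */
        boolean tryReserve(long bytes) {
            while (true) {
                long current = used.get();
                long next = current + bytes;
                if (next > LIMIT_BYTES) {
                    return false; // this copy's own limit would be exceeded
                }
                if (used.compareAndSet(current, next)) {
                    return true;
                }
            }
        }

        long used() {
            return used.get();
        }
    }

    /** Creates one quota per loader, reserves bytesEach in each, and returns the combined total. */
    static long totalAfterReserving(int loaders, long bytesEach) {
        long total = 0;
        for (int i = 0; i < loaders; i++) {
            Quota quota = new Quota(); // a fresh counter, as a new class loader would get
            if (quota.tryReserve(bytesEach)) {
                total += quota.used();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 7 copies, 512 MB each. Every reservation succeeds.
        long total = totalAfterReserving(7, 512L << 20);
        System.out.println("combined bytes reserved: " + total);
        System.out.println("exceeds single limit: " + (total > LIMIT_BYTES));
    }
}
```

In this model no individual reservation is ever rejected, which is why the problem does not surface as an error inside the JVM and only shows up in the process RSS.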
In Java 8 and Java 11, when the JVM option -Dio.netty.tryReflectionSetAccessible=true is enabled, Netty allocates off-heap memory directly via UNSAFE.allocateMemory, bypassing Java’s DirectMemory API. As a result, the usage is invisible to standard direct-memory monitoring and is not counted against the JVM's own enforcement of MaxDirectMemorySize; only Netty's per-ClassLoader counter limits it.
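For comparison, this is roughly what standard monitoring looks at. The sketch below reads the JVM's "direct" buffer pool through the standard BufferPoolMXBean; the class name DirectPoolProbe is made up for this example. Buffers created with ByteBuffer.allocateDirect appear in this pool, whereas, per the article, memory that Netty obtains through UNSAFE.allocateMemory does not, so a low reading here does not rule out heavy off-heap use.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

/** Reads the JVM-tracked direct buffer pool, i.e. the view that standard monitoring has. */
final class DirectPoolProbe {
    /** Returns bytes used by the "direct" buffer pool, or -1 if the pool is not exposed. */
    static long directPoolUsedBytes() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directPoolUsedBytes();
        // Allocated through the regular NIO API, so the JVM accounts for it.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
        long after = directPoolUsedBytes();
        System.out.println("direct pool before: " + before + " bytes");
        System.out.println("direct pool after:  " + after + " bytes");
        System.out.println("buffer capacity:    " + buffer.capacity() + " bytes");
    }
}
```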
Investigation Process
1.1 Using NativeMemoryTracking
Adding -XX:NativeMemoryTracking=detail to the JVM parameters enables Native Memory Tracking (NMT), after which detailed native memory usage can be printed with jcmd <pid> VM.native_memory. The "Other" category (off-heap) grew beyond the expected 1 GB, eventually reaching ~1.5 GB.
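To follow one category over time, the summary output can be parsed and the committed size recorded at intervals. The sketch below is a minimal parser, assuming the summary lines have the shape "- Category (reserved=NKB, committed=NKB)" as printed by JDK 11 at the default KB scale. The class name NmtSummaryParser and the sample numbers in main are illustrative only and are not taken from the investigation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts per-category committed sizes from an NMT summary (layout assumed, see lead-in). */
final class NmtSummaryParser {
    // Assumed line shape: "-     Other (reserved=123KB, committed=123KB)"
    private static final Pattern CATEGORY_LINE = Pattern.compile(
            "^-\\s+(.+?) \\(reserved=(\\d+)KB, committed=(\\d+)KB\\)\\s*$",
            Pattern.MULTILINE);

    /** Returns the committed size in KB for the given category, or -1 if it is not present. */
    static long committedKb(String nmtSummary, String category) {
        Matcher matcher = CATEGORY_LINE.matcher(nmtSummary);
        while (matcher.find()) {
            if (matcher.group(1).equals(category)) {
                return Long.parseLong(matcher.group(3));
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Made-up sample text in the assumed layout; real input would come from
        // "jcmd <pid> VM.native_memory summary".
        String sample =
                "-                 Java Heap (reserved=4194304KB, committed=4194304KB)\n"
              + "-                     Other (reserved=1572864KB, committed=1572864KB)\n";
        System.out.println("Other committed KB: " + committedKb(sample, "Other"));
    }
}
```

Sampling this value periodically makes it easy to see whether "Other" levels off near the configured limit or keeps climbing past it.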
1.2 Native Memory Leak?
The system used jemalloc as the native allocator. Adjusting its parameters with export MALLOC_CONF="dirty_decay_ms:0,muzzy_decay_ms:0" reduced memory slightly but did not solve the problem.
1.3 Impact of Reducing Off‑Heap Memory
Lowering MaxDirectMemorySize to 512 MB caused MQ message consumption delays, which indicated that the RocketMQ client alone needed more than 512 MB of off-heap memory.
1.4 Measuring Netty’s Memory Usage
Netty’s static variable io.netty.buffer.PooledByteBufAllocator#DEFAULT can be inspected with Arthas. Multiple allocators from different ClassLoaders were found, and their individual memory consumptions summed to over 1 GB, matching the "Other" category observed via NMT.
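The same measurement can be sketched in plain Java with reflection, which avoids a compile-time dependency on any particular Netty copy. This is an illustration rather than the procedure used in the investigation, which relied on Arthas. It assumes a Netty 4.1 version in which PooledByteBufAllocator exposes metric() and the returned metric exposes usedDirectMemory(); the class name NettyAllocatorUsage is hypothetical, and obtaining the list of ClassLoaders is left to the caller.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.List;

/** Reads PooledByteBufAllocator.DEFAULT usage per class loader via reflection (sketch). */
final class NettyAllocatorUsage {
    private static final String ALLOCATOR_CLASS = "io.netty.buffer.PooledByteBufAllocator";

    /**
     * Returns the direct memory used by the DEFAULT allocator visible to the given loader,
     * or -1 if that loader cannot see Netty or the expected members are missing.
     * Note: reading DEFAULT initializes the class in that loader if it was not yet in use.
     */
    static long usedDirectBytes(ClassLoader loader) {
        try {
            Class<?> allocatorClass = Class.forName(ALLOCATOR_CLASS, false, loader);
            Field defaultField = allocatorClass.getField("DEFAULT");
            Object allocator = defaultField.get(null);
            Object metric = allocatorClass.getMethod("metric").invoke(allocator);
            Method usedDirect = metric.getClass().getMethod("usedDirectMemory");
            return (Long) usedDirect.invoke(metric);
        } catch (ReflectiveOperationException | LinkageError e) {
            return -1L;
        }
    }

    /** Sums usage across loaders, skipping those where Netty could not be read. */
    static long totalUsedDirectBytes(List<ClassLoader> loaders) {
        long total = 0;
        for (ClassLoader loader : loaders) {
            long used = usedDirectBytes(loader);
            if (used > 0) {
                total += used;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println("used direct bytes: " + usedDirectBytes(loader));
    }
}
```

Comparing the total across loaders with the NMT "Other" figure is the cross-check described above.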
1.5 Recommendations for the Business
The immediate fix is to reduce the Java heap size, as heap usage is low, and to keep the off‑heap limit higher until middleware can be tuned. Long‑term, the middleware team should revisit Netty’s memory allocation strategy.
Afterword
When troubleshooting excessive Java off‑heap memory, start by checking Netty’s allocation.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
