Root Cause Analysis and Optimization of Long Young GC Times in gRPC/Netty Services

Long Young GC pauses in gRPC/Netty services were traced to Netty’s default thread-local cache, which creates many MpscArrayQueue objects. Disabling the cache with the JVM options -Dio.netty.allocator.useCacheForAllThreads=false and -Dio.grpc.netty.shaded.io.netty.allocator.useCacheForAllThreads=false reduced Young GC time from as much as 900 ms to about 100 ms and stabilized the service.

HelloTech

**Problem Scenario**

During each SOA deployment, a small number of errors occur, mainly on upstream services with short RPC timeout settings (e.g., 300 ms). The errors disappear once the deployment finishes. No new middleware version had been released, so a middleware change is unlikely to be the cause.

**Technology Stack**

The SOA framework uses gRPC for communication, and gRPC relies on Netty underneath.

**Investigation – GC Logs**

GC logs show that the 4th and 5th Young GC cycles take an unusually long time, reaching up to 900 ms in production. The same behavior is reproduced in the test environment, where the 4th Young GC also exceeds 500 ms.
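
GC logs are the primary evidence here, but the same trend can also be watched from inside the JVM. The following is a minimal sketch, assuming only the standard `java.lang.management` API; the class name `GcPauseProbe` and its helper methods are hypothetical. It reports the cumulative collection count and time per collector, and a helper averages the time spent in the collections that ran between two samples. The reported time is the JVM's approximate accumulated collection time, so it is a rough stand-in for the pause figures in the logs, not a replacement for them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcPauseProbe {

    /**
     * Average collection time (ms) for the collections that ran between two samples.
     * Returns 0 when no collection ran in the interval.
     */
    static double averagePauseSince(long prevCount, long prevTimeMs, long count, long timeMs) {
        long runs = count - prevCount;
        if (runs <= 0) {
            return 0.0;
        }
        return (double) (timeMs - prevTimeMs) / runs;
    }

    /** One line per collector with its cumulative count and time since JVM start. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " totalMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Print once; a real probe would sample periodically and diff the values.
        snapshot().forEach(System.out::println);
    }
}
```

Sampling shortly after startup and diffing consecutive samples with `averagePauseSince` would make an outlier such as the 4th or 5th Young GC stand out without parsing log files.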

**Investigation – Dump File**

The heap dump shows that MpscArrayQueue instances occupy a large portion of memory.

**Root Cause Analysis**

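Netty’s thread-local cache is enabled for all threads by default (the property io.netty.allocator.useCacheForAllThreads defaults to true). With this setting, Netty creates a PoolThreadCache for each thread that uses the allocator, and each PoolThreadCache in turn constructs many MpscArrayQueue objects. Across many threads these queues add up to a large amount of memory. Because a Young GC has to process the objects that are still live, this is consistent with the long Young GC pauses seen in the logs.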

**Solution**

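Disable the thread cache by adding the following JVM options. The first applies to a regular Netty dependency; the second applies to the copy of Netty that gRPC ships relocated under the io.grpc.netty.shaded package:

    -Dio.netty.allocator.useCacheForAllThreads=false
    -Dio.grpc.netty.shaded.io.netty.allocator.useCacheForAllThreads=false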

After applying the settings, Young GC time drops to around 100 ms, and the service meets QPS and resource requirements.
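
Because the options are easy to drop from a launch script, a startup check can confirm they are present. The following is a minimal sketch using only the JDK; the class `ThreadCacheFlagCheck` and its methods are hypothetical, and the check is deliberately strict (it accepts only the literal value `false`), so it is not a reimplementation of Netty's own property parsing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class ThreadCacheFlagCheck {

    // The two property names from the JVM options above.
    static final String[] KEYS = {
        "io.netty.allocator.useCacheForAllThreads",
        "io.grpc.netty.shaded.io.netty.allocator.useCacheForAllThreads"
    };

    /** True only if the property is explicitly set to "false" (trimmed, case-insensitive). */
    static boolean isDisabled(Properties props, String key) {
        String value = props.getProperty(key);
        return value != null && "false".equalsIgnoreCase(value.trim());
    }

    /** Returns the property names for which the thread cache has not been disabled. */
    static List<String> notDisabled(Properties props) {
        List<String> result = new ArrayList<>();
        for (String key : KEYS) {
            if (!isDisabled(props, key)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String key : notDisabled(System.getProperties())) {
            System.err.println("Thread cache not disabled: add -D" + key + "=false");
        }
    }
}
```

Running the check at the very start of `main` keeps it ahead of any Netty class loading, which matters because the flag is read once into a static field.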

**Source Code Insight**

Netty’s source shows the definition of the cache flag and the construction of PoolThreadCache:

    private static final boolean DEFAULT_USE_CACHE_FOR_ALL_THREADS;

    // Assigned later, in the class's static initializer:
    DEFAULT_USE_CACHE_FOR_ALL_THREADS = SystemPropertyUtil.getBoolean(
            "io.netty.allocator.useCacheForAllThreads", true);

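Further down, the initialValue() method builds the per-thread cache. It first picks the least-used heap and direct arenas. If the flag is true, or if the current thread is a FastThreadLocalThread, it creates a PoolThreadCache with the configured cache sizes. Otherwise it creates a minimal cache with all sizes set to 0, which avoids the heavy MpscArrayQueue structures. With the flag set to false, therefore, only FastThreadLocalThread instances (typically Netty’s own event-loop threads) keep a full cache.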

    @Override
    protected synchronized PoolThreadCache initialValue() {
        final PoolArena<byte[]> heapArena = leastUsedArena(heapArenas);
        final PoolArena<ByteBuffer> directArena = leastUsedArena(directArenas);
        final Thread current = Thread.currentThread();
        // Thread cache switch
        if (useCacheForAllThreads || current instanceof FastThreadLocalThread) {
            final PoolThreadCache cache = new PoolThreadCache(
                    heapArena, directArena, tinyCacheSize, smallCacheSize, normalCacheSize,
                    DEFAULT_MAX_CACHED_BUFFER_CAPACITY, DEFAULT_CACHE_TRIM_INTERVAL);
            // schedule trim task ...
            return cache;
        }
        // No caching so just use 0 as sizes.
        return new PoolThreadCache(heapArena, directArena, 0, 0, 0, 0, 0);
    }
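
The branch in initialValue() can be illustrated with a toy model that has no Netty dependency. Everything in this sketch is hypothetical: `FastThread` stands in for FastThreadLocalThread, `ToyCache` stands in for PoolThreadCache, and the cache sizes are placeholder numbers, not values taken from Netty. It only shows which threads end up with a full cache for each flag setting.

```java
public class ThreadCacheSwitchDemo {

    /** Stand-in for Netty's FastThreadLocalThread (hypothetical). */
    static class FastThread extends Thread {
        FastThread(Runnable task) {
            super(task);
        }
    }

    /** Stand-in for PoolThreadCache; holds only the three cache sizes. */
    static final class ToyCache {
        final int tiny;
        final int small;
        final int normal;

        ToyCache(int tiny, int small, int normal) {
            this.tiny = tiny;
            this.small = small;
            this.normal = normal;
        }

        boolean isEmpty() {
            return tiny == 0 && small == 0 && normal == 0;
        }
    }

    /** Mirrors the shape of the decision in initialValue(): full cache or zero-sized cache. */
    static ToyCache cacheFor(Thread current, boolean useCacheForAllThreads) {
        if (useCacheForAllThreads || current instanceof FastThread) {
            return new ToyCache(8, 4, 2); // placeholder sizes
        }
        return new ToyCache(0, 0, 0); // no caching
    }

    public static void main(String[] args) {
        Thread plain = new Thread(() -> { });
        Thread fast = new FastThread(() -> { });
        for (boolean flag : new boolean[] {true, false}) {
            System.out.println("flag=" + flag
                    + " plainThreadCached=" + !cacheFor(plain, flag).isEmpty()
                    + " fastThreadCached=" + !cacheFor(fast, flag).isEmpty());
        }
    }
}
```

In this model, setting the flag to false removes the cache only from plain threads, which matches the condition `useCacheForAllThreads || current instanceof FastThreadLocalThread` in the excerpt above.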

**Conclusion**

Disabling Netty’s thread‑local cache removes the large MpscArrayQueue allocations, significantly shortening Young GC pauses and stabilizing the service.
