Resolving OOM Caused by InMemoryReporterMetrics in Spring Cloud Sleuth Zipkin Integration
This guide details how to diagnose and fix the OutOfMemoryError triggered by InMemoryReporterMetrics in a Spring Cloud Sleuth Zipkin setup, covering environment description, step‑by‑step investigation, heap dump analysis, and the final solution of upgrading the zipkin‑reporter dependency.
Problem description: An application using Spring Cloud Sleuth with zipkin‑reporter 2.7.3 experiences frequent OOM errors (GC overhead limit exceeded) and CPU usage reaching 300%, both traced to the InMemoryReporterMetrics component.
Environment: Spring Cloud F release train (Finchley), with the following dependency:
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>
Version: 2.0.0.RELEASE.
Investigation steps:
Check service health URL and process existence.
Inspect logs – discovered OOM errors.
Collect system metrics with top -H -p 11441 and identify threads consuming >90% CPU.
Save diagnostic data: load snapshot, thread list, TCP connections, jstack, jmap, jstat, and heap dump.
Analyze GC statistics (jstat -gcutil style output), which show the old generation 99.94% full and 3223 full GCs consuming 5313.139 s in total:
S0        S1        E         O         M         CCS       YGC       YGCT      FGC       FGCT      GCT
0.00      0.00      100.00    99.94     90.56     87.86     875       9.307     3223      5313.139  5322.446
Use Eclipse MAT to examine the heap dump; the dominant object is zipkin2.reporter.InMemoryReporterMetrics (retained size ~925 MB).
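As a rough cross-check on the jstat row in the steps above, the following Java sketch computes the average full-GC duration and the share of total GC time spent in full collections. The class and method names (GcUtilSample, parse, and so on) are hypothetical and exist only for this illustration; the input is the sample row quoted in this article, assumed to follow the column order S0 S1 E O M CCS YGC YGCT FGC FGCT GCT.

```java
import java.util.Locale;

/** Hypothetical helper that reads one data row of jstat -gcutil style output. */
public class GcUtilSample {
    public final double oldGenPercent;   // O column: old generation occupancy in percent
    public final long fullGcCount;       // FGC column: number of full GC events
    public final double fullGcSeconds;   // FGCT column: cumulative full GC time in seconds
    public final double totalGcSeconds;  // GCT column: cumulative total GC time in seconds

    public GcUtilSample(double oldGenPercent, long fullGcCount,
                        double fullGcSeconds, double totalGcSeconds) {
        this.oldGenPercent = oldGenPercent;
        this.fullGcCount = fullGcCount;
        this.fullGcSeconds = fullGcSeconds;
        this.totalGcSeconds = totalGcSeconds;
    }

    /** Parses a row laid out as: S0 S1 E O M CCS YGC YGCT FGC FGCT GCT. */
    public static GcUtilSample parse(String row) {
        String[] fields = row.trim().split("\\s+");
        if (fields.length != 11) {
            throw new IllegalArgumentException("expected 11 columns, got " + fields.length);
        }
        return new GcUtilSample(
                Double.parseDouble(fields[3]),
                Long.parseLong(fields[8]),
                Double.parseDouble(fields[9]),
                Double.parseDouble(fields[10]));
    }

    /** Average seconds per full GC, or 0 if none has run yet. */
    public double averageFullGcSeconds() {
        return fullGcCount == 0 ? 0.0 : fullGcSeconds / fullGcCount;
    }

    /** Fraction (0..1) of all GC time that was spent in full GCs. */
    public double fullGcShareOfGcTime() {
        return totalGcSeconds == 0.0 ? 0.0 : fullGcSeconds / totalGcSeconds;
    }

    public static void main(String[] args) {
        GcUtilSample s = parse(
                "0.00 0.00 100.00 99.94 90.56 87.86 875 9.307 3223 5313.139 5322.446");
        System.out.printf(Locale.ROOT,
                "old gen %.2f%% full, avg full GC %.2f s, full GC share of GC time %.1f%%%n",
                s.oldGenPercent, s.averageFullGcSeconds(), 100.0 * s.fullGcShareOfGcTime());
    }
}
```

For the row in this article, each full GC takes roughly 1.65 s on average and full collections account for over 99% of all GC time, which fits the "GC overhead limit exceeded" symptom: the JVM spends nearly all its time collecting an old generation that cannot be freed.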
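Root cause analysis: The older zipkin‑reporter‑2.7.3 jar contains a bug in InMemoryReporterMetrics, which keeps its counters in a ConcurrentHashMap<Throwable, AtomicLong>. Throwable uses identity-based equals and hashCode, so each distinct exception instance becomes its own key; while reporting keeps failing, the map (and the exceptions it retains) can grow without bound, which would explain the massive memory retention. Upgrading to zipkin‑reporter‑2.8.4 (or newer) fixes the issue.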
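To illustrate why a map keyed by Throwable instances can grow without bound, here is a simplified stand-in written for this article; it is not the zipkin-reporter source. It counts simulated failures once keyed by exception instance and once keyed by exception class. The class and method names are hypothetical, and keying by class is shown only as one plausible way to bound the map, an assumption about the general shape of a fix rather than a quote of the library code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical demo: how the key type affects the size of a failure-counter map. */
public class DropCounterDemo {

    /** Counts failures keyed by exception instance; returns the resulting map size. */
    public static int entriesWhenKeyedByInstance(int failures) {
        ConcurrentHashMap<Throwable, AtomicLong> dropped = new ConcurrentHashMap<>();
        for (int i = 0; i < failures; i++) {
            // Every failure produces a fresh exception object. Throwable does not
            // override equals/hashCode, so each instance is a brand-new key.
            Throwable cause = new IllegalStateException("simulated send failure");
            dropped.computeIfAbsent(cause, k -> new AtomicLong()).incrementAndGet();
        }
        return dropped.size();
    }

    /** Counts failures keyed by exception class; returns the resulting map size. */
    public static int entriesWhenKeyedByClass(int failures) {
        ConcurrentHashMap<Class<? extends Throwable>, AtomicLong> dropped =
                new ConcurrentHashMap<>();
        for (int i = 0; i < failures; i++) {
            Throwable cause = new IllegalStateException("simulated send failure");
            // All instances of the same exception type share one key and one counter.
            dropped.computeIfAbsent(cause.getClass(), k -> new AtomicLong()).incrementAndGet();
        }
        return dropped.size();
    }

    public static void main(String[] args) {
        int failures = 10_000;
        System.out.println("keyed by instance: " + entriesWhenKeyedByInstance(failures));
        System.out.println("keyed by class:    " + entriesWhenKeyedByClass(failures));
    }
}
```

With 10,000 simulated failures, the instance-keyed map ends up with 10,000 entries, each retaining an exception and its stack trace, while the class-keyed map holds a single entry. In a service that cannot reach its Zipkin collector for a long time, the first pattern is consistent with the ~925 MB retained size seen in MAT.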
Solution: Replace the old dependency with the newer version, for example:
<!-- zipkin dependency -->
<dependency>
    <groupId>io.zipkin.brave</groupId>
    <artifactId>brave</artifactId>
    <version>5.6.4</version>
</dependency>
Optionally add JVM flags to generate heap dumps on OOM:
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=path/filename.hprof
After upgrading and redeploying, the OOM and CPU spikes disappear.
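To confirm that the heap dump flags above actually took effect in a running process, one option is to read them back at startup. The sketch below assumes a HotSpot-based JVM, where com.sun.management.HotSpotDiagnosticMXBean is available; the class name HeapDumpFlagCheck and the helper vmOption are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical startup check that prints the heap dump related VM options. */
public class HeapDumpFlagCheck {

    /** Returns the current value of a HotSpot VM option as a string. */
    public static String vmOption(String name) {
        // Assumption: running on a HotSpot JVM that exposes this platform MXBean.
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

Running this with the two flags on the command line should report the first option as true and echo the configured path. Logging the values once at startup makes it easier to trust that a dump will exist the next time an OOM occurs.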
References: Original troubleshooting article (Jianshu) and related GitHub issues/PRs for zipkin‑reporter.
