Resolving OOM Caused by InMemoryReporterMetrics in Spring Cloud Sleuth Zipkin Integration
This guide details how to diagnose and fix the OutOfMemoryError triggered by InMemoryReporterMetrics in a Spring Cloud Sleuth Zipkin setup, covering environment description, step‑by‑step investigation, heap dump analysis, and the final solution of upgrading the zipkin‑reporter dependency.
Problem description : An application using Spring Cloud Sleuth with Zipkin‑reporter 2.7.3 experiences frequent OOM (GC overhead limit exceeded) and CPU usage reaching 300% due to the InMemoryReporterMetrics component.
Environment : Spring Cloud version F, with the following dependency at version 2.0.0.RELEASE:
<dependency>
  <groupId>org.springframework.cloud</groupId>
  <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>
Investigation steps :
Check service health URL and process existence.
Inspect logs – discovered OOM errors.
Collect system metrics with top -H -p 11441 and identify threads consuming >90% CPU.
Save diagnostic data: load snapshot, thread list, TCP connections, jstack, jmap, jstat, and heap dump.
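Analyze the GC statistics (the columns match jstat -gcutil output). They show the old generation 99.94% full and 3223 full GCs taking about 5313 seconds in total, against only 875 young GCs:
S0     S1     E       O      M      CCS    YGC  YGCT   FGC   FGCT      GCT
0.00   0.00   100.00  99.94  90.56  87.86  875  9.307  3223  5313.139  5322.446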
Use Eclipse MAT to examine heap dump; large object identified as zipkin2.reporter.InMemoryReporterMetrics (retained size ~925 MB).
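The GC figures from the investigation steps can be turned into two quick indicators: the average pause per full GC and the share of total GC time spent in full GCs. The following is a minimal sketch, assuming the sample is a single jstat -gcutil style header line plus one value line; the class and method names (GcUtilSummary, parse, avgFullGcSeconds, fullGcShare) are hypothetical helpers, not part of any tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class GcUtilSummary {
    /** Pairs each column name from the header line with the number below it. */
    static Map<String, Double> parse(String header, String values) {
        String[] names = header.trim().split("\\s+");
        String[] nums = values.trim().split("\\s+");
        if (names.length != nums.length) {
            throw new IllegalArgumentException("header and value column counts differ");
        }
        Map<String, Double> row = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            row.put(names[i], Double.parseDouble(nums[i]));
        }
        return row;
    }

    /** Average wall-clock seconds per full GC (FGCT / FGC). */
    static double avgFullGcSeconds(Map<String, Double> row) {
        return row.get("FGCT") / row.get("FGC");
    }

    /** Fraction of all GC time that was spent in full GCs (FGCT / GCT). */
    static double fullGcShare(Map<String, Double> row) {
        return row.get("FGCT") / row.get("GCT");
    }

    public static void main(String[] args) {
        // Sample taken from the investigation above.
        Map<String, Double> row = parse(
            "S0 S1 E O M CCS YGC YGCT FGC FGCT GCT",
            "0.00 0.00 100.00 99.94 90.56 87.86 875 9.307 3223 5313.139 5322.446");
        System.out.printf("old gen used: %.2f%%%n", row.get("O"));
        System.out.printf("avg full GC: %.3f s%n", avgFullGcSeconds(row));
        System.out.printf("full GC share of GC time: %.1f%%%n", 100 * fullGcShare(row));
    }
}
```

For the sample above this works out to roughly 1.6 seconds per full GC, with almost all GC time spent in full collections, which is consistent with the "GC overhead limit exceeded" error.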
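Root cause analysis : The older zipkin‑reporter‑2.7.3 jar contains a bug: InMemoryReporterMetrics counts dropped messages in a ConcurrentHashMap<Throwable, AtomicLong>. Throwable does not override equals or hashCode, so every exception instance becomes a new key. When span reporting keeps failing, the map grows without bound and retains every exception along with its stack trace, which matches the ~925 MB retained size seen in MAT. Upgrading to zipkin‑reporter‑2.8.4 (or newer) fixes the issue.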
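The retention pattern can be reproduced in isolation. The sketch below is not the library's code; it only contrasts a counter map keyed by exception instance with one keyed by exception class, which is one assumed way to keep the map bounded. DroppedMessageCounters and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class DroppedMessageCounters {
    /** Keyed by exception instance: each failure adds a new entry. */
    static int entriesWhenKeyedByInstance(int failures) {
        Map<Throwable, AtomicLong> dropped = new ConcurrentHashMap<>();
        for (int i = 0; i < failures; i++) {
            // A fresh exception per failure; Throwable uses identity equality.
            Throwable cause = new IllegalStateException("send failed");
            dropped.computeIfAbsent(cause, k -> new AtomicLong()).incrementAndGet();
        }
        return dropped.size();
    }

    /** Keyed by exception class: one entry per exception type. */
    static int entriesWhenKeyedByClass(int failures) {
        Map<Class<? extends Throwable>, AtomicLong> dropped = new ConcurrentHashMap<>();
        for (int i = 0; i < failures; i++) {
            Throwable cause = new IllegalStateException("send failed");
            dropped.computeIfAbsent(cause.getClass(), k -> new AtomicLong()).incrementAndGet();
        }
        return dropped.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesWhenKeyedByInstance(10_000)); // prints 10000
        System.out.println(entriesWhenKeyedByClass(10_000));    // prints 1
    }
}
```

With instance keys the map size tracks the number of failures, and each key pins an exception object in memory; with class keys the size stays at the number of distinct exception types.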
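Solution : Make sure a fixed zipkin‑reporter version ends up on the classpath. The fix used here declares a newer Brave explicitly, which is expected to bring a newer zipkin‑reporter with it transitively (check the resolved version with mvn dependency:tree), for example: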
<!-- zipkin dependency -->
<dependency>
  <groupId>io.zipkin.brave</groupId>
  <artifactId>brave</artifactId>
  <version>5.6.4</version>
</dependency>
Optionally add JVM flags to generate heap dumps on OOM:
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=path/filename.hprof
After upgrading and redeploying, the OOM errors and CPU spikes disappear.
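To confirm that the heap-dump flags actually reached the running JVM, the values can be read back at runtime. This is a minimal sketch, assuming a HotSpot-based JVM where com.sun.management.HotSpotDiagnosticMXBean is available; HeapDumpFlagCheck and vmOption are hypothetical names.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class HeapDumpFlagCheck {
    /** Returns the current value of a HotSpot VM option as a string. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Throws IllegalArgumentException if the JVM does not know the option.
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = " + vmOption("HeapDumpOnOutOfMemoryError"));
        // An empty value means no explicit dump path was configured.
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

Running it with the same options as the service, for example java -XX:+HeapDumpOnOutOfMemoryError HeapDumpFlagCheck.java, should report the first option as true.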
References : Original troubleshooting article (Jianshu) and related GitHub issues/PRs for zipkin‑reporter.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java