How a Tiny HashMap Bug Triggered a Massive Memory Leak in a High‑Traffic Microservice
A senior architect introduced a high‑concurrency monitoring feature that used a ConcurrentHashMap without proper equals/hashCode implementations, leading to duplicate keys, race conditions, and severe memory leaks, which were later resolved by correcting the key class and applying atomic map operations.
A new architect with extensive high‑concurrency experience joined the team and was tasked with the "highest concurrency" requirement: collecting the average response time and total request count for every API via an AOP interceptor.
The implementation stored monitoring data in a ConcurrentHashMap without any comments, assuming the simple code needed no further documentation.
After deployment, the service began running out of memory. Using Eclipse MAT and jmap, the team discovered millions of MonitorKey and MonitorValue objects lingering in the heap, for example:
Monitor$MonitorKey@15aeb7ab
The root cause was that MonitorKey did not override equals and hashCode. The inherited defaults compare object identity, so two keys built from the same url and desc were treated as distinct, and each new MonitorKey instance added another map entry, causing unbounded growth.
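The effect of the missing overrides can be reproduced in isolation. The following is a minimal sketch, not the article's code: BrokenKey, FixedKey, the sample URL, and the use of Map.merge as a stand-in counter are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class KeyEqualityDemo {
    // Hypothetical key WITHOUT equals/hashCode: falls back to identity comparison.
    static final class BrokenKey {
        final String url;
        final String desc;
        BrokenKey(String url, String desc) { this.url = url; this.desc = desc; }
    }

    // Hypothetical key WITH value-based equals/hashCode.
    static final class FixedKey {
        final String url;
        final String desc;
        FixedKey(String url, String desc) { this.url = url; this.desc = desc; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof FixedKey)) return false;
            FixedKey other = (FixedKey) o;
            return Objects.equals(url, other.url) && Objects.equals(desc, other.desc);
        }

        @Override
        public int hashCode() { return Objects.hash(url, desc); }
    }

    // Simulates `requests` calls to one API and returns the resulting map size.
    static int entriesWithBrokenKey(int requests) {
        Map<BrokenKey, Long> map = new ConcurrentHashMap<>();
        for (int i = 0; i < requests; i++) {
            map.merge(new BrokenKey("/api/orders", "list orders"), 1L, Long::sum);
        }
        return map.size();
    }

    static int entriesWithFixedKey(int requests) {
        Map<FixedKey, Long> map = new ConcurrentHashMap<>();
        for (int i = 0; i < requests; i++) {
            map.merge(new FixedKey("/api/orders", "list orders"), 1L, Long::sum);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesWithBrokenKey(1000)); // one entry per request
        System.out.println(entriesWithFixedKey(1000));  // a single shared entry
    }
}
```

With the identity-based key the map grows by one entry per call, which is the growth pattern seen in the heap dump; with value-based equality all calls for the same API land on one entry.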
In addition, the visit method performed a non‑atomic sequence: look up the value for the key, check for null, create a new value, and put it into the map. Concurrent threads could interleave these steps, leading to lost updates:
Thread1: get value for key a → null → create b → put a=b
Thread2: get value for key a → null → create c → put a=c
Thread2's put overwrites b, so anything Thread1 recorded in b is lost.
Two remediation approaches were discussed:
Adding synchronized to the method:
public synchronized void visit(String url, String desc, long timeCost) { ... }
Using putIfAbsent with atomic counters:
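MonitorKey key = new MonitorKey(url, desc);
MonitorValue fresh = new MonitorValue();
// putIfAbsent returns the existing value, or null if this call inserted fresh
MonitorValue value = monitors.putIfAbsent(key, fresh);
if (value == null) value = fresh;
value.count.getAndIncrement();
value.totalTime.getAndAdd(timeCost);
// avgTime is a best-effort snapshot, not updated atomically with the counters
value.avgTime = value.totalTime.get() / value.count.get();
The technical director favored the synchronized version for its simplicity, while the team argued for the more efficient putIfAbsent approach.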
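One detail of the putIfAbsent variant is easy to miss: ConcurrentHashMap.putIfAbsent returns the value previously associated with the key, or null when the call itself inserted the mapping, so the return value needs a null check before it is dereferenced. The sketch below illustrates this with a plain String key and an AtomicLong counter (both stand-ins chosen for brevity, not the article's types) and shows computeIfAbsent as an alternative that always returns the current value.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class PutIfAbsentDemo {
    // Returns the counter for `key`, creating it if needed, using putIfAbsent.
    static AtomicLong counterViaPutIfAbsent(ConcurrentHashMap<String, AtomicLong> map, String key) {
        AtomicLong fresh = new AtomicLong();
        AtomicLong previous = map.putIfAbsent(key, fresh);
        // previous is null when this call inserted `fresh`
        return previous != null ? previous : fresh;
    }

    // Same result with computeIfAbsent, which returns the existing or newly computed value.
    static AtomicLong counterViaComputeIfAbsent(ConcurrentHashMap<String, AtomicLong> map, String key) {
        return map.computeIfAbsent(key, k -> new AtomicLong());
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, AtomicLong> map = new ConcurrentHashMap<>();
        // First insert: there was no previous value, so null is printed.
        System.out.println(map.putIfAbsent("/api/orders", new AtomicLong()));
        counterViaPutIfAbsent(map, "/api/orders").incrementAndGet();
        counterViaComputeIfAbsent(map, "/api/orders").incrementAndGet();
        System.out.println(map.get("/api/orders").get()); // 2
    }
}
```

computeIfAbsent also avoids allocating a throwaway value on every call, at the cost of running the mapping function while the map bin is locked, so the function should stay short.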
After fixing the equals / hashCode implementation and applying a proper atomic update strategy, the memory leak disappeared and the service stabilized.
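Putting both fixes together, a monitor along these lines might look like the sketch below. It is an illustration under assumptions rather than the team's final code: the class name MonitorSketch, the runLoad helper, the sample URL, and the choice to compute the average on read instead of storing an avgTime field are all hypothetical.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class MonitorSketch {
    static final class MonitorKey {
        final String url;
        final String desc;
        MonitorKey(String url, String desc) { this.url = url; this.desc = desc; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof MonitorKey)) return false;
            MonitorKey k = (MonitorKey) o;
            return Objects.equals(url, k.url) && Objects.equals(desc, k.desc);
        }

        @Override
        public int hashCode() { return Objects.hash(url, desc); }
    }

    static final class MonitorValue {
        final AtomicLong count = new AtomicLong();
        final AtomicLong totalTime = new AtomicLong();

        // Derived on read, so there is no separate field to keep in sync.
        long avgTime() {
            long c = count.get();
            return c == 0 ? 0 : totalTime.get() / c;
        }
    }

    private final ConcurrentHashMap<MonitorKey, MonitorValue> monitors = new ConcurrentHashMap<>();

    public void visit(String url, String desc, long timeCost) {
        // Atomic get-or-create: all threads share one MonitorValue per key.
        MonitorValue v = monitors.computeIfAbsent(new MonitorKey(url, desc), k -> new MonitorValue());
        v.count.incrementAndGet();
        v.totalTime.addAndGet(timeCost);
    }

    public MonitorValue get(String url, String desc) { return monitors.get(new MonitorKey(url, desc)); }

    public int size() { return monitors.size(); }

    // Runs `threads` workers that each record `perThread` visits of 5 ms against one API.
    static MonitorSketch runLoad(int threads, int perThread) throws InterruptedException {
        MonitorSketch monitor = new MonitorSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) monitor.visit("/api/orders", "list orders", 5);
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return monitor;
    }

    public static void main(String[] args) throws InterruptedException {
        MonitorSketch m = runLoad(8, 10_000);
        MonitorValue v = m.get("/api/orders", "list orders");
        System.out.println(m.size() + " entry, count=" + v.count.get() + ", avg=" + v.avgTime());
    }
}
```

The count and total are each updated atomically, but a reader may observe one counter slightly ahead of the other while requests are in flight; for monitoring averages that small skew is usually acceptable.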
Key lessons: always override hashCode and equals for objects used as map keys, ensure map operations are atomic in high‑concurrency scenarios, and prefer minimal, safe changes when fixing production bugs.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
