How I Diagnosed and Fixed a Persistent Java OOM Crash in Mybatis
The article recounts a production OutOfMemoryError incident caused by heap and metaspace exhaustion in a Java service, analyzes Mybatis's role in the memory leak, reproduces the issue with heavy SQL and multithreading, and offers concrete debugging and code‑optimization steps to prevent future crashes.
Preface
After an earlier CPU alarm, the service began throwing OutOfMemoryError (OOM) frequently, which caused repeated restarts and eventually a complete outage. Every trace in Skywalking was red, and the Kubernetes deployment kept restarting the service in an attempt to keep the B‑end product running.
Reasons for OutOfMemoryError
An OutOfMemoryError usually has one of two causes: insufficient heap memory or insufficient metaspace.
Heap memory shortage: objects that are still strongly reachable cannot be garbage‑collected. Once they fill the heap up to the -Xmx limit, the JVM throws java.lang.OutOfMemoryError: Java heap space.
Metaspace: introduced in Java 8 to replace the permanent generation. It lives in native memory outside the heap and stores class metadata. Loading too many classes, or setting -XX:MaxMetaspaceSize too low, can exhaust it and also lead to OOM.
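To see both memory areas on a running JVM, the standard java.lang.management API can list the memory pools. The following is a minimal sketch, not part of the original incident analysis; the class name MemoryPoolReport is made up for illustration, and pool names such as "Metaspace" depend on the JVM implementation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, named here only for illustration.
final class MemoryPoolReport {

    /** Maximum heap the JVM will attempt to use, in bytes (governed by -Xmx). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Sums the current usage of all non-heap pools (on HotSpot this includes Metaspace). */
    static long nonHeapUsedBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                total += usage.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.printf("max heap: %d MiB%n", maxHeapBytes() / (1024 * 1024));
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue;
            }
            // getMax() returns -1 when the pool has no configured upper bound
            System.out.printf("%s (%s): used=%d max=%d%n",
                    pool.getName(), pool.getType(), usage.getUsed(), usage.getMax());
        }
    }
}
```

Logging these figures periodically makes it easier to tell whether a crash is approaching through the heap or through metaspace.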
Common Heap OOM Scenarios
Loading excessively large database query results into memory.
Infinite loops that keep allocating objects while holding references to them.
Resource pools or I/O streams not released after use.
Static collections holding references that are never cleared.
These are typical cases, though real‑world problems can be more obscure.
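As an illustration of the static‑collection scenario, an unbounded static map can be replaced by a cache with a fixed capacity. This is a minimal sketch under the assumption that evicting the least recently used entry is acceptable; BoundedCache is a hypothetical name and does not come from the incident's code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: holds at most maxEntries entries, so it cannot grow without limit.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true: the least recently used entry is the eldest
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
        return size() > maxEntries;
    }
}
```

LinkedHashMap is not thread‑safe, so a shared instance would need to be wrapped, for example with Collections.synchronizedMap.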
Phenomenon Analysis
The production logs showed the OOM being thrown from Mybatis code. Mybatis assembles dynamic SQL using collection classes; when the generated SQL is large, the collections that hold placeholders and parameters grow with it. They stay referenced while the statement is being built and executed, so the garbage collector cannot reclaim them in the meantime and the heap overflows.
The Docker container had no tools such as jstack or jmap, so thread and heap analysis on the live instance was not possible. Published articles about Mybatis‑related OOM provided the clues instead.
Mybatis Source Analysis
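Inspecting the DynamicContext class reveals a ContextMap (which extends HashMap) that stores SQL parameters and placeholders. ForEachSqlNode obtains this map through getBindings and adds entries to it for every element of the collection it iterates over. The entries remain reachable while the statement is in flight, so under high concurrency many large maps are live at the same time and cannot be collected, leading to OOM.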
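The growth described above can be modelled without Mybatis. The sketch below is a simplified stand‑in, not the actual Mybatis source: ForEachBindingModel and its methods are hypothetical, and the "__frch_" prefix is an assumption about how generated foreach parameter names look. It shows that a foreach over n values produces n placeholders and n map entries.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Simplified model of foreach parameter binding; not the real Mybatis implementation.
final class ForEachBindingModel {
    // Assumed prefix for generated parameter names.
    static final String ITEM_PREFIX = "__frch_";

    /** Creates one binding per value, keyed by a generated unique name. */
    static Map<String, Object> bindAll(String item, List<?> values) {
        Map<String, Object> bindings = new HashMap<>();
        for (int i = 0; i < values.size(); i++) {
            bindings.put(ITEM_PREFIX + item + "_" + i, values.get(i));
        }
        return bindings;
    }

    /** Builds the matching IN clause with one placeholder per value. */
    static String inClause(String item, int count) {
        StringJoiner sql = new StringJoiner(", ", "IN (", ")");
        for (int i = 0; i < count; i++) {
            sql.add("#{" + ITEM_PREFIX + item + "_" + i + "}");
        }
        return sql.toString();
    }
}
```

With 100,000 IDs, each in‑flight request would hold a map of 100,000 entries plus a SQL string of similar size, and that footprint multiplies by the number of concurrent requests.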
Scenario Reproduction
To reproduce the issue, the IN clause of the SQL was enlarged and 50 threads were started, with the JVM launched using -Xmx256m -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError.
The console showed frequent Full GC events, ultimately leading to OOM.
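A load harness along these lines can be sketched as follows. It is an illustration only: the original reproduction ran real Mybatis queries, whereas this stand‑in merely builds a large IN statement and its parameter map in each thread. The class LargeInClauseLoad, the table name orders, and the generated parameter names are all assumptions. Whether it actually reaches OOM depends on the heap size and the counts chosen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the reproduction; it does not call Mybatis.
final class LargeInClauseLoad {

    /** Builds one large statement plus its bindings and returns the number of bound parameters. */
    static int buildStatement(int idCount) {
        Map<String, Object> bindings = new HashMap<>();
        StringBuilder sql = new StringBuilder("SELECT * FROM orders WHERE id IN (");
        for (int i = 0; i < idCount; i++) {
            String name = "__frch_id_" + i; // assumed naming scheme
            bindings.put(name, (long) i);
            if (i > 0) {
                sql.append(", ");
            }
            sql.append("#{").append(name).append('}');
        }
        sql.append(')');
        return bindings.size();
    }

    /** Runs buildStatement concurrently on the given number of threads; returns total bindings created. */
    static long run(int threads, int idCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Integer>> results = new ArrayList<>();
        try {
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(() -> {
                    start.await(); // release all workers together to maximise overlap
                    return buildStatement(idCount);
                }));
            }
            start.countDown();
            long total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("load run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Run with: java -Xmx256m -XX:+HeapDumpOnOutOfMemoryError LargeInClauseLoad
        System.out.println("bindings created: " + run(50, 200_000));
    }
}
```

The latch makes all workers start at the same moment, which approximates the burst of concurrent requests that kept many large maps alive at once.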
Summary
After identifying the root cause, the SQL generation was optimized to avoid overly large statements. The incident highlights the importance of disciplined coding and careful SQL construction to prevent hidden memory risks.
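The article does not say exactly how the SQL was optimized. One common way to keep statements small is to split a long ID list into fixed‑size batches and run one query per batch. The sketch below assumes that approach; IdBatches and partition are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list into consecutive batches of at most batchSize elements.
final class IdBatches {

    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ids.size());
            // Copy the slice so a batch does not keep the whole source list reachable.
            batches.add(new ArrayList<>(ids.subList(from, to)));
        }
        return batches;
    }
}
```

Each batch can then be passed to the mapper in turn, so the size of any single IN clause and its parameter map is capped by batchSize.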
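In addition, the Docker setup should retain the heap dump files written on OOM, for example by pointing -XX:HeapDumpPath at a mounted volume that survives container restarts, so that future failures can be analyzed properly.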
MaGe Linux Operations
