Why Did Our Java Service Crash with OOM? A Deep Dive into Root Causes and Fixes
An online service experienced severe latency caused by massive GAP times, followed by repeated OutOfMemoryErrors. By analyzing monitoring data, JVM dumps, and SQL queries, the team traced the problem to a huge userId array that produced a 1 GB count query, then added request limits and JVM flags to prevent a recurrence.
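
A request limit of the kind mentioned above could look roughly like the following. This is a minimal sketch, not the team's actual code: the class name `UserIdRequestGuard`, the cap `MAX_USER_IDS`, and the `partition` helper are all hypothetical, and the value 1,000 is an arbitrary placeholder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical guard that caps how many userIds a single request may carry.
final class UserIdRequestGuard {

    // Assumed cap; pick a value that fits your database and heap budget.
    static final int MAX_USER_IDS = 1_000;

    private UserIdRequestGuard() {}

    /** Rejects a request whose userId list is null or larger than the cap. */
    static void validate(List<Long> userIds) {
        if (userIds == null) {
            throw new IllegalArgumentException("userIds must not be null");
        }
        if (userIds.size() > MAX_USER_IDS) {
            throw new IllegalArgumentException(
                "too many userIds: " + userIds.size() + " (max " + MAX_USER_IDS + ")");
        }
    }

    /** Splits a list into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }
}
```

Calling `validate` at the service boundary fails fast before any SQL is built. Where large inputs are legitimate, `partition` lets the caller run several bounded queries instead of one enormous one. On the JVM side, HotSpot's `-XX:+HeapDumpOnOutOfMemoryError` and `-XX:HeapDumpPath=<dir>` are common choices for capturing evidence, though they may not be the flags the team used.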
