
Why Default Java GC Settings Kill Performance on Kubernetes (And How to Fix It)

Through a controlled experiment with four Spring Boot service groups on Kubernetes, this article shows that relying on Java’s default GC and heap settings can drastically reduce throughput and inflate tail latency, especially under higher load, and demonstrates how explicitly choosing the GC algorithm and fixing -Xms/-Xmx restores performance.


Background and Motivation

Java developers often set -Xms and -Xmx explicitly but leave the garbage collector (GC) choice to the JVM defaults, assuming they work well on physical or virtual machines. In container environments, however, the JVM adapts its GC strategy based on cgroup CPU and memory limits, which can lead to sub‑optimal GC selection and heap sizing, especially under high load.
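A quick way to see what the JVM actually detects inside a container is to ask it directly. Both options below are standard JDK flags (the pod name is a placeholder):

# Print the CPU count and memory limit the JVM derived from the cgroup
kubectl exec <pod> -- java -XshowSettings:system -version

# Or log container detection at startup (JDK 10+)
kubectl exec <pod> -- java -Xlog:os+container=info -version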

Experiment Design and Deployment Details

Service Groups

Four groups of a Spring Boot REST API were deployed on a Kubernetes cluster using the amazoncorretto:25.0.2‑alpine3.23 base image. The groups differed in GC algorithm, explicit heap settings, CPU/memory limits, and replica count:

group1 : G1GC, -Xms4g -Xmx4g, 4 CPU/4 Gi, 1 replica.

group2 : G1GC, -Xms1g -Xmx1g, 1 CPU/1 Gi, 4 replicas.

group3 : default GC, no explicit heap limits, 4 CPU/4 Gi, 1 replica.

group4 : default GC, no explicit heap limits, 1 CPU/1 Gi, 4 replicas.

Kubernetes Resource Limits

Each Deployment set identical requests and limits for CPU and memory, ensuring the container’s cgroup limits were visible to the JVM.

resources:
  requests:
    memory: "..."
    cpu: "..."
  limits:
    memory: "..."
    cpu: "..."
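As a concrete illustration, group1’s block would have looked roughly like this (values inferred from the group descriptions above; the original manifests are not reproduced here):

resources:
  requests:
    memory: "4Gi"
    cpu: "4"
  limits:
    memory: "4Gi"
    cpu: "4"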

Load Test Configuration

Application: Spring Boot user‑management API (CRUD + queries).

Load tool: Locust, testing 32, 64, and 128 concurrent users for 300 seconds each.

Timeout: 10 s; think time: 0.1-0.5 s; request weights: 60 % query, 30 % CRUD, 10 % complex search (sketched below).
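A minimal Locust script matching this profile could look like the following sketch; the endpoint paths and payloads are hypothetical, since the article does not list the API routes:

# Run e.g.: locust -f locustfile.py --host=http://<service> -u 32 -t 300s
from locust import HttpUser, task, between

class UserApiUser(HttpUser):
    # Think time between requests: 0.1-0.5 s
    wait_time = between(0.1, 0.5)

    @task(6)  # ~60 %: simple queries
    def query_user(self):
        self.client.get("/api/users/1")  # hypothetical route

    @task(3)  # ~30 %: CRUD operations
    def create_user(self):
        self.client.post("/api/users", json={"name": "test", "email": "test@example.com"})

    @task(1)  # ~10 %: complex search
    def search_users(self):
        self.client.get("/api/users/search", params={"name": "test"})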

Performance Results: Throughput and Latency

All groups achieved a 100 % success rate. Two key metrics were examined: TPS (throughput) and latency percentiles (P50/P95/P99).

32‑User Load (Low Load)

group4 (default, 1 CPU/1 Gi, 4 replicas) showed the best TPS (89.13) and lowest P95/P99, indicating GC was not a bottleneck at low load.

64‑User Load (Medium Load)

Explicitly configured groups (1 and 2) outperformed their default counterparts (3 and 4). For example, group1’s TPS rose from 86.73 to 97.23 (+12 %) and latency percentiles dropped noticeably.

group2 vs. group4 showed a +35 % TPS gain and significant P95/P99 reductions, confirming that tightly constrained containers default to a more conservative collector (Serial), which hurts performance under load.

128‑User Load (High Load)

All groups approached saturation; tail latency (P99) climbed to 6.9‑8.2 seconds, showing that beyond a certain load, merely tuning GC/heap cannot fully compensate.

GC Log Analysis

Actual GC Choices

group1 & group2: G1GC (as explicitly set).

group3: default configuration but JVM still selected G1GC under 4 CPU/4 Gi.

group4: default configuration resulted in Serial GC under 1 CPU/1 Gi.

Sample log from group4:

[2026-03-05T05:58:35.984+0000][0.006s][info][gc,init] CardTable entry size: 512
[2026-03-05T05:58:35.984+0000][0.006s][info][gc] Using Serial
[2026-03-05T05:58:35.984+0000][0.006s][info][gc,init] Version: 25.0.2+10-LTS (release)

In a 1 CPU/1 Gi container, even G1GC can suffer because GC threads, JIT compiler threads, and application threads compete for the same limited CPU time slice.
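This matches the JVM’s ergonomics: on JDK 9 and later, a machine (or container) that exposes fewer than two processors or less than roughly 1792 MB of memory is not treated as “server class”, so the default collector falls back to Serial instead of G1. Which collector ergonomics actually picked can be confirmed from inside a running pod (the deployment name is a placeholder):

kubectl exec deploy/<group4-deployment> -- java -XX:+PrintFlagsFinal -version | grep -E 'UseSerialGC|UseG1GC'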

Impact of Explicit Heap Settings

Comparing group1 (explicit -Xms4g -Xmx4g) with group3 (default heap) shows that a fixed heap size eliminates runtime heap-resizing pauses and reduces tail latency. The same holds for group2 vs. group4.

Practical Recommendations for Java on Kubernetes

Resource Limits

Allocate sufficient CPU; low CPU limits throttle both application throughput and GC parallelism, worsening tail latency under load.

Leave headroom in memory limits; tying -Xmx directly to the container limit can cause OOM kills or performance degradation.

For latency‑sensitive services, consider fewer instances with larger per‑instance resources rather than packing many tiny containers.
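For example, a single replica sized for a 4 GiB heap might be given limits along these lines, leaving about 1 GiB of headroom for metaspace, thread stacks, and other native memory (illustrative values, not taken from the experiment):

resources:
  requests:
    memory: "5Gi"
    cpu: "4"
  limits:
    memory: "5Gi"
    cpu: "4"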

JVM Settings

Example JAVA_OPTS configuration:

env:
- name: JAVA_OPTS
  value: "-XX:+UseG1GC -Xms4g -Xmx4g -XX:MaxGCPauseMillis=200 -Xlog:gc*,safepoint=info:file=/var/log/app/gc.log:time,uptime,level,tags:filecount=10,filesize=20m"

Explicitly choose a GC algorithm (e.g., G1GC) to avoid automatic fallback to Serial in low‑resource containers.

Set -Xms and -Xmx to the same value to prevent runtime heap resizing pauses.

Reserve extra memory beyond the heap to accommodate native allocations and avoid hitting the container’s memory limit.

Conclusion

At low load, default JVM settings may appear adequate, but they quickly become a bottleneck as load rises.

On Kubernetes, explicitly configuring both the GC algorithm and fixed heap size consistently improves TPS and reduces tail latency, especially in resource‑constrained pods.

Overly strict CPU/memory limits adversely affect JVM behavior and GC parallelism, leading to unexpected performance degradation.

For production workloads, avoid relying on defaults; instead, apply the explicit JVM and resource‑limit guidelines described above.

Tags: Java, JVM, Performance, cloud-native, Kubernetes, gc