Why Does My Container Show 900% CPU? Uncovering JVM and Cgroup Mismatches
An experienced ops engineer investigates a late-night Grafana alert showing 900% CPU usage, discovers a mismatch between the cores the JVM detects and the container's CPU limit, explains the root cause, and walks through a three-pronged fix with code snippets, monitoring tweaks, and performance results.
Scene 1: Anomalous Monitoring Data
Abnormal Phenomena Overview
Logging into the monitoring system reveals contradictory numbers: the host shows ~94% CPU usage, while Prometheus reports roughly 900%.
# htop display
Tasks: 245 total, 24 running, 221 sleeping
%Cpu(s): 94.2 us, 4.8 sy, 0.0 ni, 1.0 id
Load average: 8.47, 8.23, 7.89
# Prometheus monitoring shows
node_cpu_usage_ratio: 9.2 (920%)
container_cpu_usage_ratio: 8.7 (870%)
First suspicion: overall system CPU usage is only 94%, so why does the monitoring show 900%?
Container Resource Configuration Check
# Kubernetes Deployment configuration
resources:
requests:
cpu: "2"
memory: "4Gi"
limits:
cpu: "2"
memory: "8Gi"Container Internal Check
# Inside the container
$ nproc
8 # Host has 8 CPU cores
$ cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
200000 # 200ms
$ cat /sys/fs/cgroup/cpu/cpu.cfs_period_us
100000 # 100ms
Key finding: the container is limited to 2 CPUs (a 200 ms quota per 100 ms period), but the JVM detects all 8 host cores.
Technical Deep Dive: JVM’s "Perception Bias"
How JVM Perceives CPU Resources
Before the JVM became container-aware (Java 8u131 brought the first partial improvements, and full cgroup support arrived in 8u191 and JDK 10), it obtained CPU information naively:
// Simplified JVM internal logic
int availableProcessors = Runtime.getRuntime().availableProcessors();
// Effectively returns the host's online CPU count, ignoring cgroup CPU quotas
Core issue: the JVM sees 8 CPUs while the container is only entitled to 2 CPUs' worth of time slices.
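To see the mismatch from inside the container itself, a small diagnostic along these lines can be run. It assumes the cgroup v1 paths used in the shell check above, and the class name is purely illustrative:
import java.nio.file.Files;
import java.nio.file.Paths;

public class CpuVisibilityCheck {
    public static void main(String[] args) throws Exception {
        // What the JVM believes it has
        int jvmCores = Runtime.getRuntime().availableProcessors();
        // What the cgroup actually grants: quota / period = cores
        long quota = Long.parseLong(new String(Files.readAllBytes(
                Paths.get("/sys/fs/cgroup/cpu/cpu.cfs_quota_us"))).trim());
        long period = Long.parseLong(new String(Files.readAllBytes(
                Paths.get("/sys/fs/cgroup/cpu/cpu.cfs_period_us"))).trim());
        System.out.println("JVM sees " + jvmCores + " cores");            // 8 in this incident
        System.out.println("cgroup allows " + (double) quota / period);   // 2.0 in this incident
    }
}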
Thread‑Pool Configuration Chain Reaction
The application uses a classic thread‑pool configuration based on the detected CPU count:
// Problematic code
int corePoolSize = Runtime.getRuntime().availableProcessors() * 2;
int maximumPoolSize = Runtime.getRuntime().availableProcessors() * 4;
ThreadPoolExecutor executor = new ThreadPoolExecutor(
corePoolSize, // 16 core threads
maximumPoolSize, // 32 max threads
60L, TimeUnit.SECONDS,
new LinkedBlockingQueue<>(1000)
);
Result: 32 threads compete fiercely for a quota worth only 2 CPUs, causing massive context switching.
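To confirm at runtime how many threads are actually fighting over the 2-CPU quota, a small helper along these lines can be hooked into existing logging or metrics. The class and method names are illustrative; only standard JDK APIs are used:
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ThreadPoolExecutor;

public class ThreadPressureLog {
    // Call periodically (e.g. from a scheduled task) to see how crowded the pool gets
    public static void log(ThreadPoolExecutor executor) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.printf("pool=%d/%d, live JVM threads=%d%n",
                executor.getPoolSize(), executor.getMaximumPoolSize(),
                threads.getThreadCount());
    }
}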
Monitoring Metric "Deception"
CPU Usage Calculation Formula
The key to understanding the problem is the CPU usage calculation:
Container CPU Usage = (CPU time consumed / CPU quota for the period) × 100%
When the container's limit is 2 cores but demand far exceeds 2 cores, the numbers per second of wall-clock time look like this:
Quota actually granted: 2,000 ms (2 cores, already saturated)
CPU time demanded by the 32 threads: ~9,000 ms
Usage the dashboard reports: (9,000 / 2,000) × 100% = 450%
This explains why the monitor can report CPU usage far above 100%.
More Precise Monitoring Metrics
# More accurate container CPU pressure metric
(
rate(container_cpu_usage_seconds_total[5m]) /
(container_spec_cpu_quota / container_spec_cpu_period)
) * 100
# CPU throttling metric
rate(container_cpu_cfs_throttled_seconds_total[5m])
Solution: Three-Pronged Approach
1. JVM Parameter Optimization (Immediate Effect)
# Tell JVM the real CPU core count
-XX:ActiveProcessorCount=2
# Enable container awareness (Java 8u191+)
-XX:+UseContainerSupport
# On Java 8u131–8u190 (before UseContainerSupport), the older experimental flags apply instead
-XX:+UnlockExperimentalVMOptions
-XX:+UseCGroupMemoryLimitForHeap
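Once the flags are in place, a quick sanity check is to log what the JVM now reports. With -XX:ActiveProcessorCount=2 (or container support on a 2-CPU limit), availableProcessors() should return 2, and pools sized from it, such as the common ForkJoinPool, shrink accordingly. A minimal sketch, not from the original article:
import java.util.concurrent.ForkJoinPool;

public class JvmCpuReport {
    public static void main(String[] args) {
        // Respects -XX:ActiveProcessorCount and container support
        System.out.println("availableProcessors = "
                + Runtime.getRuntime().availableProcessors());
        // The common pool (like GC and JIT thread counts) is sized from the same value
        System.out.println("commonPool parallelism = "
                + ForkJoinPool.commonPool().getParallelism());
    }
}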
2. Application-Level Refactor
import java.nio.file.Files;
import java.nio.file.Paths;

public class ContainerAwareThreadPool {

    private static final int CPU_CORES = getCpuCores();

    public static int getCpuCores() {
        // Prefer an explicit override; -XX:ActiveProcessorCount is not readable as a
        // system property, so pass e.g. -Dapp.cpu.cores=2 (the property name is an example)
        String override = System.getProperty("app.cpu.cores");
        if (override != null) {
            return Integer.parseInt(override.trim());
        }
        // Otherwise derive the limit from cgroup v1: cores = quota / period
        try {
            long quota = readLong("/sys/fs/cgroup/cpu/cpu.cfs_quota_us", -1);
            long period = readLong("/sys/fs/cgroup/cpu/cpu.cfs_period_us", 100000);
            if (quota > 0 && period > 0) {
                return (int) Math.ceil((double) quota / period);
            }
        } catch (Exception e) {
            // no cgroup v1 files available – fall through
        }
        // Last resort: whatever the JVM reports (correct on container-aware JVMs)
        return Runtime.getRuntime().availableProcessors();
    }

    private static long readLong(String path, long defaultValue) throws Exception {
        return Files.lines(Paths.get(path))
                .mapToLong(s -> Long.parseLong(s.trim()))
                .findFirst()
                .orElse(defaultValue);
    }
}
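For completeness, here is how the helper above might feed an actual pool, reusing the earlier multipliers; the factory class name and queue size are illustrative rather than taken from the article:
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WorkerPoolFactory {
    public static ThreadPoolExecutor create() {
        int cores = ContainerAwareThreadPool.getCpuCores(); // 2 in this container
        return new ThreadPoolExecutor(
                cores * 2,              // 4 core threads
                cores * 4,              // 8 max threads – the "8 workers" seen after optimization
                60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(1000));
    }
}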
3. Infrastructure Adjustments
# Kubernetes configuration optimization
resources:
requests:
cpu: "2"
memory: "4Gi"
limits:
cpu: "4" # loosen CPU limit
memory: "8Gi"
# JVM startup parameters
env:
- name: JAVA_OPTS
value: "-XX:+UseContainerSupport -XX:ActiveProcessorCount=2"Effect Verification: Dramatic Data Comparison
Before Optimization
CPU Usage: 850%-950%
P95 Response Time: 8.5s
Threads: 32 workers
Context Switches: 45000/s
CPU Throttling: 85%
After Optimization
CPU Usage: 65%-80%
P95 Response Time: 180ms
Threads: 8 workers
Context Switches: 3200/s
CPU Throttling: 2%
Performance improved by more than 40× (P95 latency fell from 8.5 s to 180 ms).
Deep Thinking: Containerization Pitfalls and Wisdom
Resource Awareness: applications do not automatically perceive container limits.
Monitoring Complexity: traditional host-level metrics can mislead in container environments.
Stale Tuning Experience: optimization lessons learned on physical machines need re-evaluation in containers.
Best‑Practice Checklist
Confirm JVM version supports container awareness.
Set the correct ActiveProcessorCount JVM flag.
Validate thread‑pool configuration against real CPU cores.
Establish container‑level monitoring metrics.
Test application behavior under CPU throttling scenarios.
Monitoring Alert Optimization
# Prometheus alert rule
alert: ContainerCpuThrottling
expr: rate(container_cpu_cfs_throttled_seconds_total[5m]) > 0.1
for: 2m
labels:
severity: warning
annotations:
summary: "Container CPU throttling"
description: "{{ $labels.container }} CPU throttling rate exceeds 10%"Final Thoughts: Growth Reflections for Technologists
This incident was a reminder that in the cloud-native era, operations work is no longer just "add machines and tweak parameters": we have to understand the underlying principles, stay alert to low-level detail, and think systematically from the application through the container runtime down to the kernel.
Have you encountered similar containerization traps? Share your experience in the comments!
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.