
Understanding Kubernetes vs VM Monitoring: CPU, Memory, Disk & Network

This article compares monitoring metrics for CPU, memory, disk, and network between traditional KVM-based servers and Kubernetes pods, explaining why their indicators differ, how resource isolation works, and what key metrics users should watch to diagnose performance bottlenecks.


The rollout of Kubernetes monitoring has been underway for a while, and the monitoring system now provides Kubernetes pod-related metrics and alert rules.

Because Kubernetes and traditional physical/virtual machines run in completely different environments, their monitoring metrics differ as well. Although the platform tries to unify these differences, users still raise questions about the Kubernetes monitoring metrics.

This article explains the differences between physical/virtual machines (referred to as KVM) and Kubernetes from four perspectives—CPU, memory, disk, and network—to help users understand the underlying principles when using the monitoring product.

CPU differences are the most significant, dictated by Kubernetes' technical nature.

Memory differences exist but can largely be aligned with the KVM stack.

Network and disk differences are minor, with little extra learning cost.

CPU

In KVM scenarios, users focus on two metrics: CPU usage rate and CPU load.

High CPU load with low usage usually indicates a bottleneck in disk I/O.

High CPU usage with load far exceeding the number of cores indicates a severe CPU shortage.

In Kubernetes, the relevant metrics are CPU usage rate and CPU throttling time.

When CPU usage approaches or slightly exceeds 100% and throttling time is high, the pod lacks sufficient CPU resources and needs higher request or limit values.
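The rules of thumb above can be sketched as a tiny classifier. This is purely illustrative: the threshold values (50%, 90%, 95%) are assumptions for the sketch, not values taken from the monitoring product.

```python
def diagnose_vm(cpu_usage_pct, load_avg, cores):
    """KVM rule of thumb (thresholds are illustrative, not product defaults)."""
    if cpu_usage_pct < 50 and load_avg > cores:
        return "io-bound"        # many threads stuck in uninterruptible sleep
    if cpu_usage_pct > 90 and load_avg > cores:
        return "cpu-starved"
    return "healthy"

def diagnose_pod(cpu_usage_pct, throttled_s):
    """Kubernetes rule of thumb: near-saturated usage plus throttling
    means the request/limit is too small."""
    if cpu_usage_pct >= 95 and throttled_s > 0:
        return "raise-cpu-limit"
    return "healthy"

diagnose_vm(20, 16, 8)     # -> "io-bound"
diagnose_pod(100, 12.5)    # -> "raise-cpu-limit"
```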

The differences arise from distinct CPU isolation mechanisms and from differences between Linux native metrics and Kubernetes‑exposed metrics.

The monitoring system provides two items for CPU, illustrated below:

The following diagram shows a throttled application where CPU usage exceeds 100%:

CPU Usage Rate

For a single CPU core, time divides into three buckets:

User code execution time

Kernel code execution time

Idle time (the HLT instruction on x86)

In KVM, CPU usage rate is calculated as (user time + kernel time) / total time.

Kubernetes pods do not have dedicated cores, so the formula changes: a pod with a CPU limit of 4 can use up to 4 seconds of CPU per second, and a pod using 0.5 seconds per second is using 50% of a core. Kubernetes does not expose a native “usage rate” concept, but the monitoring system derives a pod CPU usage rate as usage / limit.

Because of limited granularity and measurement error, CPU usage may briefly exceed 100% under stress.
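The two formulas can be written down as a minimal sketch. Function names are illustrative; time values are in seconds and the pod limit is in cores, per the description above.

```python
def vm_cpu_usage(user_s, system_s, idle_s):
    """KVM-style usage: (user + kernel) time over total time."""
    busy = user_s + system_s
    return busy / (busy + idle_s)

def pod_cpu_usage(usage_core_s, interval_s, cpu_limit):
    """Pod-style usage as the monitoring system derives it: usage / limit."""
    return usage_core_s / (interval_s * cpu_limit)

vm_cpu_usage(30.0, 10.0, 60.0)   # -> 0.4 (40%)
pod_cpu_usage(2.0, 1.0, 4)       # -> 0.5 (50% of a 4-core limit)
```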

CPU Load

CPU load measures the number of runnable threads in Linux, including running threads and those in uninterruptible sleep (typically I/O). A low usage rate with high load suggests many threads are blocked on I/O, indicating a disk or network bottleneck.

Kubernetes provides a cpu_load metric that only counts running threads, losing the ability to detect I/O‑bound bottlenecks. Moreover, the platform disables the CPU load metric by default.
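The difference can be shown with a toy thread-state sample, using Linux's process-state letters ('R' = runnable, 'D' = uninterruptible sleep, 'S' = interruptible sleep):

```python
def load_counts(thread_states):
    """Count threads the way each system does (a sketch):
    Linux load average counts 'R' and 'D'; a running-threads-only
    metric like Kubernetes' cpu_load counts just 'R'."""
    linux_load = sum(1 for s in thread_states if s in ("R", "D"))
    k8s_load = sum(1 for s in thread_states if s == "R")
    return linux_load, k8s_load

# Three threads blocked on I/O look alarming to the Linux load average
# but are invisible to a running-threads-only metric:
load_counts(["R", "R", "D", "D", "D", "S"])  # -> (5, 2)
```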

To compensate, Kubernetes offers a “CPU throttling time” metric, which captures situations where both usage and load are high, indicating severe CPU shortage.

Throttling is implemented by the Completely Fair Scheduler's (CFS) cgroup bandwidth control: wall-clock time is split into short periods (100 ms by default). Within each period a pod may consume at most its quota (CPU limit × period length); once the quota is exhausted, its threads are descheduled for the remainder of the period, and that descheduled time is reported as throttling time. Persistently high throttling time signals insufficient CPU resources.
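A toy model of the bandwidth-control arithmetic helps make this concrete. It assumes demand is spread evenly across periods, which real workloads are not, so treat the result as an idealized lower bound:

```python
def cfs_throttled_time(cpu_limit, period_s, demand_core_s, duration_s):
    """Toy model of CFS bandwidth control.

    Each period grants a quota of cpu_limit * period_s core-seconds;
    demand beyond the quota accumulates as throttled time."""
    periods = round(duration_s / period_s)
    quota = cpu_limit * period_s
    per_period_demand = demand_core_s / periods
    overrun = max(0.0, per_period_demand - quota)
    return overrun * periods

# A pod limited to 4 cores that wants 6 core-seconds of work within 1 s
# accumulates roughly 2 core-seconds of throttled time:
cfs_throttled_time(4, 0.1, 6.0, 1.0)
```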

Memory

Both KVM and Kubernetes use a memory usage rate metric, but they differ in what memory is counted as “used”.

In KVM, used memory is computed as total minus available, though how much of the cache/buffer/slab memory should really count as "used" varies by application.

Kubernetes lacks an available metric, so RSS memory is used as the used value.

The monitoring system offers several memory‑related items, illustrated below:

Linux's free command reports used, cache/buffer, and available memory. Simply subtracting cache/buffer from used can be misleading because cache may be performance-critical for some workloads (e.g., Kafka, Elasticsearch).

Kubernetes exposes three memory values:

MemUsed – similar to Linux’s used, includes cache.

WorkingSet – excludes “cold” cache data, slightly smaller than MemUsed.

RSS – resident set size; excludes cache entirely.

In practice, WorkingSet often appears high (around 90% for typical web apps), while RSS provides a more stable indicator for most scenarios. Users are encouraged to consult multiple metrics when diagnosing memory bottlenecks.
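The relationships between the three values can be sketched as follows. The field names and formulas here are an approximation of the cgroup accounting as described above, not exact definitions:

```python
def pod_memory_views(rss_bytes, cache_bytes, inactive_file_bytes):
    """Approximate relationships between the three pod memory values."""
    mem_used = rss_bytes + cache_bytes            # includes all page cache
    working_set = mem_used - inactive_file_bytes  # drops only "cold" cache
    return {"mem_used": mem_used, "working_set": working_set, "rss": rss_bytes}

# 1 GiB of RSS plus 3 GiB of cache, of which 0.5 GiB is cold (values in MiB):
pod_memory_views(1024, 3072, 512)
# -> {"mem_used": 4096, "working_set": 3584, "rss": 1024}
```

This illustrates why WorkingSet often looks high: most of the warm page cache still counts toward it, while RSS stays close to what the application actually allocated.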

Disk / Network

Because disk and network isolation both rely on Linux cgroups, these metrics differ little between Kubernetes and KVM. The main distinction is that pods typically run without dedicated disks unless persistent volumes are attached, so disk-space monitoring matters less.

For disk, the focus is on write performance metrics, shown below:

For network, the usual metrics are traffic and packet loss, illustrated below:
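Traffic and packet-loss figures like these are typically derived from monotonically increasing interface counters sampled at intervals; a minimal sketch (function names are illustrative):

```python
def counter_rate(prev, curr, interval_s):
    """Per-second rate from a monotonically increasing counter."""
    return (curr - prev) / interval_s

def packet_loss_pct(dropped_delta, packets_delta):
    """Dropped packets as a percentage of packets handled in the interval."""
    return 100.0 * dropped_delta / packets_delta if packets_delta else 0.0

counter_rate(1_000, 6_000, 5)   # -> 1000.0 (bytes/s)
packet_loss_pct(5, 1000)        # -> 0.5
```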

Monitoring · Performance · Kubernetes · CPU · Memory
Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
