Understanding Kubernetes vs VM Monitoring: CPU, Memory, Disk & Network
This article compares monitoring metrics for CPU, memory, disk, and network between traditional KVM-based servers and Kubernetes pods, explaining why their indicators differ, how resource isolation works, and what key metrics users should watch to diagnose performance bottlenecks.
Kubernetes monitoring has been rolling out for some time, and the monitoring system now provides Kubernetes pod-related metrics and alert rules.
Because Kubernetes and traditional physical/virtual machines run in completely different environments, the monitoring metrics differ as well. Although the platform tries to unify these differences, users still encounter feedback about Kubernetes monitoring metrics.
This article explains the differences between physical/virtual machines (referred to as KVM) and Kubernetes from four perspectives—CPU, memory, disk, and network—to help users understand the underlying principles when using the monitoring product.
CPU differences are the most significant, and follow directly from how Kubernetes isolates CPU.
Memory differences exist but can largely be aligned with the KVM stack.
Network and disk differences are minor, with little extra learning cost.
CPU
In KVM scenarios, users focus on two metrics: CPU usage rate and CPU load.
High CPU load with low usage usually indicates a bottleneck in disk I/O.
High CPU usage combined with a load far exceeding the number of cores indicates a severe CPU shortage.
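The two KVM-side rules of thumb above can be sketched as a small triage function. This is a toy illustration, not part of the monitoring product; the thresholds are hypothetical, chosen only to make the two patterns concrete.

```python
# Toy triage sketch for the KVM-side signals discussed above.
# Thresholds (0.3, 0.9, 1.5x cores) are illustrative, not recommendations.

def triage(cpu_usage, load_avg, n_cores):
    """cpu_usage in [0, 1]; load_avg is the Linux load average."""
    per_core_load = load_avg / n_cores
    if cpu_usage < 0.3 and per_core_load > 1.5:
        return "likely I/O bottleneck (threads blocked in uninterruptible sleep)"
    if cpu_usage > 0.9 and per_core_load > 1.5:
        return "severe CPU shortage"
    return "no obvious CPU-side bottleneck"

print(triage(0.15, 24.0, 8))   # low usage, high load
print(triage(0.95, 32.0, 8))   # high usage, high load
```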
In Kubernetes, the relevant metrics are CPU usage rate and CPU throttling time.
When CPU usage approaches or slightly exceeds 100% and throttling time is high, the pod lacks sufficient CPU resources and needs higher request or limit values.
The differences arise from distinct CPU isolation mechanisms and from differences between Linux native metrics and Kubernetes‑exposed metrics.
The monitoring system provides two items for CPU, illustrated below:
The following diagram shows a throttled application where CPU usage exceeds 100%:
CPU Usage Rate
For a single CPU core, time is divided into three parts:
User code execution time
Kernel code execution time
Idle time (the core executes HLT on x86)
In KVM, CPU usage rate is calculated as (user time + kernel time) / total time.
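The formula can be illustrated with two samples of the per-CPU counters (in practice these come from /proc/stat; the tick values below are illustrative, not live reads):

```python
# Sketch: the classic Linux CPU usage rate, computed from the deltas
# between two counter samples. Each sample is (user, system, idle) in
# clock ticks; the numbers here are made up for illustration.

def cpu_usage_rate(sample_a, sample_b):
    """(user time + kernel time) / total time over the sampling interval."""
    d_user = sample_b[0] - sample_a[0]
    d_sys = sample_b[1] - sample_a[1]
    d_idle = sample_b[2] - sample_a[2]
    busy = d_user + d_sys
    return busy / (busy + d_idle)

# Over the interval the CPU spent 300 ticks in user code, 100 in
# kernel code, and 600 idle -> 40% usage.
rate = cpu_usage_rate((1000, 500, 8000), (1300, 600, 8600))
print(f"{rate:.0%}")  # -> 40%
```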
Kubernetes pods do not have dedicated cores, so the formula changes. A pod with a CPU limit of 4 may consume up to 4 seconds of CPU time per wall-clock second, and a pod consuming 0.5 seconds per second is using 50% of one core. Kubernetes does not expose a native “usage rate” concept, but the monitoring system derives a pod CPU usage rate as usage / limit.
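The usage / limit derivation can be sketched from a cumulative CPU-seconds counter sampled twice (cAdvisor's container_cpu_usage_seconds_total is one such counter; the numbers below are illustrative):

```python
# Sketch: deriving a pod CPU usage rate from a cumulative CPU-seconds
# counter. Values are illustrative, not taken from a live cluster.

def pod_cpu_usage_rate(usage_t0, usage_t1, interval_s, cpu_limit):
    """Cores consumed per wall-clock second, divided by the pod's limit."""
    cores_used = (usage_t1 - usage_t0) / interval_s
    return cores_used / cpu_limit

# A pod limited to 4 CPUs consumed 30 CPU-seconds over a 60 s window:
# it used 0.5 cores, i.e. 50% of one core but 12.5% of its limit.
print(pod_cpu_usage_rate(120.0, 150.0, 60, 4))  # -> 0.125
```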
Because of limited granularity and measurement error, CPU usage may briefly exceed 100% under stress.
CPU Load
CPU load measures the number of runnable threads in Linux, including running threads and those in uninterruptible sleep (typically I/O). A low usage rate with high load suggests many threads are blocked on I/O, indicating a disk or network bottleneck.
Kubernetes provides a cpu_load metric that only counts running threads, losing the ability to detect I/O‑bound bottlenecks. Moreover, the platform disables the CPU load metric by default.
To compensate, Kubernetes offers a “CPU throttling time” metric, which captures situations where both usage and load are high, indicating severe CPU shortage.
Throttling works via the Completely Fair Scheduler (CFS) cgroup bandwidth control: each second is split into periods (e.g., 0.1 s). Pods request time slices in each period; if the request exceeds the pod’s limit, throttling time is recorded. High throttling time signals insufficient CPU resources.
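The throttling counters described above can be summarized numerically. This sketch assumes cgroup-v2-style cpu.stat fields (nr_periods, nr_throttled, throttled_usec); the deltas below are illustrative for a 60-second window at the default 100 ms period:

```python
# Sketch: interpreting CFS bandwidth-control statistics. Inputs are
# deltas of the cpu.stat counters over a sampling window (illustrative
# values, not live reads).

def throttle_summary(nr_periods, nr_throttled, throttled_usec):
    """Returns (fraction of periods throttled, mean stall per throttled period in ms)."""
    period_ratio = nr_throttled / nr_periods
    avg_stall_ms = (throttled_usec / nr_throttled) / 1000 if nr_throttled else 0.0
    return period_ratio, avg_stall_ms

# Over 600 periods (60 s at the default 100 ms period), 150 were
# throttled for a total of 4.5 s of stall time.
ratio, stall = throttle_summary(600, 150, 4_500_000)
print(ratio, stall)  # -> 0.25 30.0
```

A quarter of all scheduling periods being throttled, with 30 ms of stall each time, would be a strong signal that the pod's CPU limit is too low.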
Memory
Both KVM and Kubernetes use a memory usage rate metric, but they differ in what memory is counted as “used”.
In KVM, used memory is computed as total minus available, but the performance impact of cache/buffer/slab memory varies by application.
Kubernetes lacks an available metric, so RSS memory is used as the used value.
The monitoring system offers several memory‑related items, illustrated below:
Linux’s free command reports used, cache/buffer, and available memory. Simply subtracting cache/buffer from used can be misleading, because the cache may be performance-critical for some workloads (e.g., Kafka, Elasticsearch).
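The difference between the two "used" definitions can be shown with /proc/meminfo-style fields (the kB values below are illustrative):

```python
# Sketch contrasting two "used memory" definitions, using
# /proc/meminfo-style fields in kB (illustrative values).

def used_total_minus_free(total, free):
    """Naive: counts all cache/buffers as used."""
    return total - free

def used_total_minus_available(total, available):
    """MemAvailable already accounts for reclaimable cache/buffers."""
    return total - available

meminfo = {"MemTotal": 16_000_000, "MemFree": 1_000_000, "MemAvailable": 9_000_000}
naive = used_total_minus_free(meminfo["MemTotal"], meminfo["MemFree"])
realistic = used_total_minus_available(meminfo["MemTotal"], meminfo["MemAvailable"])
print(naive, realistic)  # -> 15000000 7000000
```

The same 16 GB host looks nearly full under one definition and barely half used under the other, which is why the choice of "used" matters for cache-heavy workloads.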
Kubernetes exposes three memory values:
MemUsed – similar to Linux’s used, includes cache.
WorkingSet – excludes “cold” cache data, slightly smaller than MemUsed.
RSS – cache‑free memory.
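The relationship between the three values can be sketched with cgroup-style counters (byte values are illustrative; WorkingSet here is modeled as usage minus inactive file cache, which is how cAdvisor derives it):

```python
# Sketch of how the three Kubernetes memory values relate, using
# cgroup-style counters in bytes (illustrative values).

def memory_views(usage_bytes, inactive_file_bytes, rss_bytes):
    """Returns (MemUsed, WorkingSet, RSS) for one container."""
    mem_used = usage_bytes                           # includes all cache
    working_set = usage_bytes - inactive_file_bytes  # drops "cold" cache
    return mem_used, working_set, rss_bytes          # RSS: no cache at all

used, ws, rss = memory_views(
    usage_bytes=900 * 2**20,          # 900 MiB charged to the cgroup
    inactive_file_bytes=200 * 2**20,  # 200 MiB cold page cache
    rss_bytes=400 * 2**20,            # 400 MiB anonymous memory
)
assert used >= ws >= rss  # MemUsed >= WorkingSet >= RSS always holds here
print(used // 2**20, ws // 2**20, rss // 2**20)  # -> 900 700 400
```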
In practice, WorkingSet often appears high (around 90% for typical web apps), while RSS provides a more stable indicator for most scenarios. Users are encouraged to consult multiple metrics when diagnosing memory bottlenecks.
Disk / Network
Based on Linux cgroup isolation, disk and network metrics differ little between Kubernetes and KVM. The main distinction is that Kubernetes clusters are typically disk‑less unless persistent volumes are used, so disk space monitoring is less relevant.
For disk, the focus is on write performance metrics, shown below:
For network, the usual metrics are traffic and packet loss, illustrated below:
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.