Understanding kubectl top: How Kubernetes Metrics Work and Common Issues
This article explains how kubectl top retrieves real‑time CPU and memory usage for nodes and pods, details the underlying data flow and metric‑server architecture, and addresses frequent errors such as missing components, pause‑container accounting, and differences from host top or docker stats.
1. Introduction
kubectl top provides real‑time resource usage of nodes and pods (CPU, memory). This article explains its data flow and implementation, describes the Kubernetes monitoring architecture, and addresses common problems.
Why does kubectl top report errors?
How does kubectl top node calculate values compared with the host top command?
How does kubectl top pod calculate values, and does it include the pause container?
Why do the values shown by kubectl top pod differ from those seen after exec‑ing into the pod?
Why do kubectl top pod values differ from docker stats?
Tested on Kubernetes 1.8 and 1.13.
2. Usage
kubectl top is a basic command but requires a metrics component to be deployed.
For versions < 1.8, deploy heapster.
For versions ≥ 1.8, deploy metrics‑server.
kubectl top node shows node usage; kubectl top pod shows pod usage. Without a pod name it lists all pods in the current namespace; the --containers flag shows metrics for each container in a pod.
Metric meanings:
CPU unit 100m = 0.1 core; memory unit 1Mi = 1024Ki.
Pod memory is the sum of all business containers, excluding the pause container, and is derived from the container_memory_working_set_bytes metric.
Node values are not the sum of all pod values, and they differ from the values shown by the host top and free commands.
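To make the units concrete, here is a small sketch (not kubectl's actual code) of how the quantity strings printed by kubectl top map to absolute values:

```python
# Illustrative conversion of kubectl top quantity strings.

def parse_cpu(quantity: str) -> float:
    """Convert a kubectl CPU quantity ("100m", "2") to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000.0
    return float(quantity)

def parse_memory(quantity: str) -> int:
    """Convert a kubectl memory quantity ("128Mi", "1Gi") to bytes."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain bytes

print(parse_cpu("100m"))    # 0.1 core
print(parse_memory("1Mi"))  # 1048576 bytes (1024 Ki)
```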
3. Implementation Principles
3.1 Data flow
kubectl top, the Kubernetes dashboard, and the HPA use the same metric data. The flow is:
When heapster is used, the apiserver proxies metric requests to the heapster service inside the cluster.
When metrics‑server is used, the apiserver accesses metrics via the /apis/metrics.k8s.io endpoint.
This request path is comparable to that of an ordinary command such as kubectl get pod: the apiserver serves the metric API directly rather than proxying it to a service inside the cluster.
3.2 Metric API
Heapster uses proxy forwarding, which is unstable and version‑uncontrolled. It lacks full authentication and client integration. The metric API should be a first‑class resource, similar to other Kubernetes APIs.
Proxy is only for troubleshooting and is not stable.
Heapster cannot benefit from apiserver’s auth and generic server features.
Pod metrics are core for HPA and should be exposed as a resource under the metrics.k8s.io API group.
Since Kubernetes 1.8, heapster is being deprecated in favor of the metric API implemented by metrics‑server.
3.3 kube‑aggregator
Metrics‑server registers its API under /apis/metrics.k8s.io via kube‑aggregator, which extends the apiserver to allow custom API registration. It provides dynamic registration, discovery, aggregation, and secure proxying.
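The registration is expressed as an APIService object. The manifest below follows the standard metrics-server deployment; names and flags may differ slightly between versions:

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  # Requests for /apis/metrics.k8s.io/v1beta1 are proxied to this service.
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
```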
3.4 Monitoring system
Kubernetes defines two metric categories:
Core metrics : collected from Kubelet/cAdvisor and provided by metrics‑server for Dashboard and HPA.
Custom metrics : exposed via Prometheus Adapter (custom.metrics.k8s.io) to support arbitrary Prometheus metrics.
Core metrics (CPU, memory) are sufficient for HPA; custom metrics enable scaling based on application‑specific signals such as request QPS or error rates.
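As an illustration of the custom-metric path, a HorizontalPodAutoscaler can target a Prometheus-derived metric. The manifest below uses the autoscaling/v2beta1 API (the version available in the 1.8–1.13 range tested here); the deployment name and metric name are hypothetical:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      # Served by Prometheus Adapter via custom.metrics.k8s.io
      metricName: http_requests_per_second
      targetAverageValue: "100"
```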
3.5 Kubelet
Kubelet exposes summary metrics at 127.0.0.1:10255/metrics (or on the TLS port 10250 after 1.11) and cAdvisor metrics at /metrics/cadvisor. These endpoints provide node and pod aggregate data.
3.6 cAdvisor
cAdvisor, written in Go, collects container‑level stats (CPU, memory, network, filesystem) and exposes them via HTTP. It uses a memory storage and sysfs to gather data.
3.7 cgroup
All metric values ultimately come from cgroup files, e.g., /sys/fs/cgroup/memory/docker/[containerId]/memory.usage_in_bytes. Memory usage, limits, and usage ratios are derived from these files. Among them, memory.stat contains the most complete set of information.
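The sketch below shows how a monitoring agent derives a usage ratio from these cgroup v1 files. The values are illustrative; on a real host they would be read from the paths above:

```python
# Illustrative derivation of memory figures from cgroup v1 file contents.

def parse_memory_stat(text: str) -> dict:
    """Parse the key/value pairs of a cgroup memory.stat file."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value:
            stats[key] = int(value)
    return stats

# Sample memory.stat content (made-up numbers).
sample_stat = """\
cache 262144
rss 1048576
total_inactive_file 131072
total_active_file 65536
"""

stats = parse_memory_stat(sample_stat)
usage = 1572864   # would come from memory.usage_in_bytes
limit = 4194304   # would come from memory.limit_in_bytes
print(f"usage ratio: {usage / limit:.2%}")  # usage ratio: 37.50%
```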
4. Common Issues
4.1 Why does kubectl top report errors?
No heapster or metrics-server is deployed, or its pod is unhealthy.
The pod was just created and metrics have not been collected yet (the default collection interval is 1 minute).
Check whether the kubelet read-only port (10255) is open; use the TLS port 10250 if necessary.
4.2 How is pod memory calculated? Does it include the pause container?
The pause container consumes a few megabytes, but cAdvisor's pod memory list excludes it, so kubectl top pod does not count pause memory. The reported memory is container_memory_working_set_bytes, calculated as container_memory_usage_bytes − total_inactive_file.
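The working-set formula and the pause exclusion can be sketched as follows; the container names and counter values are made up for illustration:

```python
# Sketch of the working-set formula behind kubectl top pod memory.

def working_set_bytes(usage_bytes: int, total_inactive_file: int) -> int:
    """container_memory_working_set_bytes =
       container_memory_usage_bytes - total_inactive_file (floored at 0)."""
    return max(0, usage_bytes - total_inactive_file)

# A pod total sums business containers only; pause is excluded.
containers = [
    {"name": "app",     "usage": 300 * 1024**2, "inactive_file": 50 * 1024**2},
    {"name": "sidecar", "usage": 80 * 1024**2,  "inactive_file": 10 * 1024**2},
]
pod_memory = sum(
    working_set_bytes(c["usage"], c["inactive_file"]) for c in containers
)
print(pod_memory // 1024**2, "Mi")  # 320 Mi
```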
4.3 How does kubectl top node differ from the host top command?
kubectl top node reports the cgroup root statistics, not the sum of all pod metrics, and its calculation differs from the host top output.
4.4 Why do pod top values differ from exec‑ing into the pod and running top?
Running top inside a pod shows the host's total resources: the /proc filesystem is not namespaced for memory, so cgroup limits are not reflected, and shared memory is accounted differently.
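A quick sketch of why this happens: tools like top and free read /proc/meminfo, which (without something like lxcfs) reflects the host. The file content and the 2Gi limit below are illustrative:

```python
# Illustrative /proc/meminfo content as a container would see it (host values).
sample_meminfo = """\
MemTotal:       32780168 kB
MemFree:         2048000 kB
MemAvailable:   12288000 kB
"""

def meminfo_kb(text: str, field: str) -> int:
    """Extract a field (in kB) from /proc/meminfo-style text."""
    for line in text.splitlines():
        if line.startswith(field + ":"):
            return int(line.split()[1])
    raise KeyError(field)

host_total_kb = meminfo_kb(sample_meminfo, "MemTotal")
cgroup_limit_kb = 2 * 1024 * 1024  # e.g. a 2Gi pod limit from memory.limit_in_bytes
# top inside the pod reports ~32Gi of total memory, not the 2Gi cgroup limit.
print(host_total_kb, cgroup_limit_kb)
```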
4.5 Why do kubectl top pod values differ from docker stats?
docker stats reports container_memory_usage_bytes − container_memory_cache, which yields a smaller value than kubectl top, which uses container_memory_working_set_bytes (container_memory_usage_bytes − total_inactive_file).
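The difference can be seen by applying both formulas to the same raw counters; the values below are made up for illustration:

```python
# Comparing the docker stats and kubectl top memory formulas.

usage_bytes         = 500 * 1024**2  # container_memory_usage_bytes
cache_bytes         = 120 * 1024**2  # container_memory_cache
total_inactive_file =  80 * 1024**2  # inactive subset of the page cache

docker_stats_mem = usage_bytes - cache_bytes          # docker stats: drops ALL cache
kubectl_top_mem  = usage_bytes - total_inactive_file  # working set: keeps active cache

print(docker_stats_mem // 1024**2, "Mi")  # 380 Mi
print(kubectl_top_mem // 1024**2, "Mi")   # 420 Mi (larger)
```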
5. Conclusion
In most cases you do not need to monitor node or pod usage manually because Cluster Autoscaler and HPA handle scaling. Persisting cAdvisor data with Prometheus is recommended for historical analysis and alerting. Note that storage support in kubectl top was added only after version 1.16, and earlier versions required heapster.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.