Mastering Kubelet Eviction: How to Manage Node Resource Shortages in Kubernetes
This article explains the kubelet’s key responsibilities, how it detects resource scarcity through eviction signals and thresholds, and provides practical guidance on configuring hard and soft eviction policies, reclamation strategies, and QoS‑based pod ranking to keep Kubernetes nodes stable.
Kubelet Overview
The kubelet is the primary node component in Kubernetes, responsible for registering the node with the API server, watching scheduled Pods, launching containers via the container runtime, reporting container status, executing liveness probes, managing static Pods, and collecting node and container metrics.
Resource Exhaustion Handling
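When a node runs short of an incompressible resource such as memory or disk space, the kubelet can evict Pods to preserve node stability. CPU is compressible, so a CPU shortage leads to throttling rather than eviction. Administrators should understand best practices for configuring eviction thresholds and resource reservations to keep nodes flexible while maintaining overall fault tolerance.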
How Kubelet Determines Resource Shortage
The kubelet uses eviction signals and eviction thresholds. An eviction signal reflects how much of a resource (e.g., memory or storage) is currently available, while the corresponding eviction threshold defines the minimum value that should be maintained. The signals and their default hard thresholds are:
memory.available – available memory; eviction starts when it drops below 100 Mi.
nodefs.available – filesystem used for volumes and logs; eviction starts when usage exceeds 90 % (available < 10 %).
nodefs.inodesFree – inode availability on nodefs; eviction starts when free inodes drop below 5 %.
imagefs.available – filesystem for container images; eviction starts when available space is < 15 %.
imagefs.inodesFree – inode availability on imagefs; no default threshold.
These defaults can be overridden with kubelet flags to set custom eviction thresholds.
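The comparison between a signal and its threshold can be sketched in a few lines of Python. This is an illustrative model rather than kubelet code: the function names, the `observed` data layout, and the sample node values are all assumptions made for the example, and only the default thresholds listed above are included.

```python
# Default hard eviction thresholds as listed above.
# Values ending in "%" are relative to the resource's capacity.
DEFAULT_THRESHOLDS = {
    "memory.available": "100Mi",
    "nodefs.available": "10%",
    "nodefs.inodesFree": "5%",
    "imagefs.available": "15%",
}

UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def threshold_value(threshold, capacity):
    """Convert a threshold such as '100Mi' or '10%' to an absolute number."""
    if threshold.endswith("%"):
        return capacity * float(threshold[:-1]) / 100
    for suffix, factor in UNITS.items():
        if threshold.endswith(suffix):
            return float(threshold[: -len(suffix)]) * factor
    return float(threshold)


def breached_signals(observed, thresholds=DEFAULT_THRESHOLDS):
    """Return the signals whose available amount is below the threshold.

    `observed` maps signal name -> (available, capacity).
    """
    return [
        signal
        for signal, (available, capacity) in observed.items()
        if signal in thresholds
        and available < threshold_value(thresholds[signal], capacity)
    ]


# Hypothetical node state used only for illustration.
node = {
    "memory.available": (80 * 1024**2, 8 * 1024**3),    # 80Mi free of 8Gi
    "nodefs.available": (30 * 1024**3, 100 * 1024**3),  # 30% free
    "nodefs.inodesFree": (40_000, 1_000_000),           # 4% free
}
print(breached_signals(node))  # ['memory.available', 'nodefs.inodesFree']
```

Percentage thresholds are resolved against capacity, which is why the observed values carry both numbers.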
Hard vs. Soft Eviction
Hard eviction thresholds trigger immediate reclamation with no grace period. Soft thresholds include a user‑defined grace period that must expire before reclamation begins. The following sets a hard threshold for memory:
--eviction-hard="memory.available<1Gi"
For soft eviction with a 90‑second grace period, the grace period is specified per signal:
--eviction-soft="memory.available<2Gi" --eviction-soft-grace-period="memory.available=1m30s"
The --eviction-max-pod-grace-period flag caps, in seconds, the termination grace period granted to Pods that are evicted because of a soft threshold.
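The difference between the two kinds of threshold comes down to a timer. The sketch below is a simplified model, not the kubelet's implementation: the function name and the sample series are hypothetical, and it assumes the signal is sampled at discrete timestamps. A soft threshold triggers only after the signal has stayed below the threshold for the whole grace period; any recovery resets the clock.

```python
def soft_eviction_starts(samples, threshold, grace_period):
    """Return the timestamp at which a soft threshold would trigger, or None.

    samples: list of (timestamp_seconds, available) in time order.
    The threshold triggers once `available` has stayed below `threshold`
    for at least `grace_period` seconds without recovering.
    """
    breached_since = None
    for timestamp, available in samples:
        if available < threshold:
            if breached_since is None:
                breached_since = timestamp
            if timestamp - breached_since >= grace_period:
                return timestamp
        else:
            breached_since = None  # recovery resets the timer
    return None


# Available memory in Gi, sampled every 30 seconds (made-up values).
series = [(0, 2.5), (30, 1.8), (60, 2.1), (90, 1.9),
          (120, 1.7), (150, 1.6), (180, 1.5)]

# memory.available<2Gi with a 90-second grace period
print(soft_eviction_starts(series, threshold=2.0, grace_period=90))  # 180
```

The dip at 30 seconds does not count because memory recovers at 60 seconds; the continuous breach begins at 90 seconds and reaches the grace period at 180. A hard threshold corresponds to a grace period of zero in this model.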
How Kubelet Reclaims Resources
The kubelet first tries to reclaim node‑level resources by deleting dead Pods and their containers and by removing unused container images. If the node has a dedicated imagefs in addition to nodefs, the action depends on which filesystem reaches its threshold: the kubelet deletes dead Pods and containers for nodefs and unused images for imagefs. If there is no dedicated imagefs, the kubelet deletes dead Pods and containers first, then unused images.
If reclamation of images and Pods does not relieve pressure, the kubelet may evict user Pods as a last resort, ranking candidates by QoS class (Guaranteed, Burstable, Best‑Effort), pod priority, and resource requests.
Kubernetes QoS Classes
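Guaranteed – Every container has CPU and memory limits, and those limits equal its requests.
Burstable – At least one container has a CPU or memory request or limit, but the Pod does not meet the Guaranteed criteria.
Best‑Effort – No container sets any resource requests or limits.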
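As a rough illustration, the classification can be written as a small function. This is a sketch under stated assumptions: the `qos_class` name and the dictionary layout for containers are invented for the example, and quantities are compared as plain strings, so "1" and "1000m" would be treated as different even though Kubernetes considers them equal.

```python
def qos_class(containers):
    """Classify a Pod from its containers' resource settings.

    containers: list of dicts with optional 'requests' and 'limits',
    each mapping 'cpu' / 'memory' to a quantity string.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # An omitted request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{"requests": {"cpu": "500m", "memory": "1Gi"},
                  "limits": {"cpu": "500m", "memory": "1Gi"}}]))  # Guaranteed
print(qos_class([{"requests": {"memory": "256Mi"}}]))             # Burstable
print(qos_class([{}]))                                            # BestEffort
```

A Pod with one fully specified container and one container without any settings is Burstable in this model, because the Guaranteed condition must hold for every container.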
Eviction Ranking Rules
Pods exceeding their resource requests are considered first.
If no pod exceeds its request, the kubelet evaluates pod priority, evicting lower‑priority pods before higher‑priority ones.
The first eviction candidates are Best‑Effort and Burstable pods whose usage exceeds their requests; if there are several, they are sorted by priority and then by how far their usage exceeds the request.
Finally, Guaranteed and Burstable pods that use less than their requests may be evicted, especially if system components need resources.
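The ranking rules above can be approximated with a three-part sort key. The sketch below is a simplified model for memory pressure only, not the kubelet's actual code; the `Pod` dataclass, its field names, and the sample workloads are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    priority: int
    memory_request: int  # MiB
    memory_usage: int    # MiB


def eviction_order(pods):
    """Sort pods so that the most likely eviction candidate comes first."""
    return sorted(
        pods,
        key=lambda p: (
            p.memory_usage <= p.memory_request,     # over-request pods first
            p.priority,                             # then lower priority first
            -(p.memory_usage - p.memory_request),   # then largest overage first
        ),
    )


pods = [
    Pod("db", priority=1000, memory_request=1024, memory_usage=800),
    Pod("api", priority=1000, memory_request=512, memory_usage=600),
    Pod("cache", priority=0, memory_request=200, memory_usage=450),
    Pod("batch", priority=0, memory_request=0, memory_usage=300),
]
print([p.name for p in eviction_order(pods)])
# ['batch', 'cache', 'api', 'db']
```

Here `batch` and `cache` share the lowest priority and both exceed their requests, so the larger overage (300 MiB versus 250 MiB) decides between them. `db` stays below its request and is therefore ranked last.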
Minimum Reclaim
To avoid frequent small evictions, the --eviction-minimum-reclaim flag sets a minimum amount of resources to reclaim for each resource type.
--eviction-hard="memory.available<1Gi,nodefs.available<2Gi,imagefs.available<200Gi"
--eviction-minimum-reclaim="memory.available=0Mi,nodefs.available=1Gi,imagefs.available=2Gi"
With these settings, reclamation triggered by nodefs continues until at least 3 Gi is available, and reclamation triggered by imagefs continues until at least 202 Gi is available.
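The reclaim target for each signal is simply the hard threshold plus the minimum reclaim amount. A minimal sketch of that arithmetic, using the values from the flags above expressed in Gi (the variable names are chosen for this example):

```python
# Values from the example flags, in Gi.
eviction_hard = {"memory.available": 1, "nodefs.available": 2, "imagefs.available": 200}
minimum_reclaim = {"memory.available": 0, "nodefs.available": 1, "imagefs.available": 2}

# Reclamation continues until the signal reaches threshold + minimum reclaim.
reclaim_target = {
    signal: threshold + minimum_reclaim.get(signal, 0)
    for signal, threshold in eviction_hard.items()
}
print(reclaim_target)
# {'memory.available': 1, 'nodefs.available': 3, 'imagefs.available': 202}
```

With a minimum reclaim of zero, as for memory here, reclamation can stop as soon as the signal is back above the threshold.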
Node Conditions and Eviction Pressure
When an eviction threshold is met, the kubelet sets the corresponding node condition (e.g., MemoryPressure), and the node is tainted so that new Pods are not scheduled on it. A node that hovers just above and below a soft eviction threshold can cause the condition to oscillate between true and false, leading to scheduling uncertainty. The --eviction-pressure-transition-period flag defines how long the kubelet must wait before transitioning a node condition out of the pressure state.
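The damping effect of the transition period can be illustrated with a small simulation. This is a simplified model with hypothetical names and sample data, assuming the signal is observed at discrete timestamps: once a threshold has been breached, the condition stays true until no breach has been seen for the whole transition period.

```python
def pressure_condition(samples, transition_period):
    """Return [(timestamp, under_pressure)] for a series of observations.

    samples: list of (timestamp_seconds, breached) in time order.
    The condition stays True until `transition_period` seconds have
    passed since the most recent breach.
    """
    last_breach = None
    result = []
    for timestamp, breached in samples:
        if breached:
            last_breach = timestamp
        under_pressure = last_breach is not None and (
            timestamp - last_breach < transition_period
        )
        result.append((timestamp, under_pressure))
    return result


observations = [(0, False), (60, True), (120, False),
                (240, False), (360, False), (420, False)]
print(pressure_condition(observations, transition_period=300))
# [(0, False), (60, True), (120, True), (240, True), (360, False), (420, False)]
```

The single breach at 60 seconds keeps the condition true until 360 seconds, so a brief dip does not cause the condition to flip on every sample.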
Simple Resource Shortage Handling Example
Assume a node with 10 Gi RAM, reserving 10 % for system daemons and aiming to evict Pods when 95 % of memory is used. The following kubelet flags achieve this:
--eviction-hard="memory.available<500Mi"
--system-reserved=memory=1.5Gi
The system‑reserved value includes both the intended 10 % reservation (1 Gi) and the memory covered by the eviction threshold (0.5 Gi).
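The numbers in this example follow from two percentages of node capacity. A short sketch of the arithmetic, working in Mi (the variable names are assumptions made for the example):

```python
capacity_mi = 10 * 1024          # 10 Gi node
reserve_percent = 10             # share kept for system daemons
evict_at_used_percent = 95       # evict once this much memory is in use

reservation_mi = capacity_mi * reserve_percent // 100
threshold_mi = capacity_mi * (100 - evict_at_used_percent) // 100
system_reserved_mi = reservation_mi + threshold_mi

print(reservation_mi)      # 1024 -> 1 Gi
print(threshold_mi)        # 512  -> 0.5 Gi
print(system_reserved_mi)  # 1536 -> 1.5 Gi
```

The exact 5 % figure is 512 Mi; the 500Mi used in the --eviction-hard flag is a rounded value of the same threshold.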
Conclusion
The article outlines practical Kubernetes administration techniques for customizing kubelet eviction behavior, allowing administrators to set custom thresholds and grace periods while emphasizing the responsibility that comes with this flexibility. Default eviction settings are generally sufficient, so any adjustments should be made cautiously.
MaGe Linux Operations
