Inside Kubelet: How Pod Admission Works
This article dissects Kubelet's Pod admission pipeline, explaining how syncLoopIteration gathers pod data, how HandlePodAdditions invokes canAdmitPod, and how six registered admit handlers—Eviction, System Allowlist, Resource Allocation, Predicate, AppArmor, and Shutdown—evaluate each pod with concrete code examples and decision logic.
The Kubelet's syncLoopIteration method receives updates from several channels. One of them, configCh, delivers Pod configuration changes from the Kubelet's configured sources (the API server, static-pod manifest files, and HTTP endpoints). When a new Pod arrives, the HandlePodAdditions handler calls canAdmitPod to decide whether the Pod may run on the node.

canAdmitPod builds a PodAdmitAttributes struct containing the target Pod and the list of other active, non-terminated Pods. If the InPlacePodVerticalScaling feature gate is enabled, it deep-copies each existing Pod, applies its current resource allocation, and replaces the OtherPods slice with the updated copies. It then iterates over the Kubelet's admitHandlers collection, invoking each handler's Admit method. If any handler returns Admit: false, the function logs the denial with klog.InfoS and returns false together with the handler's reason and message; otherwise it returns true.
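The decision flow can be sketched with simplified stand-in types. The real Kubelet uses lifecycle.PodAdmitHandler and lifecycle.PodAdmitResult from its lifecycle package; the sketch below is a minimal first-denial-wins loop under those assumptions, not the actual implementation:

```go
package main

// Minimal sketch of Kubelet-style pod admission. The type and function
// names here are simplified stand-ins, not the real lifecycle package API.

// PodAdmitResult mirrors the shape of Kubelet's admission verdict.
type PodAdmitResult struct {
	Admit   bool
	Reason  string
	Message string
}

// PodAdmitHandler is implemented by each admission policy.
type PodAdmitHandler interface {
	Admit(pod string) PodAdmitResult
}

// handlerFunc adapts a plain function to the PodAdmitHandler interface.
type handlerFunc func(pod string) PodAdmitResult

func (f handlerFunc) Admit(pod string) PodAdmitResult { return f(pod) }

// canAdmitPod runs every registered handler in order; the first denial
// short-circuits the loop, and its reason and message are surfaced.
func canAdmitPod(handlers []PodAdmitHandler, pod string) (bool, string, string) {
	for _, h := range handlers {
		if r := h.Admit(pod); !r.Admit {
			return false, r.Reason, r.Message
		}
	}
	return true, "", ""
}
```

The key property this captures is that handlers are independent policies composed behind a single loop: each one only needs to return a verdict, and the loop stops at the first rejection.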
Registered Admit Handlers
Eviction Admit
System Allowlist Admit
Allocate Resources Admit
Predicate Admit
AppArmor Admit
Shutdown Admit
Eviction Admit
This handler rejects a pod when the node is under resource-pressure conditions that would jeopardize stability. It always admits critical pods (static, mirror, or high-priority pods). If the node's only condition is memory pressure, non-best-effort pods are admitted, and best-effort pods are admitted only if they tolerate the memory-pressure taint; any other pod is denied with a formatted message listing the node conditions.
```go
func (m *managerImpl) Admit(ctx context.Context, attrs *lifecycle.PodAdmitAttributes) lifecycle.PodAdmitResult {
	m.RLock()
	defer m.RUnlock()
	if len(m.nodeConditions) == 0 {
		return lifecycle.PodAdmitResult{Admit: true}
	}
	// Critical pods are always admitted, regardless of node conditions.
	if kubelettypes.IsCriticalPod(attrs.Pod) {
		return lifecycle.PodAdmitResult{Admit: true}
	}
	// If memory pressure is the only condition, admit everything except
	// best-effort pods that do not tolerate the memory-pressure taint.
	nodeOnlyHasMemoryPressureCondition := hasNodeCondition(m.nodeConditions, v1.NodeMemoryPressure) && len(m.nodeConditions) == 1
	if nodeOnlyHasMemoryPressureCondition {
		notBestEffort := v1.PodQOSBestEffort != v1qos.GetPodQOS(attrs.Pod)
		if notBestEffort {
			return lifecycle.PodAdmitResult{Admit: true}
		}
		if corev1helpers.TolerationsTolerateTaint(attrs.Pod.Spec.Tolerations, &v1.Taint{
			Key:    v1.TaintNodeMemoryPressure,
			Effect: v1.TaintEffectNoSchedule,
		}) {
			return lifecycle.PodAdmitResult{Admit: true}
		}
	}
	return lifecycle.PodAdmitResult{Admit: false, Reason: Reason, Message: fmt.Sprintf(nodeConditionMessageFmt, m.nodeConditions)}
}
```

System Allowlist Admit
This handler validates user-specified sysctls. If a pod lacks a SecurityContext or its Sysctls slice is empty, the pod is admitted. Otherwise each sysctl name is checked against the configured allowlist; a violation results in Admit: false with a reason of ForbiddenReason and a message identifying the offending sysctl.
```go
func (w *patternAllowlist) Admit(_ context.Context, attrs *lifecycle.PodAdmitAttributes) lifecycle.PodAdmitResult {
	pod := attrs.Pod
	if pod.Spec.SecurityContext == nil || len(pod.Spec.SecurityContext.Sysctls) == 0 {
		return lifecycle.PodAdmitResult{Admit: true}
	}
	for _, s := range pod.Spec.SecurityContext.Sysctls {
		if err := w.validateSysctl(s.Name, pod.Spec.HostNetwork, pod.Spec.HostIPC); err != nil {
			return lifecycle.PodAdmitResult{Admit: false, Reason: ForbiddenReason, Message: fmt.Sprintf("forbidden sysctl: %v", err)}
		}
	}
	return lifecycle.PodAdmitResult{Admit: true}
}
```

Allocate Resources Admit
This handler ensures that the pod's requested resources (CPU, memory, device-plugin resources, and so on) can be allocated on the node. It delegates to the topologyManager, which in turn calls the managers registered as hint providers (cpuManager, memoryManager, deviceManager). The actual allocation is performed by each provider's Allocate method; any error causes admission to fail.
```go
func (s *scope) allocateAlignedResources(ctx context.Context, pod *v1.Pod, container *v1.Container) error {
	// Each hint provider (CPU, memory, and device managers) performs its
	// own allocation; the first failure aborts admission.
	for _, provider := range s.hintProviders {
		if err := provider.Allocate(ctx, pod, container); err != nil {
			return err
		}
	}
	return nil
}
```

Predicate Admit
This handler re‑evaluates the scheduler’s predicate results. It obtains node information, checks OS selector constraints, validates init‑container restart policies when the SidecarContainers feature is disabled, and updates plugin resources. It then runs general predicates; if any fail, the first failure’s name and error message are returned as the admission reason.
```go
func (w *predicateAdmitHandler) Admit(_ context.Context, attrs *PodAdmitAttributes) PodAdmitResult {
	node, err := w.getNodeAnyWayFunc()
	if err != nil {
		return PodAdmitResult{Admit: false, Reason: "InvalidNodeInfo", Message: "Kubelet cannot get node info."}
	}
	admitPod := attrs.Pod
	// Build scheduler-style node info from the other pods already on the node.
	pods := attrs.OtherPods
	nodeInfo := schedulerframework.NewNodeInfo(pods...)
	nodeInfo.SetNode(node)
	// OS selector checks omitted for brevity
	if err = w.pluginResourceUpdateFunc(nodeInfo, attrs); err != nil {
		message := fmt.Sprintf("Update plugin resources failed due to %v, which is unexpected.", err)
		return PodAdmitResult{Admit: false, Reason: "UnexpectedAdmissionError", Message: message}
	}
	// Remove missing extended resources, then run general predicates.
	podWithoutMissingExtendedResources := removeMissingExtendedResources(admitPod, nodeInfo)
	reasons := generalFilter(podWithoutMissingExtendedResources, nodeInfo)
	if len(reasons) > 0 {
		// Only the first failure is reported back to the user.
		r := reasons[0]
		switch re := r.(type) {
		case *PredicateFailureError:
			return PodAdmitResult{Admit: false, Reason: re.PredicateName, Message: re.Error()}
		case *InsufficientResourceError:
			return PodAdmitResult{Admit: false, Reason: fmt.Sprintf("OutOf%s", re.ResourceName), Message: re.Error()}
		default:
			return PodAdmitResult{Admit: false, Reason: "UnexpectedPredicateFailureType", Message: fmt.Sprintf("GeneralPredicates failed due to %v, which is unexpected.", r)}
		}
	}
	return PodAdmitResult{Admit: true}
}
```

AppArmor Admit
For Linux nodes, this handler validates the pod against AppArmor profiles. Pods that are not in Pending state are automatically admitted. Otherwise the handler calls a.Validate; a nil error results in admission, while any error yields Admit: false with reason "AppArmor" and a formatted message.
```go
func (a *appArmorAdmitHandler) Admit(_ context.Context, attrs *PodAdmitAttributes) PodAdmitResult {
	// Only validate pods that have not yet started running.
	if attrs.Pod.Status.Phase != v1.PodPending {
		return PodAdmitResult{Admit: true}
	}
	if err := a.Validate(attrs.Pod); err != nil {
		return PodAdmitResult{Admit: false, Reason: "AppArmor", Message: fmt.Sprintf("Cannot enforce AppArmor: %v", err)}
	}
	return PodAdmitResult{Admit: true}
}
```

Shutdown Admit
If the node is in the process of shutting down, this handler rejects every pod by returning Admit: false with predefined reason and message constants; otherwise it admits the pod.
```go
func (m *managerImpl) Admit(ctx context.Context, attrs *lifecycle.PodAdmitAttributes) lifecycle.PodAdmitResult {
	if m.ShutdownStatus() != nil {
		return lifecycle.PodAdmitResult{Admit: false, Reason: nodeShutdownNotAdmittedReason, Message: nodeShutdownNotAdmittedMessage}
	}
	return lifecycle.PodAdmitResult{Admit: true}
}
```

In summary, a pod's admission in Kubelet proceeds through canAdmitPod, which sequentially invokes the six handlers listed above. Each handler implements a specific policy—node stability, sysctl allowlisting, resource allocation, scheduler predicates, AppArmor confinement, and node shutdown—returning a boolean decision together with a concise reason and message when admission is denied.
Infra Learning Club
Infra Learning Club shares study notes, cutting-edge technology, and career discussions.