Advanced Kubernetes Pod Scheduling: Node Selectors, Affinity, Taints & Probes
This guide explains how to control Kubernetes pod placement using nodeName, nodeSelector, node and pod affinity (hard and soft), and taints and tolerations, and covers pod lifecycle features such as restart policies, init containers, lifecycle hooks, and liveness/readiness probes, with concrete YAML examples and commands.
Pod node selection with nodeName and nodeSelector
By default, the Kubernetes scheduler picks a suitable node for each pod based on resource requests and scoring; you cannot predict which node that will be. To pin a pod onto a specific node, set the nodeName field in the pod spec (this bypasses the scheduler entirely). To schedule onto any node that carries a particular label, use nodeSelector together with node labels.
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
  labels:
    app: myapp
    env: dev
spec:
  nodeName: node2
  containers:
  - name: tomcat-pod-java
    ports:
    - containerPort: 8080
    image: tomcat:8.5-jre8-alpine
    imagePullPolicy: IfNotPresent
  - name: busybox
    image: busybox:latest
    command: ["/bin/sh","-c","sleep 3600"]

Label a node and use nodeSelector:
# Label a node
kubectl label nodes node1 disk=ceph

# Pod spec using nodeSelector
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
  labels:
    app: myapp
    env: dev
spec:
  nodeSelector:
    disk: ceph
  containers:
  - name: tomcat-pod-java
    ports:
    - containerPort: 8080
    image: tomcat:8.5-jre8-alpine
    imagePullPolicy: IfNotPresent
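After applying a manifest, you can confirm the label and the placement with standard kubectl commands:

# Verify the node label and where the pod landed
kubectl get nodes --show-labels | grep disk
kubectl get pod demo-pod -o wide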
Node affinity (hard and soft)
Node affinity provides more expressive rules than nodeSelector. Use requiredDuringSchedulingIgnoredDuringExecution for hard affinity (the rule must match or the pod stays Pending) and preferredDuringSchedulingIgnoredDuringExecution for soft affinity (preferred but not required). Each soft rule carries a weight from 1 to 100 that the scheduler adds to a node's score when the preference matches.
# Hard affinity example (zone=foo or zone=bar required)
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-demo
  namespace: default
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
            - bar

Apply the pod and label a node to satisfy the rule:
kubectl apply -f pod-nodeaffinity-demo.yaml
kubectl label nodes node2 zone=foo
kubectl get pods -o wide | grep pod-node

Soft affinity example (prefer nodes where zone1 is foo1 or bar1):
# Soft affinity example
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-demo-2
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 60
        preference:
          matchExpressions:
          - key: zone1
            operator: In
            values:
            - foo1
            - bar1
Pod affinity and anti‑affinity
Pod affinity lets you co‑locate pods that share a label, while anti‑affinity spreads them across different topology domains (the domain is identified by topologyKey, e.g. kubernetes.io/hostname for per‑node placement). Both support hard (requiredDuringSchedulingIgnoredDuringExecution) and soft (preferredDuringSchedulingIgnoredDuringExecution) rules.
# Pod affinity (hard) – second pod must run on the same node as a pod with app=myapp
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: backend
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - myapp
        topologyKey: kubernetes.io/hostname

# Pod anti‑affinity (hard) – second pod must NOT run on the same node as a pod with app=myapp
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
spec:
  containers:
  - name: busybox
    image: busybox:latest
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - myapp
        topologyKey: kubernetes.io/hostname
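Both examples assume a pod labeled app=myapp is already running somewhere in the cluster; with the hard affinity rule and no such pod, pod-second stays Pending. A quick way to create a target pod (the name pod-first is arbitrary):

kubectl run pod-first --image=busybox --labels="app=myapp" -- sh -c "sleep 3600"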
Taints and tolerations
Taints are placed on nodes to repel pods that do not tolerate them; tolerations are added to pod specs to allow scheduling onto tainted nodes. The effect can be NoSchedule (new pods are not scheduled), PreferNoSchedule (the scheduler tries to avoid the node), or NoExecute (new pods are not scheduled, and already‑running pods without a matching toleration are evicted).
# Add a production taint to node1
kubectl taint nodes node1 node-type=production:NoSchedule

# Deployment whose pods carry a toleration for the node-type taint
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "production"
        effect: "NoExecute"
        tolerationSeconds: 3600

As written, these pods will not land on node1: a toleration's effect must match the taint's effect, and NoExecute does not match NoSchedule. Changing the effect to NoSchedule (and dropping tolerationSeconds, which is only valid with NoExecute) makes the toleration match the taint above, while using operator: Exists makes the toleration match a broader set of taints.
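For example, this toleration (a minimal sketch) matches any taint whose key is node-type, regardless of value or effect, because both value and effect are omitted; the kubectl command afterwards removes the taint again (note the trailing hyphen):

tolerations:
- key: "node-type"
  operator: "Exists"

# Remove the taint from node1
kubectl taint nodes node1 node-type=production:NoSchedule-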
Pod lifecycle features
Restart policy
The restartPolicy field (default Always) controls whether the kubelet restarts the pod's containers after they exit. Options are Always (restart regardless of exit code), OnFailure (restart only after a non‑zero exit), and Never.
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  restartPolicy: Always
  containers:
  - name: tomcat-pod-java
    image: tomcat:8.5-jre8-alpine
    ports:
    - containerPort: 8080
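To see how the policies differ, here is a minimal sketch (pod and container names are arbitrary) of a container that always exits non‑zero; with OnFailure the kubelet keeps restarting it with back‑off, while Never would leave the pod in a Failed state:

apiVersion: v1
kind: Pod
metadata:
  name: restart-demo
spec:
  restartPolicy: OnFailure
  containers:
  - name: always-fails
    image: busybox
    command: ["sh","-c","echo failing; exit 1"]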
Init containers
Init containers run to completion, one at a time in order, before the main containers start, and they share the pod's volumes.
# Example init container that prepares data on a shared volume
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  initContainers:
  - name: init-data
    image: busybox
    # write a file that the main container will read
    command: ["sh","-c","echo 'prepared by init' > /app/data"]
    volumeMounts:
    - name: app-volume
      mountPath: /app
  containers:
  - name: main-app
    image: busybox
    command: ["sh","-c","cat /app/data && sleep 3600"]
    volumeMounts:
    - name: app-volume
      mountPath: /app
  volumes:
  - name: app-volume
    emptyDir: {}
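While the init container runs, the pod reports an Init:0/1 status; you can watch the transition and inspect events with:

kubectl get pod init-demo -w
kubectl describe pod init-demo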
Lifecycle hooks
Containers can define postStart and preStop hooks to run commands after the container starts or before it is terminated.
containers:
- name: war
  image: sample:v2
  lifecycle:
    postStart:
      exec:
        command: ["cp","/sample.war","/app"]
    preStop:
      httpGet:
        host: monitor.com
        path: /warning
        port: 8080
        scheme: HTTP
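In practice, preStop is most often an exec action that shuts the main process down gracefully; a minimal sketch for a container running the stock nginx image:

lifecycle:
  preStop:
    exec:
      command: ["/usr/sbin/nginx","-s","quit"]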
Probes (liveness and readiness)
Probes determine whether a container is healthy (livenessProbe) or ready to receive traffic (readinessProbe). They can use exec, httpGet, or tcpSocket actions.
# Liveness probe using exec
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: busybox
    args: ["/bin/sh","-c","touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600"]
    livenessProbe:
      initialDelaySeconds: 10
      periodSeconds: 5
      exec:
        command: ["cat","/tmp/healthy"]

Here /tmp/healthy exists for the first 30 seconds, so the probe succeeds; once the file is removed the probe fails and the kubelet restarts the container.

# Liveness probe using HTTP
apiVersion: v1
kind: Pod
metadata:
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: mydlqclub/springboot-helloworld:0.0.1
    livenessProbe:
      initialDelaySeconds: 20
      periodSeconds: 5
      timeoutSeconds: 10
      httpGet:
        scheme: HTTP
        port: 8081
        path: /actuator/health

# Readiness probe example for a Spring Boot app
apiVersion: v1
kind: Pod
metadata:
  name: springboot
spec:
  containers:
  - name: springboot
    image: mydlqclub/springboot-helloworld:0.0.1
    readinessProbe:
      initialDelaySeconds: 20
      periodSeconds: 5
      timeoutSeconds: 10
      httpGet:
        scheme: HTTP
        port: 8081
        path: /actuator/health
    livenessProbe:
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      httpGet:
        scheme: HTTP
        port: 8081
        path: /actuator/health

Each probe type also supports tcpSocket actions and configurable fields such as initialDelaySeconds, periodSeconds, timeoutSeconds, successThreshold, and failureThreshold to fine‑tune health checking behavior.
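For services that expose a port but no HTTP health endpoint, a tcpSocket probe simply checks that the port accepts TCP connections. A minimal sketch, assuming a container listening on port 6379 (e.g. Redis):

livenessProbe:
  initialDelaySeconds: 15
  periodSeconds: 10
  tcpSocket:
    port: 6379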