
How Kubernetes 1.28 Improves Batch Jobs with Pod Replacement Policy and Per‑Index Backoff Limits

Kubernetes 1.28 adds two alpha features—Pod Replacement Policy and per‑index backoff limits—that let batch jobs replace terminating pods more intelligently and cap retries for each indexed pod, reducing resource waste and improving reliability for machine‑learning workloads.

Cloud Native Technology Community

Aug 23, 2023

Pod Replacement Policy

By default, when a pod starts terminating (for example, because of preemption or eviction), the Job controller immediately creates a replacement pod, so the old and new pods can run at the same time until the old one exits. This causes problems for frameworks such as TensorFlow or JAX that allow only one pod per index to run at a time and otherwise report duplicate-task errors.

How to enable and use

The feature is gated as an alpha feature. Enable it by turning on the JobPodReplacementPolicy feature gate in the cluster configuration.

After enabling, create a Job and set the podReplacementPolicy field (e.g., Failed) in the job spec:

apiVersion: batch/v1
kind: Job
metadata:
  name: new
spec:
  podReplacementPolicy: Failed
  ...

When the policy is set to Failed, a replacement pod is created only after the original pod reaches the Failed phase, not while it is merely terminating. The job’s .status.terminating field reports the number of pods that are currently terminating.

kubectl get jobs/new -o=jsonpath='{.status.terminating}'
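The difference between the two policies can be sketched as a small decision function. This is a simplified model for illustration, not the Job controller's actual code: the Pod class and needs_replacement function are hypothetical names, and TerminatingOrFailed is the upstream name of the default behavior, which the article describes but does not name.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    phase: str                 # "Pending", "Running", "Succeeded", or "Failed"
    terminating: bool = False  # True once the pod has a deletionTimestamp


def needs_replacement(pod: Pod, policy: str) -> bool:
    """Simplified model: should a replacement pod be created right now?"""
    if policy not in ("TerminatingOrFailed", "Failed"):
        raise ValueError(f"unknown policy: {policy}")
    if pod.phase == "Succeeded":
        return False  # finished work is never replaced
    if pod.phase == "Failed":
        return True   # both policies replace a pod that has fully failed
    if policy == "TerminatingOrFailed":
        return pod.terminating  # default: replace as soon as termination starts
    return False  # "Failed": keep waiting until the pod reaches the Failed phase
```

Under this model, a running pod that has started terminating is replaced immediately with the default policy, but only after it reaches the Failed phase when the policy is Failed.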

The .status.terminating count is especially useful for external queueing controllers such as Kueue, which need to keep a job's quota reserved until the resources held by its terminating pods have actually been reclaimed.
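As a rough illustration of that accounting, the sketch below counts the pods of a job that still hold resources by adding the active and terminating counters from the job status. It assumes the JSON produced by kubectl get job with -o json, assumes .status.active excludes terminating pods, and treats a missing field as zero. The function name pods_holding_resources is hypothetical, and this is not how Kueue itself is implemented.

```python
import json


def pods_holding_resources(job_json: str) -> int:
    """Count pods that still occupy cluster resources for one Job."""
    status = json.loads(job_json).get("status", {})
    # A field is absent when its count is zero or the feature gate is off.
    active = status.get("active", 0)
    terminating = status.get("terminating", 0)
    return active + terminating


sample = '{"status": {"active": 2, "terminating": 1}}'
print(pods_holding_resources(sample))
```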

Per‑Index Backoff Limit

Normally, pod failures for indexed jobs count toward the global .spec.backoffLimit. If a particular index keeps failing, the whole job may be marked as failed before other indexes finish. The per‑index backoff limit lets you cap retries for each index independently.

How to enable and use

Enable the alpha feature gate JobBackoffLimitPerIndex. Then add the backoffLimitPerIndex field to the job spec:

apiVersion: batch/v1
kind: Job
metadata:
  name: job-backoff-limit-per-index-execute-all
spec:
  completions: 8
  parallelism: 2
  completionMode: Indexed
  backoffLimitPerIndex: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: example
        image: python
        command:
        - python3
        - -c
        - |
          import os, sys, time
          id = int(os.environ.get("JOB_COMPLETION_INDEX"))
          if id == 1 or id == 2:
            sys.exit(1)
          time.sleep(1)

This job runs eight indexed completions with a parallelism of two. Indexes 1 and 2 always exit with an error; because backoffLimitPerIndex is set to 1, each of them is retried once and then marked as failed, while the remaining indexes still run to completion.
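The retry accounting for this job can be modeled in a few lines. The sketch below is a simplified simulation, not the controller's algorithm: it ignores parallelism and timing, and the names simulate_indexed_job and pod_succeeds are hypothetical. An index is given up once its failure count exceeds backoff_limit_per_index.

```python
from typing import Callable, List, Tuple


def simulate_indexed_job(
    completions: int,
    backoff_limit_per_index: int,
    pod_succeeds: Callable[[int, int], bool],
) -> Tuple[List[int], List[int], int]:
    """Return (completed indexes, failed indexes, number of failed pods)."""
    completed: List[int] = []
    failed: List[int] = []
    failed_pods = 0
    for index in range(completions):
        failures = 0
        while True:
            if pod_succeeds(index, failures):
                completed.append(index)
                break
            failures += 1
            failed_pods += 1
            # Stop retrying once this index has used up its own budget.
            if failures > backoff_limit_per_index:
                failed.append(index)
                break
    return completed, failed, failed_pods


# Mirror the example job: indexes 1 and 2 fail on every attempt.
result = simulate_indexed_job(8, 1, lambda index, attempt: index not in (1, 2))
print(result)  # ([0, 3, 4, 5, 6, 7], [1, 2], 4)
```

With a limit of 1, each failing index accounts for two failed pods (the first attempt plus one retry), so this model yields six completed indexes and four failed pods.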

After the job finishes, you can list the pods to see which indexes succeeded or failed:

kubectl get pods -l job-name=job-backoff-limit-per-index-execute-all

Typical output shows a mix of Completed and Error pods. To view the job’s overall status, run:

kubectl get jobs job-backoff-limit-per-index-execute-all -o yaml

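The status section of the YAML includes fields such as completedIndexes, failedIndexes, succeeded, and failed. With the per-index limit, each failing index stops after its allowed retries and is recorded in failedIndexes, while the remaining indexes still run to completion. The job as a whole still ends with a Failed condition because some indexes failed, but only after every other index has had its chance to finish.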
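The completedIndexes and failedIndexes fields are strings in which indexes are comma-separated and runs of consecutive indexes are compressed into a first-last range, for example 0,3-7. A minimal sketch for expanding such a string into a list of integers follows. The helper name parse_indexes is hypothetical, and the sample values are what one would expect for the job above rather than captured output.

```python
from typing import List


def parse_indexes(text: str) -> List[int]:
    """Expand a string such as "0,3-7" into [0, 3, 4, 5, 6, 7]."""
    indexes: List[int] = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate an empty field
        if "-" in part:
            first, last = part.split("-")
            indexes.extend(range(int(first), int(last) + 1))
        else:
            indexes.append(int(part))
    return indexes


print(parse_indexes("0,3-7"))  # [0, 3, 4, 5, 6, 7]
print(parse_indexes("1,2"))    # [1, 2]
```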

