Mastering Kubernetes StatefulSet: Architecture, Access, and Lifecycle Management
This article explains Kubernetes StatefulSet fundamentals, its headless service networking, access patterns, creation workflow, controller mechanics, and detailed procedures for updating, scaling, and deleting stateful pods with illustrative code examples.
1. Introduction to StatefulSet
StatefulSet is a workload object designed to manage stateful applications in Kubernetes. It controls a set of Pods with identical container specifications and gives each Pod a stable, persistent identifier and its own storage. Unlike the Pods of a Deployment, which are interchangeable, each StatefulSet Pod keeps a sticky identity that survives rescheduling, which enables the ordered deployment, scaling, and termination that stateful workloads require.
StatefulSet pods use a Headless Service to define network identities, generating resolvable DNS records for intra‑StatefulSet communication.
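The stable identity described above follows a fixed naming scheme, which can be sketched in Go. This is a minimal illustration, not controller code: the helper names are hypothetical, the namespace "default" and cluster domain "cluster.local" are assumed defaults that may differ in a given cluster, and "web", "nginx", and "www" are example values for the StatefulSet, its Headless Service, and a volume claim template.

```go
package main

import "fmt"

// podName returns the stable Pod name: <statefulset>-<ordinal>.
func podName(statefulSet string, ordinal int) string {
	return fmt.Sprintf("%s-%d", statefulSet, ordinal)
}

// podDNSName returns the per-Pod DNS name published through the Headless Service:
// <pod>.<service>.<namespace>.svc.<cluster domain>.
// namespace and clusterDomain are cluster-specific (assumed values in main below).
func podDNSName(statefulSet, service, namespace, clusterDomain string, ordinal int) string {
	return fmt.Sprintf("%s.%s.%s.svc.%s", podName(statefulSet, ordinal), service, namespace, clusterDomain)
}

// claimName returns the PersistentVolumeClaim name derived from a
// volume claim template: <template>-<pod name>.
func claimName(template, statefulSet string, ordinal int) string {
	return fmt.Sprintf("%s-%s", template, podName(statefulSet, ordinal))
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(
			podName("web", i),
			podDNSName("web", "nginx", "default", "cluster.local", i),
			claimName("www", "web", i),
		)
	}
}
```

Because the names depend only on the StatefulSet name and the ordinal, a rescheduled Pod comes back with the same hostname and reattaches to the same claim.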
2. Access Methods for StatefulSet Workloads
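Access is similar to other workloads such as Deployments, but StatefulSets usually rely on a Headless Service. A Headless Service has no cluster IP (clusterIP: None), so kube-proxy does not program iptables/IPVS rules for it. Instead, DNS returns the Pod addresses and clients connect to the backend instances directly. The following manifest defines a Headless Service and a StatefulSet that uses it: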
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # must match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # default 1
  minReadySeconds: 10 # default 0
  template:
    metadata:
      labels:
        app: nginx # must match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: registry.k8s.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storage-class"
      resources:
        requests:
          storage: 1Gi
3. Creation Process of a StatefulSet
User issues a kubectl command to create a StatefulSet resource.
The API server authenticates and authorizes the request, then writes the object to etcd.
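The StatefulSet controller does not read etcd directly. It holds a long-lived watch connection to the API server, which is the only component that talks to etcd, and the API server streams change events over that connection.
Through this watch, the API server delivers the new StatefulSet object to the controller.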
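The controller reconciles the desired replica count, creating Pods sequentially (0, 1, … N-1) according to the template. Under the default ordered management policy, each Pod must be Running and Ready before the next one is created.
Each created Pod, along with its later status updates, is persisted to etcd by the API server.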
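The ordered creation in the steps above can be sketched as a toy reconcile loop. This is a simplified simulation under stated assumptions, not the real controller: the Pod type and reconcileOnce function are hypothetical stand-ins, and readiness is set by hand where a real cluster would report it through Pod status.

```go
package main

import "fmt"

// Pod is a toy stand-in for a Kubernetes Pod; it is not the real API type.
type Pod struct {
	Name  string
	Ready bool
}

// reconcileOnce performs one simplified reconcile pass for ordered creation.
// It creates at most one Pod, the lowest missing ordinal, and only when
// every existing Pod is already Ready.
func reconcileOnce(setName string, desired int, pods []Pod) []Pod {
	for _, p := range pods {
		if !p.Ready {
			return pods // a predecessor is not Ready yet: wait
		}
	}
	if len(pods) < desired {
		pods = append(pods, Pod{Name: fmt.Sprintf("%s-%d", setName, len(pods))})
	}
	return pods
}

func main() {
	var pods []Pod
	for len(pods) < 3 {
		pods = reconcileOnce("web", 3, pods)
		last := &pods[len(pods)-1]
		fmt.Println("created", last.Name)
		last.Ready = true // pretend the Pod became Ready
	}
}
```

Each pass either waits or creates exactly one Pod, which is what produces the 0, 1, … N-1 ordering.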
4. StatefulSet Controller Working Principle
The controller is built on two general-purpose components and three StatefulSet-specific mechanisms:
Informer: watches the Kubernetes API for resource changes and keeps a local cache up to date.
Event Handler: a callback that reacts to informer events and triggers the necessary reconciliation.
Revision Management: records each version of the StatefulSet template so that updates can be tracked and rolled back.
Ordered Pod Management: ensures Pods start, update, and terminate in a defined order.
Replica Arrays: replicas holds the Pods whose ordinals fall within the desired replica count; condemned holds the Pods whose ordinals fall outside it and are therefore slated for removal.
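The split between the replicas and condemned arrays can be illustrated with a short sketch. It is a simplification: the function name partition is hypothetical, and Pods are reduced to bare ordinals, whereas the real controller stores Pod objects.

```go
package main

import (
	"fmt"
	"sort"
)

// partition splits observed Pod ordinals by the desired replica count.
// replicas[i] is true when a Pod with ordinal i (0 <= i < desired) exists;
// condemned lists ordinals >= desired, highest first, since those Pods
// are removed in reverse ordinal order.
func partition(ordinals []int, desired int) (replicas []bool, condemned []int) {
	replicas = make([]bool, desired)
	for _, o := range ordinals {
		switch {
		case o >= 0 && o < desired:
			replicas[o] = true
		case o >= desired:
			condemned = append(condemned, o)
		}
	}
	sort.Sort(sort.Reverse(sort.IntSlice(condemned)))
	return replicas, condemned
}

func main() {
	// Pods 0, 2, 3, 4 exist and the desired count is 3:
	// ordinal 1 is missing, ordinals 3 and 4 are condemned.
	replicas, condemned := partition([]int{0, 2, 3, 4}, 3)
	fmt.Println(replicas, condemned)
}
```

A false entry in replicas marks an ordinal that has to be created, and every entry in condemned marks a Pod that has to be deleted.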
4.1 Updating Pods
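Determine the update strategy (OnDelete or RollingUpdate).
If OnDelete, the controller does not replace Pods on its own; a Pod is recreated from the new template only after it has been deleted manually.
If RollingUpdate, Pods are updated one at a time in reverse ordinal order (from the highest ordinal to the lowest), waiting for each updated Pod to be Running and Ready before proceeding to the next.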
After each update, the controller verifies that every Pod's identity, including its labels, still matches the StatefulSet and corrects any Pod that does not.
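Kubernetes applies a StatefulSet rolling update from the highest ordinal to the lowest, one Pod at a time. The selection step can be sketched as follows. This is a simplified model rather than the upstream implementation: the Pod type, the revision strings "v1" and "v2", and the nextToUpdate function are all hypothetical.

```go
package main

import "fmt"

// Pod is a toy model: the template revision it runs and whether it is Ready.
type Pod struct {
	Revision string
	Ready    bool
}

// nextToUpdate walks the Pods from the highest ordinal down and returns the
// index of the Pod to replace next. It returns -1 when nothing should be
// replaced right now: either every Pod already runs updateRevision, or an
// already-updated Pod with a higher ordinal is not Ready yet.
func nextToUpdate(pods []Pod, updateRevision string) int {
	for i := len(pods) - 1; i >= 0; i-- {
		if pods[i].Revision != updateRevision {
			return i
		}
		if !pods[i].Ready {
			return -1 // wait for the updated Pod to become Ready
		}
	}
	return -1
}

func main() {
	pods := []Pod{
		{Revision: "v1", Ready: true},
		{Revision: "v1", Ready: true},
		{Revision: "v1", Ready: true},
	}
	for {
		i := nextToUpdate(pods, "v2")
		if i < 0 {
			break
		}
		fmt.Printf("updating web-%d\n", i)
		pods[i] = Pod{Revision: "v2", Ready: true} // pretend the new Pod became Ready
	}
}
```

Under OnDelete the same comparison of revisions applies, but nothing calls the selection step automatically; the operator decides which Pod to delete and when.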
4.2 Scaling Pods
Scaling up:
The controller updates the StatefulSet status and reserves the missing ordinals in the replicas array, so that new Pods are created for them in ascending order.
Existing Pods must be Running and Ready before the next Pod is created; any failed Pod is deleted and recreated.
Validate label consistency after creation.
Scaling down:
Update the StatefulSet status.
Process the condemned array, deleting Pods in reverse order of their ordinal IDs. Under ordered management, a Pod that is still terminating blocks the deletion of the next one.
Validate label consistency after deletion.
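The ordering rules for both directions can be summarized in a small sketch: scale-up creates ascending ordinals, scale-down deletes descending ordinals. The function scalePlan is hypothetical, and it deliberately ignores the readiness waits between steps.

```go
package main

import "fmt"

// scalePlan lists the create or delete actions, in order, needed to move a
// StatefulSet from current to desired replicas. Creation runs in ascending
// ordinal order, deletion in descending ordinal order.
func scalePlan(setName string, current, desired int) []string {
	var plan []string
	for i := current; i < desired; i++ {
		plan = append(plan, fmt.Sprintf("create %s-%d", setName, i))
	}
	for i := current - 1; i >= desired; i-- {
		plan = append(plan, fmt.Sprintf("delete %s-%d", setName, i))
	}
	return plan
}

func main() {
	fmt.Println(scalePlan("web", 3, 5)) // scale up from 3 to 5
	fmt.Println(scalePlan("web", 5, 2)) // scale down from 5 to 2
}
```

Only one of the two loops runs for any given call, because current is either below or above desired.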
4.3 Deleting Pods
The core deletion logic is implemented in the processCondemned function:
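func (ssc *defaultStatefulSetControl) processCondemned(ctx context.Context, set *apps.StatefulSet, firstUnhealthyPod *v1.Pod, monotonic bool, condemned []*v1.Pod, i int) (bool, error) {
	logger := klog.FromContext(ctx)
	// The Pod is already being deleted.
	if isTerminating(condemned[i]) {
		// In monotonic (ordered) mode, block until the terminating Pod is gone.
		if monotonic {
			logger.V(4).Info("StatefulSet is waiting for Pod to Terminate prior to scale down",
				"statefulSet", klog.KObj(set), "pod", klog.KObj(condemned[i]))
			return true, nil
		}
		return false, nil
	}
	// In monotonic mode, block if the Pod is not Running and Ready,
	// unless it is the first unhealthy Pod.
	if !isRunningAndReady(condemned[i]) && monotonic && condemned[i] != firstUnhealthyPod {
		logger.V(4).Info("StatefulSet is waiting for Pod to be Running and Ready prior to scale down",
			"statefulSet", klog.KObj(set), "pod", klog.KObj(firstUnhealthyPod))
		return true, nil
	}
	// In monotonic mode, block if the Pod has not been Ready for minReadySeconds,
	// unless it is the first unhealthy Pod.
	if !isRunningAndAvailable(condemned[i], set.Spec.MinReadySeconds) && monotonic && condemned[i] != firstUnhealthyPod {
		logger.V(4).Info("StatefulSet is waiting for Pod to be Available prior to scale down",
			"statefulSet", klog.KObj(set), "pod", klog.KObj(firstUnhealthyPod))
		return true, nil
	}
	// All checks passed: delete the Pod.
	logger.V(2).Info("Pod of StatefulSet is terminating for scale down",
		"statefulSet", klog.KObj(set), "pod", klog.KObj(condemned[i]))
	return true, ssc.podControl.DeleteStatefulPod(set, condemned[i])
}
This function checks, in order, whether the Pod is already terminating, whether it is Running and Ready, and whether it is Available (Ready for at least minReadySeconds), and only then deletes it. The monotonic flag corresponds to the ordered management strategy: when it is set, the function waits instead of moving past a Pod that is not in the expected state, while under the parallel strategy those waits are skipped. A true return value tells the caller to stop processing further Pods in this pass.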
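The branching in processCondemned can be restated as a pure decision function, which makes the cases easier to test in isolation. This is a simplified sketch and not part of Kubernetes: the Decision and PodState types and the decide function are hypothetical, and the Pod's conditions are reduced to booleans.

```go
package main

import "fmt"

// Decision is the outcome for one condemned Pod.
type Decision int

const (
	Wait   Decision = iota // block the scale-down pass
	Skip                   // Pod is already terminating; continue (parallel mode)
	Delete                 // delete the Pod now
)

func (d Decision) String() string {
	return [...]string{"wait", "skip", "delete"}[d]
}

// PodState reduces a condemned Pod to the conditions the function inspects.
type PodState struct {
	Terminating    bool
	Ready          bool // Running and Ready
	Available      bool // Ready for at least minReadySeconds
	FirstUnhealthy bool // this Pod is the first unhealthy Pod
}

// decide mirrors the order of checks: terminating first, then readiness
// and availability, which only block in monotonic (ordered) mode.
func decide(p PodState, monotonic bool) Decision {
	if p.Terminating {
		if monotonic {
			return Wait
		}
		return Skip
	}
	if monotonic && !p.FirstUnhealthy && (!p.Ready || !p.Available) {
		return Wait
	}
	return Delete
}

func main() {
	healthy := PodState{Ready: true, Available: true}
	unready := PodState{}
	fmt.Println(decide(healthy, true))  // ordered mode, healthy Pod
	fmt.Println(decide(unready, true))  // ordered mode, unready Pod
	fmt.Println(decide(unready, false)) // parallel mode, unready Pod
}
```

The exception for the first unhealthy Pod means that an unhealthy Pod which is itself condemned can still be deleted in ordered mode instead of blocking the scale-down indefinitely.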
MaGe Linux Operations
