Demystifying Kubernetes Persistent Storage: Concepts, Components, and End‑to‑End Workflow
This article explains the fundamentals of Kubernetes persistent storage, defines key terms such as PV, PVC, StorageClass and CSI, describes static and dynamic volume creation with NFS examples, and provides a detailed step‑by‑step walkthrough of the provisioning, attaching, mounting, and deletion processes inside the Kubernetes control plane.
1. Terminology
in‑tree : storage logic lives inside the Kubernetes core repository.
out‑of‑tree : storage logic is external to the core repository, enabling decoupled plugins.
PV (PersistentVolume): a cluster‑level resource created by an administrator or an external provisioner; its lifecycle is independent of any pod.
PVC (PersistentVolumeClaim): a namespace‑level resource created by a user or a StatefulSet; it requests storage size and access mode.
StorageClass : a cluster‑level resource that defines a template for dynamically provisioning volumes, including quality‑of‑service parameters and backup policies.
CSI (Container Storage Interface): a standardized plugin interface that allows storage vendors to integrate with Kubernetes and other orchestrators.
2. Component Overview
PV Controller : watches PVCs, binds them to PVs, and handles provisioning and deletion.
AD Controller : performs Attach/Detach operations, connecting volumes to nodes.
Kubelet : node‑level agent responsible for pod lifecycle, health checks, and volume management.
Volume Manager : part of Kubelet; mounts/unmounts volumes and may also handle attach/detach.
Volume Plugins : in‑tree or out‑of‑tree plugins supplied by storage vendors to manage specific volume types.
External Provisioner : sidecar container that calls Volume Plugins via gRPC to create/delete volumes when the PV controller cannot do so directly.
External Attacher : sidecar container that calls Volume Plugins via gRPC to attach/detach volumes.
3. Using Persistent Volumes
Kubernetes separates volume creation from consumption. Two approaches exist:
Static creation: the cluster admin manually creates a PV.
Dynamic creation: a user creates a PVC; a provisioner automatically creates a matching PV.
Example using an NFS share (static):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.4.1
    path: /nfs_storage

Corresponding PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

After applying the manifests, kubectl get pvc shows the PVC bound to the PV. A pod can then consume the claim:
apiVersion: v1
kind: Pod
metadata:
  name: test-nfs
spec:
  containers:
    - image: nginx:alpine
      name: nginx
      volumeMounts:
        - mountPath: /data
          name: nfs-volume
  volumes:
    - name: nfs-volume
      persistentVolumeClaim:
        claimName: nfs-pvc

Dynamic creation (requires an NFS‑client provisioner and a StorageClass):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nfs-sc
provisioner: example.com/nfs
mountOptions:
  - vers=4.1

The user creates a PVC that references the StorageClass:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Mi
  # Selects the nfs-sc StorageClass defined above.
  storageClassName: nfs-sc

The external provisioner then creates a PV and binds it to the PVC automatically.
4. Process Overview
Process diagram adapted from a cloud‑native storage course.
User creates a Pod that includes a PVC requesting dynamic storage.
Scheduler places the Pod on a suitable node.
PV Controller watches the pending PVC and invokes a Volume Plugin (in‑tree or out‑of‑tree) to provision a PV.
AD Controller detects the pending attach state and calls a Volume Plugin to attach the volume to the target node.
On the node, Volume Manager (part of Kubelet) waits for the attach to finish, then mounts the volume so that it is available under the pod's volume directory (e.g., /var/lib/kubelet/pods/.../volumes/kubernetes.io~iscsi/...).
Kubelet starts the container(s) through the container runtime, which bind‑mounts that directory into the container’s filesystem.
5. Detailed Workflow
Provisioning
The end‑to‑end storage workflow is split into three stages: Provision/Delete, Attach/Detach, and Mount/Unmount. This subsection covers the first stage, which is driven by the PV controller.
PV Controller runs two workers:
ClaimWorker : processes PVC add/update/delete events and drives PVC state transitions.
VolumeWorker : updates PV status.
PV status transitions:
Available → Bound (when bound to a PVC).
Bound → Released (when the PVC is deleted).
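Released → Available (if the reclaim policy is the deprecated Recycle, or after an administrator manually clears .spec.claimRef on the PV; with Retain the PV stays Released).
Released → Failed (when automatic reclamation of the volume fails).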
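The PV phases can be read as a small state machine. The following is a minimal Python sketch, not the PV controller's actual code: the event names are hypothetical labels invented for this illustration, and it assumes the upstream rule that a Released PV becomes Available again only after recycling or after its claimRef is cleared by hand.

```python
from enum import Enum


class PVPhase(Enum):
    AVAILABLE = "Available"
    BOUND = "Bound"
    RELEASED = "Released"
    FAILED = "Failed"


# Allowed transitions, keyed by (current phase, event).
# The event names are hypothetical; Kubernetes does not expose them.
TRANSITIONS = {
    (PVPhase.AVAILABLE, "claim_bound"): PVPhase.BOUND,
    (PVPhase.BOUND, "claim_deleted"): PVPhase.RELEASED,
    (PVPhase.RELEASED, "recycled_or_claim_ref_cleared"): PVPhase.AVAILABLE,
    (PVPhase.RELEASED, "reclaim_failed"): PVPhase.FAILED,
}


def next_phase(current: PVPhase, event: str) -> PVPhase:
    """Return the phase that follows `event`, or raise if the move is not allowed."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(
            f"no transition from {current.value} on {event!r}"
        ) from None
```

Keeping the transitions in a table makes the illegal moves explicit: an Available PV, for example, cannot jump straight to Released.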
PVC status transitions:
Pending → Bound (when a matching PV is found).
Bound → Lost (if the bound PV is deleted).
Lost → Bound (if a new PV with the same name appears).
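One way to picture the PVC transitions is to derive the phase from what the controller can observe. The sketch below is a simplified model rather than the controller's logic: the function name and its arguments are hypothetical, and it assumes a claim counts as "previously bound" once it has recorded a PV name.

```python
from typing import Dict, Optional


def claim_phase(
    claim_name: str,
    bound_volume: Optional[str],
    volumes: Dict[str, Optional[str]],
) -> str:
    """Derive a PVC phase from observed cluster state.

    bound_volume: name of the PV this claim was bound to, or None if it
                  has never been bound.
    volumes:      mapping of existing PV name -> claim name in its claimRef
                  (None when the PV is unclaimed).
    """
    if bound_volume is None:
        return "Pending"
    if volumes.get(bound_volume) == claim_name:
        return "Bound"
    # The PV is gone or points at a different claim.
    return "Lost"
```

With this model, a Lost claim returns to Bound as soon as a PV with the recorded name exists again and references the claim.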
Static provisioning (FindBestMatch) :
DelayBinding : determines whether binding should be delayed based on PVC annotations and StorageClass volumeBindingMode.
FindBestMatchPVForClaim : filters existing PVs by volume mode, availability, labels, and StorageClass, then selects the smallest PV that satisfies the request.
Bind : updates .spec.claimRef on the PV, sets both PV and PVC status to Bound, and adds controller annotations.
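The "smallest PV that satisfies the request" rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the controller's FindBestMatchPVForClaim: the PV and PVC classes are hypothetical stand-ins, only binary suffixes (Ki, Mi, Gi, Ti) are parsed, and volume mode and label selectors are left out for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set

# Binary-suffix multipliers; enough for the quantities used in this article.
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(q: str) -> int:
    """Convert a quantity such as '10Gi' to bytes (binary suffixes only)."""
    for suffix, mult in _UNITS.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * mult
    return int(q)  # plain byte count


@dataclass
class PV:  # hypothetical, simplified PersistentVolume
    name: str
    capacity: str
    access_modes: Set[str]
    storage_class: str = ""
    claim_ref: Optional[str] = None  # name of the bound PVC, if any


@dataclass
class PVC:  # hypothetical, simplified PersistentVolumeClaim
    name: str
    request: str
    access_modes: Set[str]
    storage_class: str = ""


def find_best_match(claim: PVC, volumes: Iterable[PV]) -> Optional[PV]:
    """Pick the smallest unbound PV that satisfies the claim, or None."""
    want = parse_quantity(claim.request)
    candidates = [
        pv
        for pv in volumes
        if pv.claim_ref is None                       # still available
        and pv.storage_class == claim.storage_class   # same class
        and claim.access_modes <= pv.access_modes     # modes are a subset
        and parse_quantity(pv.capacity) >= want       # large enough
    ]
    return min(candidates, key=lambda pv: parse_quantity(pv.capacity), default=None)
```

Choosing the smallest fitting volume keeps larger PVs free for claims that actually need them.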
Dynamic provisioning (ProvisionVolume) :
Before Provisioning : PV controller checks whether the provisioner named in the StorageClass is in‑tree (prefix kubernetes.io/) or out‑of‑tree and annotates the PVC with the provisioner name.
In‑tree provisioning : the PV controller obtains a provisioner from the in‑tree plugin's NewProvisioner method, calls its Provision method to create the backing volume and get a PV object, then creates the PV and binds it to the PVC.
Out‑of‑tree provisioning : the external provisioner validates the PVC, then calls the CSI CreateVolume RPC, creates a PV to represent the volume, and binds it.
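The in‑tree versus out‑of‑tree decision comes down to a prefix check on the provisioner name. The sketch below only illustrates that branching: the function names and the step strings are hypothetical and simply restate the steps described above.

```python
from typing import List

IN_TREE_PREFIX = "kubernetes.io/"


def is_in_tree(provisioner: str) -> bool:
    """In-tree provisioners are recognised by their name prefix."""
    return provisioner.startswith(IN_TREE_PREFIX)


def plan_provisioning(provisioner: str) -> List[str]:
    """Return the ordered steps for a PVC whose StorageClass names `provisioner`."""
    steps = [f"annotate PVC with provisioner {provisioner}"]
    if is_in_tree(provisioner):
        steps += [
            "PV controller: NewProvisioner",
            "create PV",
            "bind PV to PVC",
        ]
    else:
        steps += [
            "external provisioner: validate PVC",
            "CSI CreateVolume",
            "create PV",
            "bind PV to PVC",
        ]
    return steps
```

For the example.com/nfs provisioner used earlier, the plan takes the out‑of‑tree branch and goes through CreateVolume.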
Attaching Volumes
Both the AD controller and Kubelet can perform attach/detach. The Kubelet flag --enable-controller-attach-detach decides which one does: when it is true (the default), the AD controller handles the operation; when it is false, Kubelet does.
Key data structures:
DesiredStateOfWorld (DSW) : the expected attach state (node → volume → pod).
ActualStateOfWorld (ASW) : the observed attach state.
Attachment flow (AD controller example):
Initialize DSW and ASW from cluster resources.
Reconciler periodically drives ASW toward DSW, performing attach operations when a volume is present in DSW but missing in ASW.
In‑tree attaching : AD controller calls the in‑tree Attacher’s Attach method.
Out‑of‑tree attaching : AD controller creates a VolumeAttachment object; the external attacher watches it and invokes the CSI ControllerPublishVolume RPC.
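The reconcile loop is easiest to see as a set difference between the two states. The following is a minimal sketch, assuming each state is reduced to a mapping from node name to a set of volume names; the real DSW also tracks which pods use each volume, and real attach calls are asynchronous and can fail. The same comparison also yields the detach operations for volumes that are attached but no longer desired.

```python
from typing import Dict, List, Set, Tuple

State = Dict[str, Set[str]]  # node name -> names of volumes on that node


def reconcile(desired: State, actual: State) -> List[Tuple[str, str, str]]:
    """Drive `actual` toward `desired` in place and return the operations performed."""
    ops: List[Tuple[str, str, str]] = []

    # Detach volumes that are attached but no longer desired.
    for node in list(actual):
        for vol in sorted(actual[node] - desired.get(node, set())):
            actual[node].discard(vol)
            ops.append(("detach", node, vol))
        if not actual[node]:
            del actual[node]

    # Attach volumes that are desired but not yet attached.
    for node, vols in desired.items():
        for vol in sorted(vols - actual.get(node, set())):
            actual.setdefault(node, set()).add(vol)
            ops.append(("attach", node, vol))

    return ops
```

Calling reconcile a second time with the same inputs returns an empty list, which is the property a periodic reconciler relies on: it is safe to run repeatedly.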
Detaching Volumes
When a pod is deleted, the AD controller checks whether the pod's node carries the volumes.kubernetes.io/keep-terminated-pod-volumes annotation. If it is absent, the controller removes the volume from DSW. The Reconciler then drives ASW toward DSW, invoking detach operations:
In‑tree detaching : AD controller calls the AttachableVolumePlugin’s Detach method.
Out‑of‑tree detaching : AD controller deletes the VolumeAttachment; the external attacher watches the deletion and calls the CSI ControllerUnpublishVolume RPC.
Mounting / Unmounting
Mounting uses a global mount path on the node so that a block device can be bind‑mounted into multiple pods. The process:
Wait for the volume to be attached (by the AD controller together with the external attacher, or by Kubelet).
Mount the volume to the node‑level global directory using either the in‑tree DeviceMounter or the CSI NodeStageVolume RPC.
Bind‑mount the global directory into the pod's volume directory (/var/lib/kubelet/pods/.../volumes/kubernetes.io~iscsi/...) using SetUp (or CSI NodePublishVolume); the container runtime then maps that directory into each container.
Update ASW to reflect the mounted state.
Unmounting follows the reverse order, ensuring that when a pod disappears the volume is unmounted from the pod, then from the global path, and finally detached if no longer needed.
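The ordering constraint (global mount once per node, one bind mount per pod, teardown in reverse) can be modelled with a per-volume set of consumers. This is a sketch under assumptions: NodeMounts is a hypothetical helper, and "stage"/"publish" stand for the global mount (DeviceMounter or NodeStageVolume) and the per-pod bind mount (SetUp or NodePublishVolume). No real mounts are performed; the class only records the order of operations.

```python
from collections import defaultdict


class NodeMounts:
    """Tracks which pods use each volume on one node (hypothetical helper)."""

    def __init__(self):
        self.pods_by_volume = defaultdict(set)
        self.log = []  # ordered record of the operations performed

    def mount(self, volume, pod):
        if not self.pods_by_volume[volume]:
            # First consumer on this node: mount at the global path once.
            self.log.append(("stage", volume))
        if pod not in self.pods_by_volume[volume]:
            # Every pod gets its own bind mount of the global path.
            self.pods_by_volume[volume].add(pod)
            self.log.append(("publish", volume, pod))

    def unmount(self, volume, pod):
        pods = self.pods_by_volume.get(volume, set())
        if pod not in pods:
            return  # nothing to undo for this pod
        pods.remove(pod)
        self.log.append(("unpublish", volume, pod))
        if not pods:
            # Last consumer gone: the global mount can be removed.
            self.log.append(("unstage", volume))
            del self.pods_by_volume[volume]
```

With two pods sharing one volume, the log shows a single stage, two publishes, two unpublishes, and a single unstage after the last pod leaves, which is the point at which the volume becomes eligible for detach.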
6. Summary
The article first introduces the basic concepts of Kubernetes persistent storage and then provides an in‑depth analysis of the internal storage workflow. Regardless of the storage backend, every volume passes through the provisioning, attaching, and mounting stages, and any failure in these stages can be diagnosed by examining the corresponding controller (PV controller, AD controller, Kubelet) and their DesiredStateOfWorld/ActualStateOfWorld data structures.
Alibaba Cloud Native