
Mastering Kubernetes Storage: PV/PVC, Controllers, FlexVolume & CSI Explained

This article is a guide to the Kubernetes storage architecture. It covers persistent volumes and claims, the roles of the PV Controller, the AD Controller, and the Volume Manager, the FlexVolume plugin system, and the CSI framework, with deployment and usage examples.

Alibaba Cloud Native

Introduction

Kubernetes storage is the foundation for stateful services, offering data persistence through volumes. The platform ships in‑tree volume plugins and also supports an out‑of‑tree plugin mechanism that lets external storage solutions integrate without changing Kubernetes itself.

Mounting a Volume – Example Walkthrough

A StatefulSet YAML defines a PVC named disk-pvc with a storageClassName. The mounting process consists of six steps:

User creates a Pod that references the PVC.

PV Controller watches for unbound PVCs and attempts to bind them to suitable PVs, provisioning a new PV if none match.

Scheduler assigns the Pod to a Node based on selectors, affinities, and volume‑related predicates.

If the PV is not yet attached, the AD Controller invokes the Volume Plugin to attach the remote volume to the node, where it appears as a device (e.g., /dev/vdb).

Volume Manager formats the device if needed and mounts it to a global path on the node.

Finally, the global path is bind‑mounted into the Pod's volume directory and from there into the container's filesystem.

Kubernetes Storage Architecture

PV Controller : Manages PV/PVC lifecycle, binding, provisioning, and deletion.

AD Controller : Handles attach/detach operations, maintaining DesiredStateOfWorld and ActualStateOfWorld.

Volume Manager : Executes mount/unmount, formatting, and global path handling on each node.

Volume Plugins : Provide concrete implementations for provision, attach, mount, etc., and are divided into In‑Tree (bundled with Kubernetes) and Out‑of‑Tree (e.g., FlexVolume, CSI).

PV Controller Implementation

The controller runs two workers:

ClaimWorker : Drives PVC state transitions using the pv.kubernetes.io/bind-completed annotation.

VolumeWorker : Manages PV state based on the presence of a ClaimRef and the PV’s ReclaimPolicy.

Binding follows a sequence of checks: VolumeMode, LabelSelector, StorageClassName, AccessMode, and Size.
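The binding checks above can be sketched as a small matching function. This is a simplified illustration, not the controller's actual code: the PV and PVC classes, their field names, and find_best_match are hypothetical, the selector handles only exact label matches, and picking the smallest adequate PV is an assumption made here to keep the result deterministic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PV:
    name: str
    capacity_gi: int
    access_modes: set
    storage_class: str = ""
    volume_mode: str = "Filesystem"
    labels: dict = field(default_factory=dict)
    claim_ref: Optional[str] = None  # name of the PVC already bound, if any


@dataclass
class PVC:
    name: str
    request_gi: int
    access_modes: set
    storage_class: str = ""
    volume_mode: str = "Filesystem"
    selector: dict = field(default_factory=dict)  # exact label matches only


def matches(pv: PV, pvc: PVC) -> bool:
    """Apply the checks in the order listed in the text."""
    if pv.claim_ref is not None:                     # already bound elsewhere
        return False
    if pv.volume_mode != pvc.volume_mode:            # VolumeMode
        return False
    if any(pv.labels.get(k) != v for k, v in pvc.selector.items()):  # LabelSelector
        return False
    if pv.storage_class != pvc.storage_class:        # StorageClassName
        return False
    if not pvc.access_modes <= pv.access_modes:      # AccessMode (subset test)
        return False
    return pv.capacity_gi >= pvc.request_gi          # Size


def find_best_match(pvs, pvc: PVC) -> Optional[PV]:
    """Return the smallest PV that satisfies the claim, or None."""
    candidates = [pv for pv in pvs if matches(pv, pvc)]
    return min(candidates, key=lambda pv: pv.capacity_gi, default=None)
```

When find_best_match returns None, the real controller would fall back to dynamic provisioning if the claim names a StorageClass that has a provisioner; that path is left out of this sketch.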

AD Controller Details

It maintains two core objects:

DesiredStateOfWorld : The desired attachment state, that is, which volumes should be attached to which nodes.

ActualStateOfWorld : The attachment state currently observed in the cluster.

Two main loops run: desiredStateOfWorldPopulator syncs new PVCs/Pods into DesiredStateOfWorld. Reconcile compares Desired and Actual states, invoking attach/detach via Volume Plugins.
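The reconcile idea can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the controller's implementation: both states are modeled as plain sets of (volume, node) pairs, and the attach and detach callbacks are hypothetical stand-ins for the Volume Plugin calls.

```python
def reconcile(desired, actual, attach, detach):
    """Drive `actual` toward `desired`.

    desired, actual: sets of (volume, node) pairs (a modeling assumption).
    attach, detach:  callables taking (volume, node); hypothetical stand-ins
                     for the Volume Plugin operations.
    """
    # Detach whatever is attached but no longer wanted.
    for volume, node in sorted(actual - desired):
        detach(volume, node)
        actual.discard((volume, node))

    # Attach whatever is wanted but not yet attached.
    for volume, node in sorted(desired - actual):
        attach(volume, node)
        actual.add((volume, node))
```

Running this function periodically gives the level-triggered behavior described above: a pass that finds the two states equal does nothing, and a failed pass is simply retried on the next run.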

Volume Manager Mechanics

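The Volume Manager operates much like the AD Controller but runs inside the Kubelet on each node. The --enable-controller-attach-detach flag determines who performs attach/detach: when it is true, the AD Controller does; when it is false, the Kubelet's Volume Manager does it itself.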

Volume Plugins Management

Out‑of‑tree plugin binaries are discovered via a filesystem watcher on the plugin directory (e.g., /usr/libexec/kubernetes/kubelet-plugins/volume/exec/). The InitPlugins routine loads the In‑Tree plugins and registers a Prober that watches for new plugin binaries, updating the plugin list dynamically.

FlexVolume Overview

FlexVolume is an out‑of‑tree plugin model that proxies calls to external executables. It implements interfaces such as init, GetVolumeName, Attach, WaitForAttach, MountDevice, Setup, TearDown, Detach, ExpandVolumeDevice, and NodeExpand. Unimplemented interfaces return a JSON error like:

{
  "status": "Not supported",
  "message": "error message"
}

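FlexVolume plugins reside under /usr/libexec/kubernetes/kubelet-plugins/volume/exec/. The Kubelet invokes the plugin executable with the operation and its parameters as command‑line arguments and reads the JSON result from the executable's standard output.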

FlexVolume Mount Flow

The process includes:

Attach – calls the storage provider's remote API to attach the volume to the node, where it appears as a device.

MountDevice – formats and mounts the device to a global path.

Setup – binds the global path into the Pod’s filesystem.

File‑based volumes skip the Attach and MountDevice steps and only perform Setup/TearDown.

FlexVolume Code Example

A typical script parses the command‑line argument to dispatch to init, doMount, or doUnmount functions.
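The dispatch pattern can be sketched as follows. Real FlexVolume drivers are often shell scripts; Python is used here only for illustration. The function names do_init, do_mount, and do_unmount mirror the init, doMount, and doUnmount functions mentioned above, the mount body is a placeholder that performs no real mount, and the message strings are made up. This sketch assumes a file‑based driver, so it reports attach as unsupported.

```python
import json


def do_init():
    # "attach": False tells the caller this driver has no attach/detach step.
    return {"status": "Success", "capabilities": {"attach": False}}


def do_mount(mount_dir, json_options):
    options = json.loads(json_options)  # raises ValueError on malformed JSON
    # Placeholder: a real driver would mount the backend described by
    # `options` onto `mount_dir` here.
    return {"status": "Success", "message": f"mounted at {mount_dir}"}


def do_unmount(mount_dir):
    # Placeholder: a real driver would unmount `mount_dir` here.
    return {"status": "Success"}


HANDLERS = {"init": do_init, "mount": do_mount, "unmount": do_unmount}


def dispatch(argv):
    """Map the first argument to a handler and return its JSON-able result."""
    if not argv or argv[0] not in HANDLERS:
        return {"status": "Not supported", "message": "operation not implemented"}
    try:
        return HANDLERS[argv[0]](*argv[1:])
    except (TypeError, ValueError) as exc:  # wrong argument count or bad JSON
        return {"status": "Failure", "message": str(exc)}


def main(argv):
    """Print the result as JSON on stdout and return a process exit code."""
    result = dispatch(argv)
    print(json.dumps(result))
    return 0 if result["status"] == "Success" else 1
```

To use it as an executable, the script would end with sys.exit(main(sys.argv[1:])) after importing sys; that call is left out of the sketch so the functions can be imported and exercised directly.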

FlexVolume Usage

A FlexVolume PV template specifies driver, fsType, and options. Labels and nodeAffinity can be used for scheduling constraints.
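As an illustration, the manifest for such a PV can be built as a Python dictionary and serialized to JSON, which kubectl accepts in place of YAML. The field names under spec.flexVolume (driver, fsType, options) are the ones named above; the driver name example.com/disk, the volumeId option, the label, and the helper flexvolume_pv are hypothetical.

```python
import json


def flexvolume_pv(name, driver, fs_type, options, size="20Gi", labels=None):
    """Build a PersistentVolume manifest backed by a FlexVolume driver."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name, "labels": dict(labels or {})},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "flexVolume": {
                "driver": driver,      # vendor/driver name of the plugin
                "fsType": fs_type,     # filesystem to create on the device
                "options": options,    # driver-specific key/value strings
            },
        },
    }


# All concrete values below are made up for the example.
pv = flexvolume_pv(
    name="pv-disk",
    driver="example.com/disk",
    fs_type="ext4",
    options={"volumeId": "d-example"},
    labels={"type": "disk"},
)
manifest_json = json.dumps(pv, indent=2)
```

A PVC can then select this PV through the type label, which is one way to apply the label-based constraints mentioned above.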

CSI Introduction

CSI (Container Storage Interface) provides a vendor‑agnostic, container‑native storage plugin model. A driver consists of a Controller Server (handling CreateVolume, DeleteVolume, ControllerPublishVolume for attach, and ControllerUnpublishVolume for detach) and a Node Server (handling NodeStageVolume, NodePublishVolume, etc.). Kubernetes components talk to the driver via gRPC over Unix domain sockets.

CSI System Structure

Controller Server : Implements CSI controller RPCs and works with external components such as Provisioner, Attacher, Resizer, Snapshotter.

Node Server : Runs as a DaemonSet on each node and serves the mount/unmount requests issued by the Kubelet's Volume Manager.

Node‑Driver‑Registrar : Registers the CSI driver with the Kubelet by placing a registration socket in a directory the Kubelet watches; the Kubelet then updates the node's annotations/labels.

CSI Objects

VolumeAttachment : Tracks the attachment state of a volume to a node.

CSIDriver : Describes driver capabilities (e.g., attachRequired, podInfoOnMount).

CSINode : Lists drivers installed on a node.

Node‑Driver‑Registrar Workflow

Plugin places a socket file in /var/lib/kubelet/plugins_registry.

Kubelet discovers the socket, calls GetPluginInfo on the CSI plugin.

Kubelet invokes NodeGetInfo to obtain driver details.

Kubelet updates node annotations/labels and creates a CSINode object.

External‑Attacher

Watches VolumeAttachment objects. When an object's attached status is false, it calls the CSI ControllerPublishVolume RPC to attach the volume; when the object is being deleted, it calls ControllerUnpublishVolume to detach it.
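The decision the External‑Attacher makes for each object can be sketched as a pure function. This is a simplification under stated assumptions: the VolumeAttachment is modeled as a plain dictionary, only metadata.deletionTimestamp and status.attached are inspected, and retries, finalizers, and error handling are ignored. The helper name next_rpc is hypothetical.

```python
def next_rpc(volume_attachment):
    """Return the CSI controller RPC to call for this object, or None."""
    metadata = volume_attachment.get("metadata", {})
    status = volume_attachment.get("status", {})

    # An object marked for deletion means the volume should be detached.
    if metadata.get("deletionTimestamp"):
        return "ControllerUnpublishVolume"

    # An object that is not yet attached means the volume should be attached.
    if not status.get("attached", False):
        return "ControllerPublishVolume"

    # Already attached and not being deleted: nothing to do.
    return None
```

For example, a freshly created object with no status yields ControllerPublishVolume, and the same object after a successful attach yields None.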

CSI Deployment

The Controller Server runs as a Deployment (often with two replicas for HA). The Node Server runs as a DaemonSet on every node. The Node‑Driver‑Registrar runs as a sidecar container next to the Node Server on each node to register the driver.

CSI Usage Example

A CSI PV definition includes driver, volumeHandle, volumeAttributes, and optional nodeAffinity. After deployment, a Pod can see the volume device (e.g., /dev/vdb) mounted at /data through the global and pod paths.
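A CSI PV manifest with these fields might look like the following, built as a Python dictionary for illustration. The fields under spec.csi and spec.nodeAffinity follow the PersistentVolume API; the driver name disk.csi.example.com, the volume handle, the attribute, the zone value, and the helper csi_pv are hypothetical.

```python
def csi_pv(name, driver, volume_handle, attributes, zone=None, size="20Gi"):
    """Build a PersistentVolume manifest backed by a CSI driver."""
    spec = {
        "capacity": {"storage": size},
        "accessModes": ["ReadWriteOnce"],
        "csi": {
            "driver": driver,                # name the CSI driver registered
            "volumeHandle": volume_handle,   # ID of the volume in the backend
            "volumeAttributes": attributes,  # driver-specific key/value strings
        },
    }
    if zone is not None:
        # Optional: restrict the PV to nodes in one zone.
        spec["nodeAffinity"] = {
            "required": {
                "nodeSelectorTerms": [{
                    "matchExpressions": [{
                        "key": "topology.kubernetes.io/zone",
                        "operator": "In",
                        "values": [zone],
                    }]
                }]
            }
        }
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": spec,
    }


# All concrete values below are made up for the example.
pv = csi_pv(
    name="csi-pv",
    driver="disk.csi.example.com",
    volume_handle="d-example",
    attributes={"fsType": "ext4"},
    zone="zone-a",
)
```

With nodeAffinity set, the scheduler only places Pods that use this PV on nodes whose zone label matches, which is how the topology constraint is expressed for pre-provisioned volumes.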

Additional CSI Features

Support for Secrets at different stages (provision, mount, etc.).

Topology‑aware scheduling via nodeAffinity.

Block‑mode volumes via volumeMode: Block.

CSIDriver settings such as attachRequired (set to false to skip the attach step) and podInfoOnMount to fine‑tune driver behavior.

Recent CSI Enhancements

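ExpandCSIVolumes – feature gate for expanding CSI volumes, including the filesystem on them.

VolumeSnapshotDataSource – feature gate for snapshot support, including creating a volume from a snapshot.

CSIInlineVolume – feature gate that allows defining CSI volumes directly in a Pod spec.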

Conclusion

The article covered three major areas: Kubernetes storage architecture (PV, PVC, controllers), FlexVolume plugin mechanics and usage, and CSI framework components, deployment, and advanced features. Understanding these concepts helps developers design, implement, and troubleshoot stateful workloads on Kubernetes.
