Understanding Kubernetes 1.32 DRA: How Dynamic Resource Allocation Works
This article explains the design of Dynamic Resource Allocation (DRA) in Kubernetes 1.32: the four new API types, the roles of kube‑controller‑manager, kube‑scheduler and the kubelet plugin, the optional admission webhook, and the RPC interfaces used to manage device resources.
Overview
Four new API types are introduced in the resource.k8s.io API group:
DeviceClass – vendor‑provided class analogous to StorageClass.
ResourceSlice – records devices available on a node.
ResourceClaim – specifies quantity and required capabilities of devices.
ResourceClaimTemplate – template for creating ResourceClaims.
The kube‑controller‑manager includes a controller that creates a ResourceClaim from a ResourceClaimTemplate and automatically removes the allocation once the claim is no longer reserved for any consumer, making the underlying devices reusable.
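The cleanup rule in the paragraph above can be sketched as a simple predicate. This is a minimal sketch, not the controller's actual code: the types are cut-down stand-ins for the ResourceClaimStatus fields shown later in this article, and canDeallocate is a hypothetical helper name.

```go
package main

import "fmt"

// Simplified stand-ins for the API types; only the fields needed here.
type AllocationResult struct{}

type ResourceClaimConsumerReference struct {
	Resource string // for example "pods"
	Name     string
}

type ResourceClaimStatus struct {
	Allocation  *AllocationResult
	ReservedFor []ResourceClaimConsumerReference
}

// canDeallocate (hypothetical helper) reports whether the claim holds an
// allocation that no consumer reserves any more.
func canDeallocate(s ResourceClaimStatus) bool {
	return s.Allocation != nil && len(s.ReservedFor) == 0
}

func main() {
	inUse := ResourceClaimStatus{
		Allocation:  &AllocationResult{},
		ReservedFor: []ResourceClaimConsumerReference{{Resource: "pods", Name: "trainer-0"}},
	}
	idle := ResourceClaimStatus{Allocation: &AllocationResult{}}

	fmt.Println("in use, can deallocate:", canDeallocate(inUse))
	fmt.Println("idle, can deallocate:", canDeallocate(idle))
}
```

A claim with no allocation at all is also reported as not deallocatable, since there is nothing to remove.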
The kube‑scheduler plugin must detect a Pod’s referenced ResourceClaim (directly or via a ResourceClaimTemplate) and ensure allocation completes before the Pod is scheduled.
A DRA driver may ship an optional admission webhook that validates opaque configuration parameters when ResourceClaims, ResourceClaimTemplates, or DeviceClasses are created. It must provide a kubelet plugin that publishes device information and prepares devices on the node.
ResourceSlice
Each node’s driver creates one or more ResourceSlice objects owned by the node. When the node is removed, its ResourceSlices are deleted with it. All list‑type fields are atomic to simplify server‑side apply ownership.
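Because a pool can span several slices, a consumer needs a way to tell whether it has seen all of them. The Generation and ResourceSliceCount fields of the pool, defined in the types below, make that possible. The following is a minimal sketch under the assumption that a consumer only trusts the highest generation it has observed; poolComplete is a hypothetical helper and the types are cut-down stand-ins.

```go
package main

import "fmt"

// Simplified stand-ins for the ResourceSlice types; only pool fields are kept.
type ResourcePool struct {
	Name               string
	Generation         int64
	ResourceSliceCount int64
}

type ResourceSliceSpec struct {
	Driver string
	Pool   ResourcePool
}

// poolComplete (hypothetical helper) returns the highest generation seen for
// the given driver and pool, and whether every slice of that generation has
// been observed.
func poolComplete(observed []ResourceSliceSpec, driver, pool string) (int64, bool) {
	var newest, seen, want int64
	found := false
	for _, s := range observed {
		if s.Driver != driver || s.Pool.Name != pool {
			continue
		}
		switch {
		case !found || s.Pool.Generation > newest:
			// A newer generation replaces everything counted so far.
			newest, seen, want = s.Pool.Generation, 1, s.Pool.ResourceSliceCount
			found = true
		case s.Pool.Generation == newest:
			seen++
		}
		// Slices from older generations are ignored.
	}
	return newest, found && seen == want
}

func main() {
	observed := []ResourceSliceSpec{
		{Driver: "gpu.example.com", Pool: ResourcePool{Name: "node-a", Generation: 1, ResourceSliceCount: 1}},
		{Driver: "gpu.example.com", Pool: ResourcePool{Name: "node-a", Generation: 2, ResourceSliceCount: 2}},
	}
	gen, ok := poolComplete(observed, "gpu.example.com", "node-a")
	fmt.Println("generation:", gen, "complete:", ok)

	// The second generation-2 slice arrives.
	observed = append(observed, observed[1])
	gen, ok = poolComplete(observed, "gpu.example.com", "node-a")
	fmt.Println("generation:", gen, "complete:", ok)
}
```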
type ResourceSliceSpec struct {
    Driver       string // driver name
    Pool         ResourcePool
    NodeName     string
    NodeSelector *core.NodeSelector
    AllNodes     bool
    Devices      []Device
}
type ResourcePool struct {
    Name               string // unique pool name, usually node name
    Generation         int64
    ResourceSliceCount int64 // total slices in this generation
}
ResourceClaim
The scheduler must add the finalizer resource.kubernetes.io/delete-protection to a ResourceClaim before allocation can proceed.
type DeviceClaim struct {
    Requests    []DeviceRequest
    Constraints []DeviceConstraint
    Config      []DeviceClaimConfiguration
}
type DeviceRequest struct {
    Name            string // reference name in pod.spec.containers[].resources.claims
    DeviceClassName string
    Selectors       []DeviceSelector
    AllocationMode  DeviceAllocationMode // ExactCount (default) or All
    Count           int64 // used when AllocationMode is ExactCount, default 1
    AdminAccess     bool
}
type DeviceSelector struct {
    CEL *CELDeviceSelector // CEL expression for device selection
}
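To make the AllocationMode and Count semantics concrete, here is a minimal sketch of how the number of devices for one request could be derived. It is not the scheduler's allocator: devicesWanted is a hypothetical helper, the types are cut-down stand-ins, the driver name and CEL expression are illustrative, and treating zero matches as unsatisfiable in All mode is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

type DeviceAllocationMode string

const (
	DeviceAllocationModeExactCount DeviceAllocationMode = "ExactCount"
	DeviceAllocationModeAll        DeviceAllocationMode = "All"
)

// Simplified stand-ins for the API types shown above.
type CELDeviceSelector struct{ Expression string }

type DeviceSelector struct{ CEL *CELDeviceSelector }

type DeviceRequest struct {
	Name            string
	DeviceClassName string
	Selectors       []DeviceSelector
	AllocationMode  DeviceAllocationMode
	Count           int64
}

// devicesWanted (hypothetical helper) returns how many devices the request
// needs, given how many devices matched its class and selectors.
func devicesWanted(r DeviceRequest, matching int64) (int64, error) {
	switch r.AllocationMode {
	case "", DeviceAllocationModeExactCount: // ExactCount is the default mode
		want := r.Count
		if want == 0 {
			want = 1 // default count
		}
		if want < 0 {
			return 0, errors.New("count must be positive")
		}
		if matching < want {
			return 0, fmt.Errorf("need %d devices, only %d match", want, matching)
		}
		return want, nil
	case DeviceAllocationModeAll:
		if matching <= 0 {
			return 0, errors.New("no matching devices") // assumption of this sketch
		}
		return matching, nil
	default:
		return 0, fmt.Errorf("unknown allocation mode %q", r.AllocationMode)
	}
}

func main() {
	req := DeviceRequest{
		Name:            "gpu",
		DeviceClassName: "gpu.example.com", // illustrative class name
		Selectors: []DeviceSelector{
			{CEL: &CELDeviceSelector{Expression: `device.driver == "gpu.example.com"`}},
		},
		Count: 2,
	}
	n, err := devicesWanted(req, 4)
	fmt.Println(n, err)
}
```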
type ResourceClaimStatus struct {
    Allocation  *AllocationResult
    ReservedFor []ResourceClaimConsumerReference
}
DeviceClass
type DeviceClassSpec struct {
    Selectors []DeviceSelector
    Config    []DeviceClassConfiguration
}
ResourceClaimTemplate
type ResourceClaimTemplateSpec struct {
    metav1.ObjectMeta
    Spec ResourceClaimSpec
}
Managing Resources on Nodes
The kubelet must ensure that devices are available on the node before the first Pod that uses a particular device instance runs, and must release the devices after the last such Pod terminates. It does this by invoking the kubelet plugin RPCs NodePrepareResources and NodeUnprepareResources.
When the last Pod using a device finishes, NodeUnprepareResources must succeed before the Pod can be deleted. This guarantees that network‑attached resources become reusable and that de‑allocation of the ResourceClaim is safe.
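The "first Pod prepares, last Pod unprepares" rule amounts to tracking which Pods on the node use each claim. The sketch below illustrates that bookkeeping only; it is not the kubelet's implementation, and claimUsers with its add and remove methods are hypothetical names.

```go
package main

import "fmt"

// claimUsers (hypothetical) maps a claim to the set of Pods on this node
// that currently use it.
type claimUsers map[string]map[string]struct{}

// add records that pod uses claim and reports whether it is the first user,
// which is when NodePrepareResources would be needed.
func (c claimUsers) add(claim, pod string) bool {
	users, ok := c[claim]
	if !ok {
		users = map[string]struct{}{}
		c[claim] = users
	}
	first := len(users) == 0
	users[pod] = struct{}{}
	return first
}

// remove drops pod from the claim's users and reports whether no users
// remain, which is when NodeUnprepareResources would be needed.
func (c claimUsers) remove(claim, pod string) bool {
	users, ok := c[claim]
	if !ok {
		return false
	}
	delete(users, pod)
	if len(users) > 0 {
		return false
	}
	delete(c, claim)
	return true
}

func main() {
	tracker := claimUsers{}
	fmt.Println("prepare needed:", tracker.add("shared-gpu", "pod-a"))
	fmt.Println("prepare needed:", tracker.add("shared-gpu", "pod-b"))
	fmt.Println("unprepare needed:", tracker.remove("shared-gpu", "pod-a"))
	fmt.Println("unprepare needed:", tracker.remove("shared-gpu", "pod-b"))
}
```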
NodePrepareResources RPC
When a Pod that requests a specific resource is scheduled to a node, kubelet calls this RPC. The plugin assumes the call runs on the node that will use the resource and must produce CDI‑formatted JSON files for the allocated devices so that the runtime can update its configuration before container creation.
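As an illustration of the CDI output mentioned above, the following sketch builds a small CDI spec and the matching fully qualified device ID. It assumes the general layout of the CDI specification (cdiVersion, kind, devices with containerEdits); the vendor and class "gpu.example.com/gpu", the device node path, the version string, and the helper names are illustrative assumptions, not values taken from any particular driver.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal subset of a CDI spec; field names follow the CDI JSON layout.
type cdiSpec struct {
	Version string      `json:"cdiVersion"`
	Kind    string      `json:"kind"`
	Devices []cdiDevice `json:"devices"`
}

type cdiDevice struct {
	Name           string         `json:"name"`
	ContainerEdits containerEdits `json:"containerEdits"`
}

type containerEdits struct {
	Env         []string     `json:"env,omitempty"`
	DeviceNodes []deviceNode `json:"deviceNodes,omitempty"`
}

type deviceNode struct {
	Path string `json:"path"`
}

// buildSpec (hypothetical helper) describes one prepared device.
func buildSpec(kind, name, devPath string) cdiSpec {
	return cdiSpec{
		Version: "0.6.0", // assumed CDI version for this sketch
		Kind:    kind,
		Devices: []cdiDevice{{
			Name:           name,
			ContainerEdits: containerEdits{DeviceNodes: []deviceNode{{Path: devPath}}},
		}},
	}
}

// cdiDeviceID returns the fully qualified CDI name "<vendor>/<class>=<name>".
func cdiDeviceID(kind, name string) string {
	return kind + "=" + name
}

// renderSpec serializes the spec as indented JSON, ready to be written to a file.
func renderSpec(s cdiSpec) ([]byte, error) {
	return json.MarshalIndent(s, "", "  ")
}

func main() {
	spec := buildSpec("gpu.example.com/gpu", "claim-1234-gpu0", "/dev/examplegpu0")
	data, err := renderSpec(spec)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
	fmt.Println(cdiDeviceID(spec.Kind, spec.Devices[0].Name))
}
```

The ID returned by cdiDeviceID is the kind of string a plugin would report in the cdi_device_ids field of its response.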
message NodePrepareResourcesRequest {
    repeated Claim claims = 1; // list of ResourceClaims to prepare
}
message NodePrepareResourcesResponse {
    map<string, NodePrepareResourceResponse> claims = 1;
}
message NodePrepareResourceResponse {
    repeated Device devices = 1; // devices prepared for the claim
    string error = 2;
}
message Device {
    repeated string request_names = 1; // request names associated with this device
    string pool_name = 2;
    string device_name = 3;
    repeated string cdi_device_ids = 4;
}
The request_names field enables kubelet to locate the correct CDI ID for a container that uses a particular request rather than all devices in the claim.
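The lookup that request_names enables can be sketched as a filter over the devices returned for a claim. This is a minimal illustration, not kubelet code: the Device struct mirrors the message above with Go-style field names, and cdiIDsForRequest is a hypothetical helper.

```go
package main

import "fmt"

// Device mirrors the Device message above with Go-style field names.
type Device struct {
	RequestNames []string
	PoolName     string
	DeviceName   string
	CDIDeviceIDs []string
}

// cdiIDsForRequest (hypothetical helper) collects the CDI IDs of all devices
// that were prepared for the given request name.
func cdiIDsForRequest(devices []Device, request string) []string {
	var ids []string
	for _, d := range devices {
		for _, name := range d.RequestNames {
			if name == request {
				ids = append(ids, d.CDIDeviceIDs...)
				break // count each device once
			}
		}
	}
	return ids
}

func main() {
	prepared := []Device{
		{RequestNames: []string{"gpu"}, PoolName: "node-a", DeviceName: "gpu-0",
			CDIDeviceIDs: []string{"gpu.example.com/gpu=gpu-0"}},
		{RequestNames: []string{"nic"}, PoolName: "node-a", DeviceName: "nic-0",
			CDIDeviceIDs: []string{"nic.example.com/nic=nic-0"}},
	}
	// A container that references only the "gpu" request gets only the GPU.
	fmt.Println(cdiIDsForRequest(prepared, "gpu"))
}
```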
NodeUnprepareResources RPC
This RPC is the inverse of NodePrepareResources. For each successful NodePrepareResources call, the kubelet must call NodeUnprepareResources at least once so that the plugin can undo the preparation work.
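Since the call can arrive more than once for the same claim, a plugin handler is easiest to reason about when it is idempotent. The sketch below shows one way to structure that, with the per-claim error string modeled on the response messages that follow. The plugin type, its in-memory state, and the cleanup callback are hypothetical; a real driver would use the generated gRPC types.

```go
package main

import (
	"errors"
	"fmt"
)

var errBusy = errors.New("device busy") // illustrative failure

// plugin (hypothetical) remembers which claims it has prepared.
type plugin struct {
	prepared map[string][]string          // claim UID -> CDI device IDs
	cleanup  func(cdiIDs []string) error // undoes the preparation work
}

// unprepare returns one error string per claim; an empty string means success.
func (p *plugin) unprepare(claimUIDs []string) map[string]string {
	result := make(map[string]string, len(claimUIDs))
	for _, uid := range claimUIDs {
		ids, ok := p.prepared[uid]
		if !ok {
			result[uid] = "" // nothing left to undo: a repeated call succeeds
			continue
		}
		if err := p.cleanup(ids); err != nil {
			result[uid] = err.Error() // keep the state so a retry can finish
			continue
		}
		delete(p.prepared, uid)
		result[uid] = ""
	}
	return result
}

func main() {
	p := &plugin{
		prepared: map[string][]string{"claim-1": {"gpu.example.com/gpu=gpu-0"}},
		cleanup:  func([]string) error { return errBusy },
	}
	fmt.Printf("%q\n", p.unprepare([]string{"claim-1"})) // cleanup fails, state kept

	p.cleanup = func([]string) error { return nil }
	fmt.Printf("%q\n", p.unprepare([]string{"claim-1"})) // retry succeeds
	fmt.Printf("%q\n", p.unprepare([]string{"claim-1"})) // repeat is a no-op
}
```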
message NodeUnprepareResourcesRequest {
    repeated Claim claims = 1;
}
message NodeUnprepareResourcesResponse {
    map<string, NodeUnprepareResourceResponse> claims = 1;
}
message NodeUnprepareResourceResponse {
    string error = 1;
}
Reference: Kubernetes Enhancement Proposal 4381 – DRA Structured Parameters, https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters#design-details
Infra Learning Club
Infra Learning Club shares study notes, cutting-edge technology, and career discussions.
