
Understanding Kubernetes 1.32 DRA: How Device Resource Allocation Works

This article explains the design of Dynamic Resource Allocation (DRA) in Kubernetes 1.32, detailing the four new CRDs, the roles of the kube-controller-manager, kube-scheduler, and kubelet plugins, and the admission webhook and kubelet-plugin RPC interfaces used to manage device resources.

Infra Learning Club

Overview

Four new CRDs are introduced in the resource.k8s.io API group:

DeviceClass – vendor‑provided class analogous to StorageClass.

ResourceSlice – records devices available on a node.

ResourceClaim – specifies quantity and required capabilities of devices.

ResourceClaimTemplate – template for creating ResourceClaims.

The kube-controller-manager includes a controller that creates a ResourceClaim from a ResourceClaimTemplate and automatically removes the allocation once the claim is no longer reserved for any consumer, making the underlying devices reusable.
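That cleanup rule can be sketched with simplified stand-in types (the names below are illustrative, not the real resource.k8s.io Go types):

```go
package main

import "fmt"

// Simplified stand-in for a ResourceClaim; Allocated models
// Status.Allocation != nil, and ReservedFor lists the consumers
// (typically Pods) still using the claim.
type ResourceClaim struct {
	Name        string
	Allocated   bool
	ReservedFor []string
}

// deallocateIfUnused mirrors the controller's rule: once no consumer
// remains in ReservedFor, the allocation is removed so the underlying
// devices become reusable. It reports whether it deallocated.
func deallocateIfUnused(c *ResourceClaim) bool {
	if c.Allocated && len(c.ReservedFor) == 0 {
		c.Allocated = false
		return true
	}
	return false
}

func main() {
	claim := &ResourceClaim{Name: "pod-a-gpu", Allocated: true, ReservedFor: []string{"pod-a"}}
	fmt.Println(deallocateIfUnused(claim)) // false: still reserved
	claim.ReservedFor = nil
	fmt.Println(deallocateIfUnused(claim)) // true: last consumer gone
}
```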

The kube-scheduler plugin must detect a Pod's referenced ResourceClaims (created directly or via a ResourceClaimTemplate) and ensure that allocation completes before the Pod is bound to a node.

DRA drivers may provide an optional admission webhook that validates opaque configuration parameters when ResourceClaims, ResourceClaimTemplates, or DeviceClasses are created, and must provide a kubelet plugin that publishes device information and prepares devices on the node.

ResourceSlice

Each node’s driver creates one or more ResourceSlice objects owned by the node object. When the node is deleted, its ResourceSlices are garbage-collected through that owner reference. All list-type fields are atomic to simplify server-side apply ownership.

type ResourceSliceSpec struct {
    Driver string               // name of the DRA driver publishing this slice
    Pool   ResourcePool         // pool that this slice belongs to
    // Exactly one of NodeName, NodeSelector, or AllNodes indicates
    // where the devices in this slice are accessible.
    NodeName     string
    NodeSelector *core.NodeSelector
    AllNodes     bool
    Devices      []Device       // devices published by this slice
}

type ResourcePool struct {
    Name string                // unique pool name, usually the node name
    Generation int64           // incremented whenever the pool's slices change
    ResourceSliceCount int64   // total number of slices in this generation
}
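Generation and ResourceSliceCount let a consumer such as the scheduler decide when it has a complete, consistent view of a pool: only slices from the newest generation count, and all of them must be present. A minimal sketch with local stand-in types:

```go
package main

import "fmt"

// Local stand-ins for the pool metadata carried by each ResourceSlice.
type ResourcePool struct {
	Name               string
	Generation         int64
	ResourceSliceCount int64
}

type ResourceSlice struct {
	Pool    ResourcePool
	Devices []string
}

// poolComplete reports whether the given slices of one pool form a
// complete view: it keeps only the newest generation and checks that
// exactly ResourceSliceCount slices of that generation are present.
func poolComplete(slices []ResourceSlice) bool {
	if len(slices) == 0 {
		return false
	}
	newest := slices[0].Pool.Generation
	for _, s := range slices {
		if s.Pool.Generation > newest {
			newest = s.Pool.Generation
		}
	}
	var seen, expected int64
	for _, s := range slices {
		if s.Pool.Generation == newest {
			seen++
			expected = s.Pool.ResourceSliceCount
		}
	}
	return seen == expected
}

func main() {
	slices := []ResourceSlice{
		{Pool: ResourcePool{Name: "node-1", Generation: 2, ResourceSliceCount: 2}},
		{Pool: ResourcePool{Name: "node-1", Generation: 2, ResourceSliceCount: 2}},
		{Pool: ResourcePool{Name: "node-1", Generation: 1, ResourceSliceCount: 3}}, // stale
	}
	fmt.Println(poolComplete(slices))     // true: both generation-2 slices present
	fmt.Println(poolComplete(slices[:1])) // false: one of two slices missing
}
```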

ResourceClaim

The scheduler must add the finalizer resource.kubernetes.io/delete-protection to a ResourceClaim before allocation can proceed.

type DeviceClaim struct {
    Requests []DeviceRequest
    Constraints []DeviceConstraint
    Config []DeviceClaimConfiguration
}

type DeviceRequest struct {
    Name string                     // reference name in pod.spec.containers[].resources.claims
    DeviceClassName string
    Selectors []DeviceSelector
    AllocationMode DeviceAllocationMode // ExactCount (default) or All
    Count int64                     // used when AllocationMode is ExactCount, default 1
    AdminAccess bool
}

type DeviceSelector struct {
    CEL *CELDeviceSelector          // CEL expression for device selection
}

type ResourceClaimStatus struct {
    Allocation *AllocationResult
    ReservedFor []ResourceClaimConsumerReference
}

DeviceClass

type DeviceClassSpec struct {
    Selectors []DeviceSelector
    Config []DeviceClassConfiguration
}

ResourceClaimTemplate

type ResourceClaimTemplateSpec struct {
    metav1.ObjectMeta
    Spec ResourceClaimSpec
}

Managing Resources on Nodes

The kubelet must ensure that devices are available on the node before the first Pod that uses a particular device instance runs, and must release the devices after the last such Pod terminates. It does this by invoking the kubelet plugin RPCs NodePrepareResources and NodeUnprepareResources.

When the last Pod using a device finishes, NodeUnprepareResources must succeed before the Pod can be deleted, guaranteeing that network‑connected resources become reusable and that de‑allocation of the ResourceClaim is safe.
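kubelet's bookkeeping around the two RPCs can be modeled as per-claim reference counting: prepare on the first Pod that uses a claim, unprepare after the last one terminates. A simplified sketch (the names here are illustrative, not kubelet's actual internals):

```go
package main

import "fmt"

// nodeState tracks, per ResourceClaim, which Pods on the node still
// use it. Prepare is triggered by the first user, Unprepare by the
// last -- a simplified model of kubelet's bookkeeping.
type nodeState struct {
	claimUsers map[string]map[string]bool // claim UID -> set of pod UIDs
}

func newNodeState() *nodeState {
	return &nodeState{claimUsers: map[string]map[string]bool{}}
}

// podStarted reports whether NodePrepareResources must be called,
// i.e. this Pod is the claim's first user on the node.
func (s *nodeState) podStarted(claim, pod string) bool {
	users := s.claimUsers[claim]
	if users == nil {
		users = map[string]bool{}
		s.claimUsers[claim] = users
	}
	first := len(users) == 0
	users[pod] = true
	return first
}

// podFinished reports whether NodeUnprepareResources must be called,
// i.e. the last user of the claim just terminated.
func (s *nodeState) podFinished(claim, pod string) bool {
	users := s.claimUsers[claim]
	delete(users, pod)
	return len(users) == 0
}

func main() {
	s := newNodeState()
	fmt.Println(s.podStarted("claim-1", "pod-a"))  // true: call NodePrepareResources
	fmt.Println(s.podStarted("claim-1", "pod-b"))  // false: already prepared
	fmt.Println(s.podFinished("claim-1", "pod-a")) // false: claim still in use
	fmt.Println(s.podFinished("claim-1", "pod-b")) // true: call NodeUnprepareResources
}
```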

NodePrepareResources RPC

When a Pod that requests a specific resource is scheduled to a node, kubelet calls this RPC. The plugin assumes the call runs on the node that will use the resource and must produce CDI‑formatted JSON files for the allocated devices so that the runtime can update its configuration before container creation.

message NodePrepareResourcesRequest {
    repeated Claim claims = 1; // list of ResourceClaims to prepare
}
message NodePrepareResourcesResponse {
    map<string, NodePrepareResourceResponse> claims = 1;
}
message NodePrepareResourceResponse {
    repeated Device devices = 1; // devices prepared for the claim
    string error = 2;
}
message Device {
    repeated string request_names = 1; // request names associated with this device
    string pool_name = 2;
    string device_name = 3;
    repeated string cdi_device_ids = 4;
}

The request_names field enables kubelet to locate the correct CDI ID for a container that uses a particular request rather than all devices in the claim.
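That lookup can be sketched as a filter over the prepared devices (the type mirrors the Device message above; the IDs are illustrative):

```go
package main

import "fmt"

// preparedDevice mirrors the Device message in the response above.
type preparedDevice struct {
	RequestNames []string
	CDIDeviceIDs []string
}

// cdiIDsForRequest collects the CDI IDs of exactly those devices that
// satisfy the named request, so a container referencing only that
// request is not granted every device in the claim.
func cdiIDsForRequest(devices []preparedDevice, request string) []string {
	var ids []string
	for _, d := range devices {
		for _, name := range d.RequestNames {
			if name == request {
				ids = append(ids, d.CDIDeviceIDs...)
				break
			}
		}
	}
	return ids
}

func main() {
	devices := []preparedDevice{
		{RequestNames: []string{"gpus"}, CDIDeviceIDs: []string{"gpu.example.com/gpu=gpu-0"}},
		{RequestNames: []string{"nics"}, CDIDeviceIDs: []string{"net.example.com/nic=nic-0"}},
	}
	fmt.Println(cdiIDsForRequest(devices, "gpus")) // only the GPU's CDI ID
}
```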

NodeUnprepareResources RPC

This RPC is the inverse of NodePrepareResources. For each successful NodePrepareResources call, kubelet must invoke NodeUnprepareResources at least once to undo the preparation work.

message NodeUnprepareResourcesRequest {
    repeated Claim claims = 1;
}
message NodeUnprepareResourcesResponse {
    map<string, NodeUnprepareResourceResponse> claims = 1;
}
message NodeUnprepareResourceResponse {
    string error = 1;
}

Reference: Kubernetes Enhancement Proposal 4381 – DRA Structured Parameters, https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters#design-details


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Kubernetes · Scheduler · kubelet · CRD · DRA · Device Resource Allocation
Written by

Infra Learning Club

Infra Learning Club shares study notes, cutting-edge technology, and career discussions.
