
Enable NVIDIA GPU Access in Docker and Kubernetes with the NVIDIA Container Toolkit

This guide walks through checking the system and software environment, installing and configuring the NVIDIA Container Toolkit, verifying GPU access in Docker containers, deploying the NVIDIA device plugin on a Kubernetes cluster, creating GPU‑enabled pods, and troubleshooting common issues, with concrete commands and configuration examples.

Liangxu Linux

First, verify the host OS and Kubernetes versions. Display the Ubuntu release, then the kubectl client and server versions, and note any version skew warning in the output:

# lsb_release -a
# kubectl version

Install the NVIDIA Container Toolkit

On a node with GPU resources, add the NVIDIA repository and import its GPG key:

# curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

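Optionally, enable the experimental packages by uncommenting the experimental repository line (skip this step if you only need stable releases):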

# sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

Update the package index and install the toolkit:

# sudo apt-get update
# sudo apt-get install -y nvidia-container-toolkit
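
Before moving on, it can help to confirm that the package actually landed. The following is a minimal sketch, assuming a Debian or Ubuntu host where dpkg is available; the function name toolkit_installed is hypothetical and only used for illustration.

```shell
# Succeeds and prints the CLI version if the toolkit package is installed.
# toolkit_installed is a hypothetical helper name, not part of the toolkit.
toolkit_installed() {
  # dpkg -s exits non-zero when the package is not installed.
  dpkg -s nvidia-container-toolkit >/dev/null 2>&1 || return 1
  # nvidia-ctk is the CLI shipped with the toolkit.
  nvidia-ctk --version
}
```

Calling toolkit_installed on the GPU node should print a version line; a non-zero exit status suggests the install step did not complete.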

Configure Docker to use the NVIDIA runtime

Run the configuration command, which updates /etc/docker/daemon.json to add an nvidia runtime entry:

# sudo nvidia-ctk runtime configure --runtime=docker

The resulting daemon.json contains a section such as:

"runtimes": {"nvidia": {"path": "nvidia-container-runtime", "args": []}}

Restart Docker to apply the changes:

# systemctl daemon-reload
# systemctl restart docker
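
After the restart, one way to confirm that Docker picked up the new configuration is to query docker info. This is a sketch rather than an official procedure; the helper names has_nvidia_runtime and default_runtime are hypothetical.

```shell
# Succeeds if Docker reports a registered runtime named "nvidia".
has_nvidia_runtime() {
  docker info --format '{{json .Runtimes}}' | grep -q '"nvidia"'
}

# Prints the runtime Docker uses when --runtime is not passed.
default_runtime() {
  docker info --format '{{.DefaultRuntime}}'
}
```

If has_nvidia_runtime fails, the daemon.json change was probably not applied or Docker was not restarted. The value printed by default_runtime matters later for Kubernetes, where pods do not pass --runtime explicitly.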

Validate Docker GPU access

Run a test container that executes nvidia-smi:

# docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

The output lists the GPU model, driver version, and memory usage, confirming that the container can see the NVIDIA hardware.

Deploy the NVIDIA device plugin in Kubernetes

Create the plugin DaemonSet using the official manifest:

# kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.16.1/deployments/static/nvidia-device-plugin.yml

The YAML defines a DaemonSet in the kube-system namespace with appropriate tolerations and a system-node-critical priority class. After deployment, check the pod logs to ensure the plugin started without errors. If the node lacks GPU resources or the Docker runtime is mis‑configured, the logs will contain messages such as “Incompatible strategy detected auto” and hints to verify the NVIDIA Container Toolkit installation.
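
The log check and a follow-up check of the node's advertised GPU count can be sketched as below. The label name=nvidia-device-plugin-ds is assumed to match the static manifest referenced above; verify it with kubectl -n kube-system get ds if the manifest version differs. The helper names are hypothetical.

```shell
# Show recent log lines from the device plugin pods.
# Assumes the pods carry the label name=nvidia-device-plugin-ds.
plugin_logs() {
  kubectl -n kube-system logs -l name=nvidia-device-plugin-ds --tail=50
}

# Print the number of GPUs a node advertises as allocatable.
# Prints nothing if the node exposes no nvidia.com/gpu resource.
gpu_allocatable() {
  kubectl get node "$1" -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
}
```

For example, gpu_allocatable aiserver003087 should print a non-zero count once the plugin has registered the node's GPUs with the kubelet.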

Create a GPU‑enabled pod

Define a pod manifest that requests one GPU:

apiVersion: v1
kind: Pod
metadata:
  name: ffmpeg-pod
spec:
  nodeName: aiserver003087   # optional, specify a GPU node
  containers:
  - name: ffmpeg-container
    image: nightseas/ffmpeg:latest
    command: ["/bin/bash", "-c", "tail -f /dev/null"]
    resources:
      limits:
        nvidia.com/gpu: 1

Save the manifest as gpu_test.yaml, apply it with kubectl apply -f gpu_test.yaml, copy a test video into the pod, and run an ffmpeg command inside the pod that uses GPU‑accelerated decoding, scaling, and encoding:

# ffmpeg -hwaccel cuvid -c:v h264_cuvid -i test.mp4 -vf scale_npp=1280:720 -vcodec h264_nvenc out.mp4

Successful conversion and the presence of out.mp4 confirm that the pod can use the GPU.
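
The copy and run steps can be scripted from outside the pod. The sketch below assumes the pod name ffmpeg-pod from the manifest, a local file named test.mp4, and /tmp as a writable directory in the container; the /tmp paths and the function name run_gpu_transcode are assumptions. Note that kubectl cp needs the tar binary inside the container image.

```shell
# Copy a local video into the pod, transcode it on the GPU, and list the result.
# run_gpu_transcode is a hypothetical helper; /tmp paths are an assumption.
run_gpu_transcode() {
  kubectl cp test.mp4 ffmpeg-pod:/tmp/test.mp4 &&
  kubectl exec ffmpeg-pod -- ffmpeg -hwaccel cuvid -c:v h264_cuvid \
    -i /tmp/test.mp4 -vf scale_npp=1280:720 -vcodec h264_nvenc /tmp/out.mp4 &&
  kubectl exec ffmpeg-pod -- ls -l /tmp/out.mp4
}
```

The steps are chained with && so that a failed copy or a failed transcode stops the sequence instead of listing a stale file.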

Label nodes and adjust DaemonSet for selective deployment

Label GPU nodes so that the DaemonSet only runs on them:

# kubectl label nodes aiserver003087 gpu=true

Then update the DaemonSet (or pod) manifest to include a nodeSelector matching gpu: "true". The value must be quoted: an unquoted true is parsed as a YAML boolean, while nodeSelector values must be strings, so the manifest is rejected.
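
As an alternative to editing the manifest by hand, the selector can be added with a merge patch. This is a sketch under the assumption that the DaemonSet is named nvidia-device-plugin-daemonset in kube-system, as in the static manifest; the function name label_and_pin is hypothetical.

```shell
# Label a node and restrict the device plugin DaemonSet to labeled nodes.
# Assumes the DaemonSet name nvidia-device-plugin-daemonset (check with
# `kubectl -n kube-system get ds`).
label_and_pin() {
  # --overwrite makes the command safe to re-run on an already labeled node.
  kubectl label nodes "$1" gpu=true --overwrite &&
  # In JSON the value "true" is a string, which is what nodeSelector requires.
  kubectl -n kube-system patch daemonset nvidia-device-plugin-daemonset \
    --type merge \
    -p '{"spec":{"template":{"spec":{"nodeSelector":{"gpu":"true"}}}}}'
}
```

Running label_and_pin aiserver003087 changes the pod template, so the DaemonSet rolls out again and its pods are then scheduled only on nodes carrying the gpu=true label.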

Common pitfalls

If a node has no GPU, the plugin will report “No devices found”.

If the Docker runtime is not set to nvidia, containers will fail to access the GPU and the plugin logs will suggest checking the NVIDIA Container Toolkit configuration.

Ensure that /etc/docker/daemon.json contains both "default-runtime": "nvidia" and the "runtimes" section with the nvidia entry.
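
The default-runtime setting can be checked with a simple pattern match. This is a rough sketch, not a JSON parser; the function name default_runtime_is_nvidia is hypothetical, and the path defaults to /etc/docker/daemon.json. Recent nvidia-ctk releases document a --set-as-default option for runtime configure that writes this key; confirm it with nvidia-ctk runtime configure --help before relying on it.

```shell
# Succeeds if the given daemon.json sets "default-runtime" to "nvidia".
# Uses a text match, so it assumes the key and value sit on one line.
default_runtime_is_nvidia() {
  grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' \
    "${1:-/etc/docker/daemon.json}"
}
```

If the check fails, add the key (by hand or with the option mentioned above) and restart Docker.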

Following these steps provides a reproducible workflow for enabling GPU acceleration in Docker containers and Kubernetes workloads.
