Configuring NVIDIA Docker Plugin and GPU Access in Kubernetes
This guide walks through installing the NVIDIA container toolkit, configuring Docker to use the NVIDIA runtime, verifying GPU access, deploying the NVIDIA device plugin in Kubernetes, labeling GPU nodes, and running a GPU-accelerated FFmpeg pod to confirm successful GPU integration.
1. Environment check
Verify the operating system and Kubernetes client version:
# lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
# kubectl version
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.25.16
WARNING: version difference between client (1.30) and server (1.25) exceeds the supported minor version skew of +/-1

2. Install NVIDIA Docker plugin
Add the NVIDIA repository, enable experimental packages, and install the toolkit:
# curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
# sudo apt-get update
# sudo apt-get install -y nvidia-container-toolkit

3. Configure Docker to use the NVIDIA runtime
# sudo nvidia-ctk runtime configure --runtime=docker
INFO[0000] Loading config from /etc/docker/daemon.json
INFO[0000] Wrote updated config to /etc/docker/daemon.json
INFO[0000] It is recommended that docker daemon be restarted.

The command adds a runtimes section to /etc/docker/daemon.json:
{
  "insecure-registries": ["192.168.3.61"],
  "registry-mirrors": ["https://7sl94zzz.mirror.aliyuncs.com", "https://hub.atomgit.com", "https://docker.awsl9527.cn"],
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "args": []
    }
  }
}

Restart Docker:
# systemctl daemon-reload
# systemctl restart docker

4. Verify Docker GPU access
# docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

The container should display detailed GPU information, confirming that Docker can access the NVIDIA GPU.
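For automation, the same precondition can be asserted from a script. A minimal sketch, assuming the default daemon.json location (the `check_nvidia_runtime` helper is ours, not a Docker feature):

```shell
# Sketch: report whether a Docker daemon.json registers the "nvidia" runtime.
# The helper name is illustrative; the argument defaults to Docker's standard
# config path.
check_nvidia_runtime() {
  local conf="${1:-/etc/docker/daemon.json}"
  if grep -q '"nvidia"' "$conf" 2>/dev/null; then
    echo "nvidia runtime configured"
  else
    echo "nvidia runtime missing"
  fi
}
```

On the host configured above, `check_nvidia_runtime` should print `nvidia runtime configured`.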
5. Deploy NVIDIA device plugin in Kubernetes
# kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.16.1/deployments/static/nvidia-device-plugin.yml

Key parts of the nvidia-device-plugin.yml (DaemonSet) include the tolerations for nvidia.com/gpu and the container image nvcr.io/nvidia/k8s-device-plugin:v0.16.1.
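One quick health check is to compare the DaemonSet's desired and ready pod counts. A sketch, assuming `python3` is available on the admin host (the `ds_healthy` helper and the exact DaemonSet name are ours; verify the name against your cluster):

```shell
# Sketch: report whether a DaemonSet is fully rolled out, reading the JSON
# from `kubectl get daemonset <name> -o json` on stdin. Requires python3;
# the helper name is illustrative.
ds_healthy() {
  python3 -c 'import json, sys
status = json.load(sys.stdin)["status"]
ready = status.get("numberReady", 0)
desired = status.get("desiredNumberScheduled", -1)
print("healthy" if ready == desired else "unhealthy")'
}

# On a live cluster (DaemonSet name as deployed by the manifest above):
#   kubectl -n kube-system get daemonset nvidia-device-plugin-daemonset -o json | ds_healthy
```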
Check plugin logs for errors such as missing NVIDIA runtime configuration.
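A common cause of such errors is that the kubelet (when driving Docker through cri-dockerd or the older dockershim) starts pods with Docker's default runtime rather than `nvidia`. The usual fix, per NVIDIA's device-plugin documentation, is to make the NVIDIA runtime the default; merged into the daemon.json shown earlier, the relevant part looks roughly like this:

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "args": []
    }
  }
}
```

Restart Docker after editing the file, then delete the device-plugin pods so they are relaunched with GPU access.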
6. Label GPU nodes and schedule pods
# kubectl label nodes aiserver003087 gpu=true

Update the pod specification with a nodeSelector so that it runs only on nodes carrying the gpu=true label.
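Before relying on the label, it is worth confirming that the device plugin has actually registered GPU capacity on that node. A sketch, assuming `python3` (the `gpu_capacity` helper is ours):

```shell
# Sketch: print a node's nvidia.com/gpu capacity, reading the JSON from
# `kubectl get node <name> -o json` on stdin. Prints "0" when the resource
# is absent. The helper name is illustrative.
gpu_capacity() {
  python3 -c 'import json, sys
node = json.load(sys.stdin)
print(node["status"]["capacity"].get("nvidia.com/gpu", "0"))'
}

# On a live cluster:
#   kubectl get node aiserver003087 -o json | gpu_capacity
```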
Example GPU test pod ( gpu_test.yaml ):
apiVersion: v1
kind: Pod
metadata:
  name: ffmpeg-pod
spec:
  containers:
  - name: ffmpeg-container
    image: nightseas/ffmpeg:latest
    command: ["/bin/bash", "-c", "tail -f /dev/null"]
    resources:
      limits:
        nvidia.com/gpu: 1
  nodeSelector:
    gpu: "true"

Create the pod, copy a video file into it, and run an FFmpeg command that uses GPU acceleration:
# kubectl cp test.mp4 ffmpeg-pod:/root
# kubectl exec -it ffmpeg-pod -- bash
# ffmpeg -hwaccel cuvid -c:v h264_cuvid -i test.mp4 -vf scale_npp=1280:720 -vcodec h264_nvenc out.mp4

If out.mp4 is produced successfully, the GPU is being utilized by the container.
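When scripting this end-to-end test, avoid declaring success on a zero-byte output file; a trivial sketch (the `transcode_ok` helper is ours; for stronger evidence, run `nvidia-smi` inside the pod while the transcode is in progress and look for the ffmpeg process):

```shell
# Sketch: treat the transcode as successful only if the output file exists
# and is non-empty. The helper name is illustrative.
transcode_ok() {
  if [ -s "$1" ]; then echo "ok"; else echo "failed"; fi
}

# usage: transcode_ok out.mp4
```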