Configuring NVIDIA Docker Plugin and GPU Access in Kubernetes
This guide walks through installing the NVIDIA container toolkit, configuring Docker to use the NVIDIA runtime, verifying GPU access, deploying the NVIDIA device plugin in Kubernetes, labeling GPU nodes, and running a GPU‑accelerated FFmpeg pod to confirm successful GPU integration.
1. Environment check
Verify the operating system and Kubernetes client version:
# lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy

# cat /etc/redhat-release
Rocky Linux release 9.3 (Blue Onyx)

# kubectl version
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.25.16
WARNING: version difference between client (1.30) and server (1.25) exceeds the supported minor version skew of +/-1

2. Install NVIDIA Docker plugin
Add the NVIDIA repository, enable experimental packages, and install the toolkit:
# curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

# sudo apt-get update

# sudo apt-get install -y nvidia-container-toolkit

3. Configure Docker to use the NVIDIA runtime
# sudo nvidia-ctk runtime configure --runtime=docker
INFO[0000] Loading config from /etc/docker/daemon.json
INFO[0000] Wrote updated config to /etc/docker/daemon.json
INFO[0000] It is recommended that docker daemon be restarted.

The command adds a runtimes section to /etc/docker/daemon.json:
{
  "insecure-registries": ["192.168.3.61"],
  "registry-mirrors": ["https://7sl94zzz.mirror.aliyuncs.com", "https://hub.atomgit.com", "https://docker.awsl9527.cn"],
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "args": []
    }
  }
}

Restart Docker:
# systemctl daemon-reload
# systemctl restart docker

4. Verify Docker GPU access
# docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

The container should display detailed GPU information, confirming that Docker can access the NVIDIA GPU.
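If the container fails to start, it can help to first confirm that the nvidia entry really landed in the daemon configuration. The sketch below is a rough text match, not a real JSON parse. The function name has_nvidia_runtime is made up for illustration, and the default path is the one reported in the nvidia-ctk output.

```shell
#!/usr/bin/env bash
# Rough check: does a daemon.json-style file register an "nvidia" runtime?
# Assumption: a plain text match is good enough here; this does not parse JSON.
has_nvidia_runtime() {
    local config_file="${1:-/etc/docker/daemon.json}"
    # Exit status 2 means the file could not be read at all.
    [ -r "$config_file" ] || return 2
    # Strip all whitespace so the pattern matches across line breaks,
    # then look for an "nvidia" key somewhere after "runtimes".
    tr -d '[:space:]' < "$config_file" | grep -q '"runtimes":{.*"nvidia":{'
}
```

For example, `has_nvidia_runtime && echo "nvidia runtime registered"` checks the default path. On a live host, `docker info --format '{{json .Runtimes}}'` reports what the daemon actually loaded after the restart.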
5. Deploy NVIDIA device plugin in Kubernetes
# kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.16.1/deployments/static/nvidia-device-plugin.yml

Key parts of the nvidia-device-plugin.yml (DaemonSet) include the tolerations for nvidia.com/gpu and the container image nvcr.io/nvidia/k8s-device-plugin:v0.16.1.
Check plugin logs for errors such as missing NVIDIA runtime configuration.
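The log check can be paired with a look at whether the node now advertises the nvidia.com/gpu resource. The helpers below are a sketch under a few assumptions. The function names are hypothetical. The grep pattern is a guess at what a problem line looks like, not a list of messages the plugin is known to emit. The DaemonSet name and namespace in the usage comments are what I believe the static manifest uses, so check them against your cluster.

```shell
#!/usr/bin/env bash
# Print how many nvidia.com/gpu resources a node reports as allocatable.
# Prints 0 when the resource is absent (the plugin has not registered any GPU).
node_gpu_count() {
    local node="$1" count
    count=$(kubectl get node "$node" \
        -o jsonpath='{.status.allocatable.nvidia\.com/gpu}') || return 1
    echo "${count:-0}"
}

# Read log text on stdin and keep only lines that look like problems.
# Assumption: the keywords below are illustrative, not taken from the plugin.
plugin_log_problems() {
    grep -iE 'error|fail|unable|could not' || true
}

# Usage (names assumed, adjust to your cluster):
#   kubectl -n kube-system logs daemonset/nvidia-device-plugin-daemonset | plugin_log_problems
#   node_gpu_count aiserver003087
```

If the count stays at 0 and the logs point at the runtime, one thing worth checking is whether nvidia is Docker's default runtime. Recent nvidia-ctk versions accept `--set-as-default` on `runtime configure` for this, although that goes beyond what the steps above configure.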
6. Label GPU nodes and schedule pods
# kubectl label nodes aiserver003087 gpu=true

Update the pod specification to use a nodeSelector so that it runs only on nodes with the gpu=true label.
Example GPU test pod (gpu_test.yaml):
apiVersion: v1
kind: Pod
metadata:
  name: ffmpeg-pod
spec:
  containers:
  - name: ffmpeg-container
    image: nightseas/ffmpeg:latest
    command: ["/bin/bash", "-c", "tail -f /dev/null"]
    resources:
      limits:
        nvidia.com/gpu: 1
  nodeSelector:
    gpu: "true"

Create the pod, copy a video file into it, and run an FFmpeg command that uses GPU acceleration:
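# kubectl create -f gpu_test.yaml
# kubectl cp test.mp4 ffmpeg-pod:/root
# kubectl exec -it ffmpeg-pod -- bash
# ffmpeg -hwaccel cuvid -c:v h264_cuvid -i test.mp4 -vf scale_npp=1280:720 -vcodec h264_nvenc out.mp4

If the command completes and writes out.mp4, the container is using the GPU: the decoder (h264_cuvid), the scaler (scale_npp), and the encoder (h264_nvenc) all run on NVIDIA hardware and have no CPU fallback.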
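The same check can be run without an interactive shell, which is handy for repeating it on several nodes. This is a sketch rather than a tested procedure. The function name run_gpu_transcode and the 120-second timeout are arbitrary choices, and the absolute /root paths assume the file was copied to /root as in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: create the test pod, wait for it, copy the input, transcode, verify.
# POD matches metadata.name in gpu_test.yaml.
POD="ffmpeg-pod"

run_gpu_transcode() {
    kubectl create -f gpu_test.yaml || return 1
    # Block until the container is running before copying files into it.
    kubectl wait --for=condition=Ready "pod/$POD" --timeout=120s || return 1
    kubectl cp test.mp4 "$POD:/root" || return 1
    # Same FFmpeg options as the interactive example, with absolute paths.
    kubectl exec "$POD" -- ffmpeg -hwaccel cuvid -c:v h264_cuvid \
        -i /root/test.mp4 -vf scale_npp=1280:720 -vcodec h264_nvenc \
        /root/out.mp4 || return 1
    # test -s succeeds only if the output file exists and is non-empty.
    kubectl exec "$POD" -- test -s /root/out.mp4
}
```

Calling `run_gpu_transcode && echo "GPU transcode OK"` returns a non-zero status at the first step that fails, so the failing stage is the last kubectl command printed.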
Java Tech Enthusiast
