Configuring NVIDIA Docker Plugin and GPU Access in Kubernetes
This guide walks through installing the NVIDIA container toolkit, configuring Docker to use the NVIDIA runtime, verifying GPU access, deploying the NVIDIA device plugin in Kubernetes, labeling GPU nodes, and running a GPU-accelerated FFmpeg pod to confirm successful GPU integration.
1. Environment check
Verify the operating system and Kubernetes client version:
# lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
# kubectl version
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.25.16
WARNING: version difference between client (1.30) and server (1.25) exceeds the supported minor version skew of +/-1

2. Install NVIDIA Docker plugin
Add the NVIDIA repository, enable experimental packages, and install the toolkit:
# curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
# sudo apt-get update
# sudo apt-get install -y nvidia-container-toolkit

3. Configure Docker to use the NVIDIA runtime
# sudo nvidia-ctk runtime configure --runtime=docker
INFO[0000] Loading config from /etc/docker/daemon.json
INFO[0000] Wrote updated config to /etc/docker/daemon.json
INFO[0000] It is recommended that docker daemon be restarted.

The command adds a runtimes section to /etc/docker/daemon.json:
{
  "insecure-registries": ["192.168.3.61"],
  "registry-mirrors": ["https://7sl94zzz.mirror.aliyuncs.com", "https://hub.atomgit.com", "https://docker.awsl9527.cn"],
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "args": []
    }
  }
}

Restart Docker:
# systemctl daemon-reload
# systemctl restart docker

4. Verify Docker GPU access
# docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

The container should display detailed GPU information, confirming that Docker can access the NVIDIA GPU.
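For automation, the same precondition can be asserted from a script. A minimal sketch, assuming the default daemon.json location (the `check_nvidia_runtime` helper is ours, not a Docker feature):

```shell
# Sketch: report whether a Docker daemon.json registers the "nvidia" runtime.
# The helper name is illustrative; the argument defaults to Docker's standard
# config path.
check_nvidia_runtime() {
  local conf="${1:-/etc/docker/daemon.json}"
  if grep -q '"nvidia"' "$conf" 2>/dev/null; then
    echo "nvidia runtime configured"
  else
    echo "nvidia runtime missing"
  fi
}
```

On the host configured above, `check_nvidia_runtime` should print `nvidia runtime configured`.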
5. Deploy NVIDIA device plugin in Kubernetes
# kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.16.1/deployments/static/nvidia-device-plugin.yml

Key parts of the nvidia-device-plugin.yml (DaemonSet) include the tolerations for nvidia.com/gpu and the container image nvcr.io/nvidia/k8s-device-plugin:v0.16.1.
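One quick health check is to compare the DaemonSet's desired and ready pod counts. A sketch, assuming `python3` is available on the admin host (the `ds_healthy` helper and the exact DaemonSet name are ours; verify the name against your cluster):

```shell
# Sketch: report whether a DaemonSet is fully rolled out, reading the JSON
# from `kubectl get daemonset <name> -o json` on stdin. Requires python3;
# the helper name is illustrative.
ds_healthy() {
  python3 -c 'import json, sys
status = json.load(sys.stdin)["status"]
ready = status.get("numberReady", 0)
desired = status.get("desiredNumberScheduled", -1)
print("healthy" if ready == desired else "unhealthy")'
}

# On a live cluster (DaemonSet name as deployed by the manifest above):
#   kubectl -n kube-system get daemonset nvidia-device-plugin-daemonset -o json | ds_healthy
```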
Check plugin logs for errors such as missing NVIDIA runtime configuration.
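A common cause of such errors is that the kubelet (when driving Docker through cri-dockerd or the older dockershim) starts pods with Docker's default runtime rather than `nvidia`. The usual fix, per NVIDIA's device-plugin documentation, is to make the NVIDIA runtime the default; merged into the daemon.json shown earlier, the relevant part looks roughly like this:

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "args": []
    }
  }
}
```

Restart Docker after editing the file, then delete the device-plugin pods so they are relaunched with GPU access.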
6. Label GPU nodes and schedule pods
# kubectl label nodes aiserver003087 gpu=true

Update the pod specification with a nodeSelector so that it runs only on nodes carrying the gpu=true label.
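Before relying on the label, it is worth confirming that the device plugin has actually registered GPU capacity on that node. A sketch, assuming `python3` (the `gpu_capacity` helper is ours):

```shell
# Sketch: print a node's nvidia.com/gpu capacity, reading the JSON from
# `kubectl get node <name> -o json` on stdin. Prints "0" when the resource
# is absent. The helper name is illustrative.
gpu_capacity() {
  python3 -c 'import json, sys
node = json.load(sys.stdin)
print(node["status"]["capacity"].get("nvidia.com/gpu", "0"))'
}

# On a live cluster:
#   kubectl get node aiserver003087 -o json | gpu_capacity
```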
Example GPU test pod ( gpu_test.yaml ):
apiVersion: v1
kind: Pod
metadata:
  name: ffmpeg-pod
spec:
  containers:
  - name: ffmpeg-container
    image: nightseas/ffmpeg:latest
    command: ["/bin/bash", "-c", "tail -f /dev/null"]
    resources:
      limits:
        nvidia.com/gpu: 1
  nodeSelector:
    gpu: "true"

Create the pod, copy a video file into it, and run an FFmpeg command that uses GPU acceleration:
# kubectl cp test.mp4 ffmpeg-pod:/root
# kubectl exec -it ffmpeg-pod -- bash
# ffmpeg -hwaccel cuvid -c:v h264_cuvid -i test.mp4 -vf scale_npp=1280:720 -vcodec h264_nvenc out.mp4

If out.mp4 is produced successfully, the GPU is being utilized by the container.
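When scripting this end-to-end test, avoid declaring success on a zero-byte output file; a trivial sketch (the `transcode_ok` helper is ours; for stronger evidence, run `nvidia-smi` inside the pod while the transcode is in progress and look for the ffmpeg process):

```shell
# Sketch: treat the transcode as successful only if the output file exists
# and is non-empty. The helper name is illustrative.
transcode_ok() {
  if [ -s "$1" ]; then echo "ok"; else echo "failed"; fi
}

# usage: transcode_ok out.mp4
```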