Enable GPU Support in Kubernetes with Containerd and NVIDIA Runtime
This guide walks through installing NVIDIA drivers, CUDA toolkit, nvidia-container-runtime, configuring Containerd, deploying the NVIDIA device plugin, and testing GPU access inside Kubernetes pods, providing a complete solution for GPU workloads on containerd‑based clusters.
Kubernetes previously used Docker through the dockershim component, but dockershim is now deprecated, so GPU access has to be configured on containerd directly. Pods can request GPU resources from Kubernetes, which enables deep-learning and blockchain workloads.
1. Install NVIDIA driver
Install the NVIDIA driver on the host, preferably using the script from the official NVIDIA website; ensure gcc and kernel headers are present.
Install gcc and the kernel headers:
$ sudo apt install gcc linux-headers-$(uname -r) -y
Download the driver from the official site: select the operating system and version, then download.
Example download command:
$ wget 'https://www.nvidia.com/content/DriverDownload-March2009/confirmation.php?url=/tesla/450.80.02/NVIDIA-Linux-x86_64-450.80.02.run&type=Tesla'
Install the driver:
$ chmod +x NVIDIA-Linux-x86_64-450.80.02.run && sudo ./NVIDIA-Linux-x86_64-450.80.02.run
Verify the installation with nvidia-smi, which should list the installed GPU and the driver version.
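The verification step can also be scripted so that it fails loudly on hosts where the driver is missing. The following is a minimal sketch, not part of the original guide: the function name check_nvidia_driver is hypothetical, and it relies only on the standard --query-gpu and --format options of nvidia-smi.

```shell
#!/bin/sh
# check_nvidia_driver (hypothetical helper): report whether the NVIDIA
# driver is usable on this host.
# Prints one "name, driver_version" line per GPU and returns 0 on success.
# Returns 1 if nvidia-smi is missing or cannot talk to the driver.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: driver not installed or not on PATH" >&2
        return 1
    fi
    if ! nvidia-smi --query-gpu=name,driver_version --format=csv,noheader; then
        echo "nvidia-smi failed: the kernel module may not be loaded" >&2
        return 1
    fi
}
```

Calling check_nvidia_driver right after the installer finishes gives a quick pass/fail signal before moving on to the CUDA toolkit.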
2. Install CUDA toolkit
CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform.
Download the appropriate CUDA version from NVIDIA and add it to the PATH:
$ echo 'export PATH=/usr/local/cuda/bin:$PATH' | sudo tee /etc/profile.d/cuda.sh
$ source /etc/profile
3. Install nvidia-container-runtime
The nvidia‑container‑runtime adds a hook that mounts GPU devices when the NVIDIA_VISIBLE_DEVICES environment variable is set.
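To make the role of NVIDIA_VISIBLE_DEVICES more concrete, the sketch below classifies the values the variable can take. It is illustrative only: describe_visible_devices is a hypothetical helper, and the behaviour it prints is a summary of the values documented for the runtime (device indexes or UUIDs, all, none, void or unset), not code taken from the runtime itself.

```shell
#!/bin/sh
# describe_visible_devices (hypothetical helper): explain how a given
# NVIDIA_VISIBLE_DEVICES value is expected to be treated by the hook.
# Assumption: semantics follow the runtime's documented values.
describe_visible_devices() {
    case "${1-}" in
        ""|void) echo "hook inactive: container runs as with plain runc" ;;
        none)    echo "no GPUs exposed, driver capabilities still enabled" ;;
        all)     echo "all GPUs exposed" ;;
        *)       echo "GPUs exposed: $1" ;;   # comma-separated indexes or UUIDs
    esac
}
```

For example, describe_visible_devices "0,1" prints "GPUs exposed: 0,1", while an empty value reports that the hook stays inactive.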
Set the repository and GPG key:
$ curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-container-runtime/$(. /etc/os-release; echo $ID$VERSION_ID)/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
Install the runtime:
$ sudo apt update && sudo apt install nvidia-container-runtime -y
Configure Containerd to use NVIDIA runtime
If /etc/containerd does not exist, create it, then generate the default configuration and modify it to use the NVIDIA runtime.
$ mkdir -p /etc/containerd
$ containerd config default > /etc/containerd/config.toml
... edit /etc/containerd/config.toml to set runtime_type and runtime to nvidia-container-runtime ...
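Before restarting containerd, it can help to confirm that the edit actually landed in the file. The sketch below is an assumption-laden sanity check rather than a validation of the full configuration: containerd_uses_nvidia is a hypothetical helper that only looks for the string nvidia-container-runtime on a line with no preceding # comment marker.

```shell
#!/bin/sh
# containerd_uses_nvidia (hypothetical helper): succeed if the containerd
# config references nvidia-container-runtime outside of a comment.
# Usage: containerd_uses_nvidia [path-to-config.toml]
# Returns 0 if found, 1 if not found, 2 if the file cannot be read.
containerd_uses_nvidia() {
    config="${1:-/etc/containerd/config.toml}"
    if [ ! -r "$config" ]; then
        echo "cannot read $config" >&2
        return 2
    fi
    # [^#]* from the start of the line rejects matches that follow a '#'.
    grep -Eq '^[^#]*nvidia-container-runtime' "$config"
}
```

A typical use is containerd_uses_nvidia || echo "config not updated yet", run just before the restart.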
$ systemctl restart containerd
4. Deploy NVIDIA device plugin
Apply the device‑plugin manifest:
$ kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.7.1/nvidia-device-plugin.yml
Check the plugin logs to confirm it is running.
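Once the plugin is running, the node should advertise the nvidia.com/gpu resource. The sketch below shows one way to check both the logs and the advertised count. The helper names are hypothetical, and the kube-system namespace and the name=nvidia-device-plugin-ds label are assumptions about the manifest applied above, so adjust them if your manifest differs.

```shell
#!/bin/sh
# device_plugin_logs (hypothetical helper): print the plugin pod logs.
# Assumption: the DaemonSet runs in kube-system with this label.
device_plugin_logs() {
    kubectl -n kube-system logs -l name=nvidia-device-plugin-ds
}

# node_gpu_allocatable NODE: print how many nvidia.com/gpu devices the node
# advertises as allocatable (empty output if the resource is absent).
node_gpu_allocatable() {
    kubectl get node "$1" -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
}

# node_has_gpu NODE: succeed only if the node advertises at least one GPU.
node_has_gpu() {
    count=$(node_gpu_allocatable "$1") || return 1
    case "$count" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: not advertised
    esac
    [ "$count" -gt 0 ]
}
```

For example, node_has_gpu my-gpu-node && echo "GPU is schedulable", where my-gpu-node is a placeholder for a real node name from kubectl get nodes.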
5. Test GPU inside a pod
Pull a CUDA image and run nvidia-smi with --gpus 0 using ctr to verify host GPU visibility. In the ctr run command, the first nvidia-smi is the container ID and the second is the command to execute.
$ ctr images pull docker.io/nvidia/cuda:9.0-base
$ ctr run --rm -t --gpus 0 docker.io/nvidia/cuda:9.0-base nvidia-smi nvidia-smi
Create a pod manifest (gpu-pod.yaml) that requests one GPU:
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1
Deploy the pod and verify it completes successfully, then view the pod logs to see the CUDA vector-add test output.
$ kubectl apply -f ./gpu-pod.yaml
$ kubectl get pod
$ kubectl logs cuda-vector-add
The logs show successful GPU computation, confirming that Kubernetes can schedule and run GPU-accelerated workloads. Note that Kubernetes currently schedules GPUs at the granularity of whole cards: each GPU is allocated exclusively to a single container and cannot be shared between containers.
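The manual check above can be turned into a small polling helper for use in scripts. This is a sketch under stated assumptions: wait_for_pod_success is a hypothetical function, the 2-second polling interval and 120-second default timeout are arbitrary choices, and it reads the standard .status.phase field of the pod.

```shell
#!/bin/sh
# wait_for_pod_success POD [TIMEOUT_SECONDS] (hypothetical helper)
# Polls the pod phase until it is Succeeded (return 0) or Failed (return 1),
# or until the timeout elapses (return 1).
wait_for_pod_success() {
    pod="$1"
    timeout="${2:-120}"
    elapsed=0
    phase=""
    while [ "$elapsed" -lt "$timeout" ]; do
        phase=$(kubectl get pod "$pod" -o jsonpath='{.status.phase}' 2>/dev/null) || phase=""
        case "$phase" in
            Succeeded) return 0 ;;
            Failed)    echo "pod $pod failed" >&2; return 1 ;;
        esac
        sleep 2
        elapsed=$((elapsed + 2))
    done
    echo "timed out waiting for pod $pod (last phase: ${phase:-unknown})" >&2
    return 1
}
```

A possible use is wait_for_pod_success cuda-vector-add && kubectl logs cuda-vector-add. The vector-add sample is expected to end its output with a "Test PASSED" line, but treat that exact string as an assumption and check it against your own logs.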
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"