Enable GPU Acceleration in Kubernetes with NVIDIA Device Plugin
This guide explains how to set up NVIDIA drivers, install the NVIDIA device plugin, and create a Kubernetes pod that requests GPU resources, providing step‑by‑step commands and a sample YAML manifest for GPU‑enabled workloads.
Kubernetes is a popular container orchestration system that can schedule large‑scale containerized workloads. When applications require GPU resources, Kubernetes can expose those GPUs to pods, enabling high‑performance computing for machine‑learning or graphics workloads.
What is a GPU?
A GPU (Graphics Processing Unit) is hardware designed to accelerate graphics, image, and video processing. Unlike a CPU, which is optimized for sequential work, a GPU can execute thousands of operations in parallel, which makes it well suited to machine‑learning and deep‑learning workloads.
Using GPUs in Kubernetes
To schedule GPUs, you must install the NVIDIA driver and CUDA toolkit on each node that has a GPU, and then deploy the NVIDIA device plugin to the cluster. The device plugin registers the GPU resource (nvidia.com/gpu) with the Kubernetes scheduler so that pods can request it.
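Before installing the plugin it is worth confirming that the driver is visible on the GPU node, and afterwards that the node actually advertises the new resource. A minimal sanity check might look like the following, where gpu-node-1 is a placeholder for one of your own node names:

# On the GPU node: the driver is installed correctly if nvidia-smi lists the GPU
nvidia-smi

# From any machine with kubectl access: once the device plugin has registered,
# the node should report nvidia.com/gpu under Capacity and Allocatable
kubectl describe node gpu-node-1 | grep -i 'nvidia.com/gpu'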
Step‑by‑step guide
Deploy the NVIDIA device plugin:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.9.0/nvidia-device-plugin.yml
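The manifest deploys the device plugin as a DaemonSet in the kube-system namespace. One quick way to confirm it started on the GPU nodes (the exact pod name can vary between plugin versions) is:

# List the device plugin pods created by the DaemonSet
kubectl get pods -n kube-system | grep nvidia-device-plugin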
Create a pod specification that requests a GPU, for example gpu-pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: gpu-container
      image: nvidia/cuda:9.0-runtime
      resources:
        limits:
          nvidia.com/gpu: 1

Apply the pod definition:

kubectl apply -f gpu-pod.yaml

Verify that the pod is running and that a GPU has been allocated:

kubectl describe pod gpu-pod

If the command output shows an allocated nvidia.com/gpu resource, the pod is successfully using the GPU.
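As an additional check, and assuming the NVIDIA container runtime is configured on the node, you can run nvidia-smi inside the container; the GPU allocated to the pod should appear in its output:

# Run nvidia-smi inside the running pod to confirm the GPU is visible to the container
kubectl exec gpu-pod -- nvidia-smi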