Tagged articles
15 articles
Infra Learning Club
Feb 26, 2025 · Cloud Native

How to Contribute to the HAMI Open‑Source Project: A Beginner’s Guide

This guide walks new contributors through the HAMI Kubernetes device‑management middleware, covering its core capabilities, repository structure, development environment setup, build steps, and testing procedures using kwok and fake‑gpu to simulate large‑scale GPU scheduling scenarios.

Device Plugin · GPU virtualization · HAMI
10 min read
System Architect Go
Nov 6, 2024 · Cloud Native

How Kubernetes Extended Resources Enable Custom Scheduling (and Their Limits)

This article explains how Kubernetes Extended Resources let you define custom resource types, describes the creation, synchronization, and scheduling workflow, points out that a node's allocatable status is not updated in real time, and discusses practical limitations and the role of Device Plugins and Operators.

Cluster Management · Custom Scheduling · Device Plugin
6 min read
Infra Learning Club
Sep 16, 2024 · Cloud Native

Survey of GPU Sharing and Virtualization Solutions for Kubernetes

The article surveys open‑source GPU sharing and virtualization approaches for AI workloads, comparing soft isolation, CUDA‑level isolation, NVIDIA MPS, driver‑level isolation, GPU pooling and deep‑learning memory sharing, and highlights their architectures, isolation guarantees, and performance trade‑offs.

Device Plugin · GPU · Kubernetes
5 min read
Infra Learning Club
Sep 5, 2024 · Cloud Native

Deep Dive into Kubelet’s DeviceManager Source Code

This article explains how Kubernetes uses the device‑plugin framework to extend resources beyond CPU and memory, details the kubelet registration and allocation workflow, and walks through the relevant source code in pkg/kubelet/cm/devicemanager that builds the OCI spec.

CDI · DRA · Device Plugin
5 min read
AsiaInfo Technology: New Tech Exploration
Aug 30, 2024 · Industry Insights

How GPU Virtualization Powers AI and Cloud Computing: Techniques, Challenges, and Future Directions

This article examines the rapid rise of GPU virtualization as a solution for efficient GPU resource utilization in AI, big data, and high‑performance computing, detailing its concepts, implementation methods across user, kernel, and hardware layers, Kubernetes integration, real‑world use cases, challenges, and emerging research trends.

Device Plugin · GPU virtualization · Kubernetes
25 min read
Cloud Native Technology Community
Mar 11, 2024 · Cloud Native

Harnessing Nvidia GPUs in Kubernetes: Virtualization, Scheduling & Best Practices

This article explains how to combine NVIDIA GPUs with Kubernetes, covering CUDA toolkits, device plugins, GPU virtualization techniques such as time‑slicing, MPS, and MIG, and advanced scheduling options like Volcano, while also outlining practical deployment steps and performance considerations.

Cloud Native · Device Plugin · GPU virtualization
22 min read
MaGe Linux Operations
Mar 5, 2024 · Cloud Native

How to Run GPU‑Accelerated AI Workloads on Kubernetes

This article explains how Kubernetes supports GPU workloads for AI and machine learning, covering device plugins, pod GPU requests, oversubscription, security isolation, cloud‑provider node setup, and protecting GPU nodes from non‑GPU pods.

AI workloads · Cloud Native · Device Plugin
8 min read
Alibaba Cloud Native
Jan 13, 2020 · Cloud Native

How to Manage GPU Resources in Kubernetes: From Containers to Device Plugins

This article explains why managing GPUs with Kubernetes improves cost efficiency and deployment speed, details how to containerize GPU workloads, build appropriate images, configure NVIDIA drivers, and use Kubernetes Device Plugins and Extended Resources to schedule and monitor GPU resources, and discusses current limitations and community solutions.

Device Plugin · GPU · Kubernetes
18 min read