
How to Contribute to the HAMI Open‑Source Project: A Beginner’s Guide

This guide walks new contributors through the HAMI Kubernetes device‑management middleware, covering its core capabilities, repository structure, development environment setup, build steps, and testing procedures using kwok and fake‑gpu to simulate large‑scale GPU scheduling scenarios.


Project Overview

HAMI (formerly k8s-vGPU-scheduler) is a Kubernetes middleware that manages heterogeneous devices (GPU, NPU, MLU, DCU). It provides virtualization, device sharing, resource isolation and dynamic MIG slicing.

Key capabilities

Virtualization for multiple device types (see supported devices list [1]).

Device sharing: a device can be partitioned by compute-core percentage and device memory (MB), with hard limits on streaming multiprocessors; no application code changes are required. Dynamic MIG slicing is also supported [2] (example [3]).
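As an illustration of device sharing, a container requests a slice of a GPU through extended resources. The resource names below (nvidia.com/gpu, nvidia.com/gpumem, nvidia.com/gpucores) follow HAMi's documentation, and the image and values are placeholders; verify the names against your installed version:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-gpu-demo
spec:
  containers:
  - name: worker
    image: nvidia/cuda:12.4.0-base-ubuntu22.04   # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1         # number of vGPUs
        nvidia.com/gpumem: 3000   # device memory in MB
        nvidia.com/gpucores: 30   # percentage of SM cores
```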

Hard isolation for NVIDIA GPU resources.

Supported devices

NVIDIA

Cambricon (MLU)

Hygon (DCU)

Biren

Moore Threads

Ascend

MetaX

Repository structure

Development spans two repositories:

HAMI – core scheduling logic, monitoring, and the NVIDIA vGPU device-plugin implementation.

HAMI-Core – the core logic for NVIDIA vGPU virtualization.

Directory layout (generated with tree -L 2 cmd):

cmd
├── device-plugin
│   └── nvidia
├── scheduler
│   ├── main.go
│   └── metrics.go
└── vGPUmonitor
    ├── build.sh
    ├── feedback.go
    ├── main.go
    ├── metrics.go
    ├── noderpc
    ├── testcollector
    └── validation.go

Development workflow

HAMI requires a CUDA‑enabled environment; development should be performed on a GPU‑equipped machine.

Update the submodule to fetch the hami-core code:

$ git submodule update

Disable Docker BuildKit to avoid image-build issues:

$ export DOCKER_BUILDKIT=0

Build the container image (example tag dev):

$ SHORT_VERSION=dev IMAGE=projecthami/hami bash ./hack/build.sh

After building, replace the image in the HAMI deployment or run the binary directly with the debug command for rapid iteration.

If your changes live only in hami-core and have not yet been merged upstream, edit .gitmodules to point the submodule at your local path.
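A sketch of what such a .gitmodules edit might look like; the submodule name and path below are assumptions for illustration, so check the repository's actual .gitmodules for the real entry:

```ini
# Hypothetical: point the submodule at a local HAMi-core checkout
# instead of the upstream URL while iterating on unmerged changes.
[submodule "libvgpu"]
	path = libvgpu
	url = /home/dev/HAMi-core
```

After editing, run `git submodule sync` followed by `git submodule update` so Git picks up the new URL.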

Increase log verbosity for hami-core by setting:

$ export LIBCUDA_LOG_LEVEL=4

Testing

Scheduler testing with kwok

Use the kwok project to simulate many nodes [4]. Steps:

Install kwok per its official documentation.

Create a virtual node using a YAML that includes the annotation hami.io/node-nvidia-register with the desired device description.

When creating a Pod, add a toleration for the taint kwok.x-k8s.io/node so the scheduler can place the pod on the virtual node.
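The toleration in the last step can be sketched as a test Pod; the image and resource count are placeholders, not part of the original guide:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vgpu-test
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.0-base-ubuntu22.04   # placeholder image
    command: ["sleep", "infinity"]
    resources:
      limits:
        nvidia.com/vgpu: 1       # one virtual GPU from the simulated node
  tolerations:
  - key: kwok.x-k8s.io/node      # tolerate the kwok taint
    operator: Equal
    value: fake
    effect: NoSchedule
```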

Example node definition:

apiVersion: v1
kind: Node
metadata:
  annotations:
    node.alpha.kubernetes.io/ttl: "0"
    kwok.x-k8s.io/node: fake
    hami.io/node-nvidia-register: >-
      GPU-0,10,15360,200,NVIDIA-Tesla
      P4,0,true:
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: kwok-node-0
    kubernetes.io/os: linux
    kubernetes.io/role: agent
    node-role.kubernetes.io/agent: ""
    type: kwok
  name: nvidia-vgpu-node
spec:
  taints:
  - effect: NoSchedule
    key: kwok.x-k8s.io/node
    value: fake
status:
  allocatable:
    cpu: 32
    memory: 256Gi
    pods: 110
    nvidia.com/vgpu: 10
  capacity:
    cpu: 32
    memory: 256Gi
    pods: 110
    nvidia.com/vgpu: 10
  nodeInfo:
    architecture: amd64
    operatingSystem: linux
  phase: Running
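The device description in the hami.io/node-nvidia-register annotation above is a colon-separated list of devices, each a comma-separated record. The field order below (uuid, count, memory in MB, core, type, NUMA node, health) is my reading of the example string, not an authoritative schema; check the HAMi source before relying on it. A minimal parser sketch:

```python
def parse_register_annotation(value: str) -> list[dict]:
    """Parse a hami.io/node-nvidia-register style annotation string.

    Assumed field order per device record:
    uuid, count, memory_mb, core, type, numa, healthy
    """
    devices = []
    for entry in value.split(":"):
        entry = entry.strip()
        if not entry:
            continue  # trailing colon yields an empty record
        fields = entry.split(",")
        if len(fields) != 7:
            raise ValueError(f"unexpected field count in {entry!r}")
        uuid, count, mem, core, dev_type, numa, healthy = fields
        devices.append({
            "uuid": uuid,
            "count": int(count),
            "memory_mb": int(mem),
            "core": int(core),
            "type": dev_type,
            "numa": int(numa),
            "healthy": healthy == "true",
        })
    return devices
```

For the node above, `parse_register_annotation("GPU-0,10,15360,200,NVIDIA-Tesla P4,0,true:")` yields a single device record describing ten shareable slots on a 15360 MB Tesla P4.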

Device‑plugin testing with fake‑gpu

Use the fake-gpu project to inject configured devices into the device‑plugin [5]. Steps:

Install fake-gpu according to its documentation.

Edit conf/fake-gpu.yaml to define the devices to be injected.

Combining fake-gpu with kwok enables large‑scale GPU functionality testing.

Current status

HAMI runs inside containers; VM support is not yet available.

References:

[1] Supported devices list: https://github.com/Project-HAMi/HAMi/blob/master/README_cn.md#%E6%94%AF%E6%8C%81%E7%9A%84%E8%AE%BE%E5%A4%87

[2] Dynamic MIG slicing: https://github.com/Project-HAMi/HAMi/blob/master/docs/dynamic-mig-support_cn.md

[3] Example YAML: https://github.com/Project-HAMi/HAMi/blob/master/examples/nvidia/dynamic_mig_example.yaml

[4] kwok repository: https://github.com/kubernetes-sigs/kwok

[5] fake-gpu repository: https://github.com/chaunceyjiang/fake-gpu

Written by Infra Learning Club, which shares study notes, cutting-edge technology, and career discussions.