How to Contribute to the HAMI Open‑Source Project: A Beginner’s Guide
This guide walks new contributors through the HAMI Kubernetes device‑management middleware, covering its core capabilities, repository structure, development environment setup, build steps, and testing procedures using kwok and fake‑gpu to simulate large‑scale GPU scheduling scenarios.
Project Overview
HAMI (formerly k8s-vGPU-scheduler) is a Kubernetes middleware that manages heterogeneous devices (GPU, NPU, MLU, DCU). It provides virtualization, device sharing, resource isolation and dynamic MIG slicing.
Key capabilities
Virtualization for multiple device types (see supported devices list [1]).
Device sharing: allocate a GPU by core‑usage percentage or by device memory (MB), with hard limits on streaming multiprocessors; no application code changes are required. Dynamic MIG slicing is also supported [2] (example [3]).
Hard isolation for NVIDIA GPU resources.
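To make the sharing model concrete, here is a minimal sketch that builds a Pod manifest asking for a slice of one GPU. The resource names nvidia.com/gpu, nvidia.com/gpumem (MB) and nvidia.com/gpucores (percent) are assumed from the project README defaults and may be configured differently in your cluster; the helper name, Pod name and image are hypothetical.

```python
import json


def build_shared_gpu_pod(name, image, gpus=1, mem_mb=3000, core_pct=30,
                         prefix="nvidia.com"):
    """Return a Pod manifest (as a dict) requesting a fraction of a GPU.

    Assumption: the cluster exposes <prefix>/gpu, <prefix>/gpumem and
    <prefix>/gpucores; check your scheduler configuration for the real names.
    """
    if not 0 < core_pct <= 100:
        raise ValueError("core_pct must be in (0, 100]")
    limits = {
        f"{prefix}/gpu": gpus,           # number of virtual GPUs
        f"{prefix}/gpumem": mem_mb,      # device memory limit in MB
        f"{prefix}/gpucores": core_pct,  # share of GPU cores, in percent
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {"name": "main", "image": image,
                 "resources": {"limits": limits}}
            ]
        },
    }


if __name__ == "__main__":
    pod = build_shared_gpu_pod("gpu-share-demo", "ubuntu:22.04")
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pod, indent=2))
```

Saving the printed JSON to a file and passing it to kubectl apply -f is one way to try it; the container itself needs no changes to run under the limits.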
Supported devices
NVIDIA
Cambricon
Hygon
Biren
Moore Threads
Ascend
MetaX
Repository structure
The project is split across two repositories:
HAMI: core scheduling logic, monitoring, and the NVIDIA vGPU device‑plugin implementation.
HAMI-Core: core logic for NVIDIA vGPU virtualization.
Directory layout (generated with tree -L 2 cmd):
cmd
├── device-plugin
│   └── nvidia
├── scheduler
│   ├── main.go
│   └── metrics.go
└── vGPUmonitor
    ├── build.sh
    ├── feedback.go
    ├── main.go
    ├── metrics.go
    ├── noderpc
    ├── testcollector
    └── validation.go
Development workflow
HAMI requires a CUDA‑enabled environment; development should be performed on a GPU‑equipped machine.
Update the submodule to fetch the hami-core code (on a fresh clone, add --init so the submodule is initialized first):
$ git submodule update
Disable Docker BuildKit to avoid image‑build issues:
$ export DOCKER_BUILDKIT=0
Build the container image (example tag dev):
$ SHORT_VERSION=dev IMAGE=projecthami/hami bash ./hack/build.sh
After building, replace the image in the HAMI deployment, or run the binary directly with the debug command for rapid iteration.
If your changes live only in hami-core and have not been merged upstream, edit .gitmodules so that the submodule URL points to your local checkout.
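The .gitmodules edit can be scripted. The sketch below rewrites the url entry of one submodule section in the file's text; the submodule name libvgpu, the sample URL and the local path are hypothetical placeholders, so substitute whatever your checkout's .gitmodules actually contains.

```python
import re

# Hypothetical .gitmodules content, used only for illustration.
SAMPLE = (
    '[submodule "libvgpu"]\n'
    "\tpath = libvgpu\n"
    "\turl = https://example.com/hami-core.git\n"
)

HEADER = re.compile(r'^\s*\[submodule\s+"(?P<name>[^"]+)"\]\s*$')


def point_submodule_to_local(gitmodules_text, submodule, local_path):
    """Return the text with the named submodule's url set to local_path."""
    out, in_section = [], False
    for line in gitmodules_text.splitlines():
        match = HEADER.match(line)
        if match:
            # Entering a submodule section; remember whether it is ours.
            in_section = match.group("name") == submodule
        elif line.lstrip().startswith("["):
            in_section = False  # some other kind of section
        elif in_section and re.match(r"^\s*url\s*=", line):
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}url = {local_path}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(point_submodule_to_local(SAMPLE, "libvgpu", "/home/dev/hami-core"))
```

After writing the result back to .gitmodules, running git submodule sync copies the new URL into the repository's local configuration.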
Increase log verbosity for hami-core by setting:
$ export LIBCUDA_LOG_LEVEL=4
Testing
Scheduler testing with kwok
Use the kwok project to simulate many nodes [4]. Steps:
Install kwok per its official documentation.
Create a virtual node using a YAML that includes the annotation hami.io/node-nvidia-register with the desired device description.
When creating a Pod, add a toleration for the kwok.x-k8s.io/node taint so the scheduler can place the Pod on the virtual node.
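The last step can be sketched as follows. The taint key and effect, and the nvidia.com/vgpu resource name, are taken from the example node definition in this guide; the helper name, Pod name and image are hypothetical.

```python
import json


def kwok_pod(name, image, vgpus=1):
    """Pod manifest that tolerates the kwok taint and requests vGPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Without this toleration the NoSchedule taint on the virtual
            # node keeps the Pod from being placed there.
            "tolerations": [
                {
                    "key": "kwok.x-k8s.io/node",
                    "operator": "Exists",
                    "effect": "NoSchedule",
                }
            ],
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    "resources": {"limits": {"nvidia.com/vgpu": vgpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(kwok_pod("kwok-gpu-test", "busybox"), indent=2))
```

Using operator Exists matches the taint regardless of its value, so the same Pod works if the node's taint value is changed from fake to something else.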
Example node definition:
apiVersion: v1
kind: Node
metadata:
  annotations:
    node.alpha.kubernetes.io/ttl: "0"
    kwok.x-k8s.io/node: fake
    hami.io/node-nvidia-register: >-
      GPU-0,10,15360,200,NVIDIA-Tesla
      P4,0,true:
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: kwok-node-0
    kubernetes.io/os: linux
    kubernetes.io/role: agent
    node-role.kubernetes.io/agent: ""
    type: kwok
  name: nvidia-vgpu-node
spec:
  taints:
    - effect: NoSchedule
      key: kwok.x-k8s.io/node
      value: fake
status:
  allocatable:
    cpu: 32
    memory: 256Gi
    pods: 110
    nvidia.com/vgpu: 10
  capacity:
    cpu: 32
    memory: 256Gi
    pods: 110
    nvidia.com/vgpu: 10
  nodeInfo:
    architecture: amd64
    operatingSystem: linux
  phase: Running
Device‑plugin testing with fake‑gpu
Use the fake-gpu project to inject configured devices into the device‑plugin [5]. Steps:
Install fake-gpu according to its documentation.
Edit conf/fake-gpu.yaml to define the devices to be injected.
Combining fake-gpu with kwok enables large‑scale GPU functionality testing.
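For the kwok side of such a large‑scale test, node manifests can be generated instead of written by hand. The sketch below is one way to do it, modelled on the example node definition in this guide. The meaning of each comma‑separated field in hami.io/node-nvidia-register is an assumption inferred from that example value (ID, split count, memory, cores, model, NUMA, health), and the class, function and node names are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeGPU:
    # Field meanings are an assumption inferred from the example annotation.
    uuid: str
    split_count: int
    mem_mb: int
    cores: int
    model: str
    numa: int = 0
    healthy: bool = True

    def encode(self):
        health = "true" if self.healthy else "false"
        return (f"{self.uuid},{self.split_count},{self.mem_mb},"
                f"{self.cores},{self.model},{self.numa},{health}")


def encode_register(gpus):
    """Terminate each device entry with ':' as the example value does."""
    return "".join(gpu.encode() + ":" for gpu in gpus)


def kwok_gpu_node(index, gpus):
    """Build one virtual Node manifest advertising the given devices."""
    name = f"kwok-gpu-node-{index}"
    resources = {
        "cpu": "32",
        "memory": "256Gi",
        "pods": "110",
        "nvidia.com/vgpu": str(sum(g.split_count for g in gpus)),
    }
    return {
        "apiVersion": "v1",
        "kind": "Node",
        "metadata": {
            "name": name,
            "annotations": {
                "kwok.x-k8s.io/node": "fake",
                "hami.io/node-nvidia-register": encode_register(gpus),
            },
            "labels": {"kubernetes.io/hostname": name, "type": "kwok"},
        },
        "spec": {"taints": [{"effect": "NoSchedule",
                             "key": "kwok.x-k8s.io/node", "value": "fake"}]},
        "status": {"allocatable": resources, "capacity": dict(resources),
                   "phase": "Running"},
    }


if __name__ == "__main__":
    gpus = [FakeGPU(f"GPU-{i}", 10, 15360, 200, "NVIDIA-Tesla P4") for i in range(2)]
    nodes = [kwok_gpu_node(i, gpus) for i in range(3)]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": nodes}, indent=2))
```

Raising the range in the last lines scales the simulated cluster; the output is a single List document that can be applied in one step.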
Current status
HAMI runs inside containers; VM support is not yet available.
References:
[1] Supported devices list: https://github.com/Project-HAMi/HAMi/blob/master/README_cn.md#%E6%94%AF%E6%8C%81%E7%9A%84%E8%AE%BE%E5%A4%87
[2] Dynamic MIG slicing: https://github.com/Project-HAMi/HAMi/blob/master/docs/dynamic-mig-support_cn.md
[3] Example YAML: https://github.com/Project-HAMi/HAMi/blob/master/examples/nvidia/dynamic_mig_example.yaml
[4] kwok repository: https://github.com/kubernetes-sigs/kwok
[5] fake-gpu repository: https://github.com/chaunceyjiang/fake-gpu
Infra Learning Club
Infra Learning Club shares study notes, cutting-edge technology, and career discussions.
