How GPU Virtualization Powers Multi‑Tenant Computing and Cloud Graphics

GPU virtualization lets multiple tenants share and isolate GPU resources across graphics rendering, high‑performance computing, and AI workloads. This article covers the software stack layers, user‑space API interception, kernel‑level device emulation, hardware support such as SR‑IOV and MIG, and full GPU passthrough.

Open Source Linux

Dec 18, 2024

1. GPU and Software Architecture

A GPU is a chip that accelerates graphics rendering, a role in which it mainly targets the PC and gaming markets. It is also used for general‑purpose high‑performance computing (GPGPU) and for codec (video encoding and decoding) scenarios.

The diagram below abstracts the GPU subsystem of a software system into several conceptual layers. This classic, non‑virtualized architecture applies to both general‑purpose computing and graphics rendering.

Figure: Typical GPU software architecture (without virtualization)

2. GPU and Virtualization

Virtualization uses software to create an abstraction layer over computer hardware, so that the hardware elements of a single computer (CPU, memory, storage, etc.) can be divided among multiple virtual machines (VMs). GPU virtualization applies the same idea to the GPU: system software, hardware, or both simulate GPU resources so that virtual machines can use them.

3. GPU Virtualization Requirements

The requirements fall into two groups: resource sharing and resource isolation.

Resource sharing: GPUs have become powerful enough that one device can be shared by multiple tenants (containers and VMs). Scenarios include multi‑screen automotive systems, local desktop VMs, remote desktops (desktop virtualization), and cloud GPU VMs.

Resource isolation: Tenants must not affect each other. This covers video‑memory isolation, compute isolation, and fault isolation.

4. GPU Virtualization Technologies

Virtualization technologies are implemented at three levels: user level, kernel level, and hardware level. By application scenario, they are divided into isolation scenarios (containers and VMs) and hardware scenarios (virtual desktops, rendering, AI computing). The technologies can be classified as follows:

User level: API interception and API forwarding

Kernel level: GPU driver interception

Kernel level: GPU driver para‑virtualization

Hardware level: hardware virtualization

Hardware level: SR‑IOV (Single Root I/O Virtualization)

Hardware level: NVIDIA MIG (Multi‑Instance GPU)

5. GPU User‑Level Virtualization

1) Local API interception and API forwarding

Implement a user‑space library, e.g., libwrapper, that exports every API of the underlying library.

Have the application call libwrapper instead of the underlying library. libwrapper opens the real dynamic library with dlopen, intercepts each function call, parses it, and invokes the function with the same name in the real library.

After the call, libwrapper returns the result to the application.
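The three steps above can be sketched in a few lines. This is only an illustrative model, not a real GPU wrapper: Python's standard `math` module stands in for the underlying library, and the `LibWrapper` class and its `call_log` attribute are hypothetical names introduced here. A real libwrapper would be a native shared library that resolves symbols with dlopen and dlsym.

```python
import math


class LibWrapper:
    """Hypothetical stand-in for libwrapper: exposes the wrapped library's API."""

    def __init__(self, real_lib):
        self._real = real_lib   # plays the role of the real underlying library
        self.call_log = []      # record of every intercepted call

    def __getattr__(self, name):
        # Only reached for names not defined on the wrapper itself.
        target = getattr(self._real, name)   # same-named symbol in the real library
        if not callable(target):
            return target                    # constants pass through untouched

        def intercepted(*args, **kwargs):
            self.call_log.append((name, args))   # intercept and inspect the call
            return target(*args, **kwargs)       # forward it and return the result

        return intercepted


lib = LibWrapper(math)        # the application talks to the wrapper only
result = lib.sqrt(16.0)
```

Because every call passes through `intercepted`, this is the point where a wrapper could count calls, limit resources, or redirect the request elsewhere.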

2) Remote API forwarding

libwrapper calls the underlying library on a different machine over the network. libwrapper splits into two parts: a client that forwards calls and a server that receives and executes them.

This enables GPU pooling (multiple GPUs form a pool accessed by multiple clients), allowing machines without GPUs to use GPU functionality.
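A minimal sketch of the client/server split follows. It makes several assumptions: a local `socket.socketpair()` replaces the real network, calls are encoded as one JSON object per line (an invented wire format), and `REAL_LIB`, `vector_add`, `serve`, and `RemoteLib` are hypothetical names. Real remoting layers also have to handle device memory, handles, and asynchronous calls, none of which is modeled here.

```python
import json
import socket
import threading

# Hypothetical underlying library that exists only on the GPU server.
REAL_LIB = {
    "vector_add": lambda a, b: [x + y for x, y in zip(a, b)],
}


def serve(conn):
    """Server half: read one call per line, invoke it, send the result back."""
    reader, writer = conn.makefile("r"), conn.makefile("w")
    for line in reader:
        request = json.loads(line)
        result = REAL_LIB[request["fn"]](*request["args"])
        writer.write(json.dumps({"result": result}) + "\n")
        writer.flush()


class RemoteLib:
    """Client half: offers the same function names and forwards each call."""

    def __init__(self, conn):
        self._reader, self._writer = conn.makefile("r"), conn.makefile("w")

    def __getattr__(self, name):
        def forward(*args):
            self._writer.write(json.dumps({"fn": name, "args": args}) + "\n")
            self._writer.flush()
            return json.loads(self._reader.readline())["result"]

        return forward


# A connected socket pair stands in for the network between the two machines.
client_sock, server_sock = socket.socketpair()
threading.Thread(target=serve, args=(server_sock,), daemon=True).start()

gpu = RemoteLib(client_sock)   # this side has no GPU library of its own
total = gpu.vector_add([1, 2, 3], [10, 20, 30])
```

The caller sees an ordinary function call. Only the transport changes, which is what allows several clients to share one pool of GPU servers.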

3) Para‑virtualized API forwarding

The application and libwrapper run inside a VM. libwrapper reaches the host's underlying library through a para‑virtualized channel (virtio).

The VM kernel implements a virtio frontend. As an optimization, the VM and the host share memory to accelerate data transfer.

The host hypervisor implements a virtio backend.

The host completes the underlying library call.
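The frontend/backend flow with shared memory can be modeled roughly as below. This is a heavily simplified, single‑process sketch: the two deques only imitate the idea of virtio's available and used rings, the descriptor layout is invented, and `host_library_scale` is a hypothetical host library call. It is not the virtio wire format.

```python
import struct
from collections import deque

SHARED = bytearray(4096)          # memory region visible to both guest and host
avail, used = deque(), deque()    # simplified stand-ins for the virtqueue rings


def host_library_scale(values, factor):
    """Hypothetical underlying library call that runs on the host."""
    return [v * factor for v in values]


def frontend_submit(values, factor):
    """Guest side: put the arguments in shared memory and post a descriptor."""
    struct.pack_into(f"<{len(values)}f", SHARED, 0, *values)
    avail.append({"offset": 0, "count": len(values), "factor": factor})


def backend_process():
    """Host side: take descriptors, run the call, write results in place."""
    while avail:
        desc = avail.popleft()
        fmt = f"<{desc['count']}f"
        values = struct.unpack_from(fmt, SHARED, desc["offset"])
        result = host_library_scale(values, desc["factor"])
        struct.pack_into(fmt, SHARED, desc["offset"], *result)
        used.append(desc)


def frontend_collect():
    """Guest side: read the results back out of shared memory."""
    desc = used.popleft()
    return list(struct.unpack_from(f"<{desc['count']}f", SHARED, desc["offset"]))


frontend_submit([1.0, 2.0, 3.0], 2.0)
backend_process()                 # a real backend is woken by a notification
doubled = frontend_collect()
```

Only the small descriptor travels through the queues. The payload stays in the shared buffer, which is the point of the shared‑memory optimization.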

6. GPU Kernel‑Level Virtualization

1) Kernel module interception via a device file

The kernel interception module simulates a device file. It forwards user‑process accesses to the real driver, parses the values the kernel functions return, and passes them back to user space.

Usually the underlying library accesses the GPU driver via a device file, e.g., /dev/realgpu.

Implement a kernel module that creates a simulated device file for user space, e.g., /dev/realgpu.

Bind‑mount the simulated device file into a container, masquerading as a real device file /dev/realgpu.

The application and the underlying library run inside the container and access the masqueraded device file; the kernel module intercepts every access.
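The interception path can be illustrated with a user‑space model. Real interception happens in a kernel module, so this Python sketch only mirrors the control flow. The `RealDriver` and `InterceptDevice` classes, the `"ALLOC"` and `"FREE"` commands, and the per‑container memory quota are all assumptions made for illustration; the quota is one plausible use of interception for video‑memory isolation, not something the steps above prescribe.

```python
class RealDriver:
    """Hypothetical stand-in for the real GPU driver behind /dev/realgpu."""

    def __init__(self):
        self.allocated = 0

    def ioctl(self, cmd, arg):
        if cmd == "ALLOC":
            self.allocated += arg
            return 0             # success
        if cmd == "FREE":
            self.allocated -= arg
            return 0
        return -1                # unknown command


class InterceptDevice:
    """Simulated device file: sees every access before the real driver does."""

    def __init__(self, real, quota_bytes):
        self._real = real
        self._quota = quota_bytes    # assumed per-container memory limit
        self._in_use = 0

    def ioctl(self, cmd, arg):
        if cmd == "ALLOC" and self._in_use + arg > self._quota:
            return -1                # reject: the container is over its quota
        status = self._real.ioctl(cmd, arg)   # forward to the real driver
        if status == 0 and cmd == "ALLOC":
            self._in_use += arg
        elif status == 0 and cmd == "FREE":
            self._in_use -= arg
        return status                # hand the driver's return value back


driver = RealDriver()
container_dev = InterceptDevice(driver, quota_bytes=1024)
first = container_dev.ioctl("ALLOC", 800)
second = container_dev.ioctl("ALLOC", 800)    # would exceed the 1024-byte quota
```

The second request never reaches the real driver, so one container cannot consume memory that belongs to another.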

2) Driver para‑virtualization

User processes access virtualized interfaces provided by the hypervisor to reach the real GPU driver.

The VM's GPU driver implements a para‑virtualization interface and calls the host's actual GPU driver via a hypercall. The hypercall switches execution from the guest to the hypervisor, which uses a driver proxy in the kernel to access the real GPU driver.

Example: automotive GPU virtualization based on a type‑1 hypervisor that supports multiple guests.

7. GPU Hardware‑Level Virtualization

Virtualization requires both software and hardware; hardware support includes:

CPU and memory hardware virtualization support

IOMMU support

DMA remapping and interrupt remapping

Hardware isolation and page‑table mechanisms
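To make DMA remapping and page‑table isolation more concrete, here is a toy model of an IOMMU that keeps one page table per domain. The `Iommu` class, the domain names, and the addresses are invented for illustration. Real IOMMUs use multi‑level tables and identify devices by their bus address rather than by a string.

```python
PAGE_SIZE = 4096


class Iommu:
    """Toy model: one page table per domain (for example, one per VM)."""

    def __init__(self):
        self._tables = {}   # domain -> {device page number: host page number}

    def map(self, domain, device_addr, host_addr):
        table = self._tables.setdefault(domain, {})
        table[device_addr // PAGE_SIZE] = host_addr // PAGE_SIZE

    def translate(self, domain, device_addr):
        """Translate a DMA address, or fault if the domain has no mapping."""
        page, offset = divmod(device_addr, PAGE_SIZE)
        table = self._tables.get(domain, {})
        if page not in table:
            raise PermissionError(
                f"DMA fault: domain {domain!r}, address {device_addr:#x}"
            )
        return table[page] * PAGE_SIZE + offset


iommu = Iommu()
iommu.map("vm-a", device_addr=0x1000, host_addr=0x80000)
iommu.map("vm-b", device_addr=0x1000, host_addr=0x90000)

addr_a = iommu.translate("vm-a", 0x1010)   # lands in vm-a's host page
addr_b = iommu.translate("vm-b", 0x1010)   # same device address, different page
```

The same device‑visible address resolves to different host memory for each domain, and an unmapped address faults instead of reaching memory that belongs to another tenant.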

8. GPU Full Virtualization

This scheme passes the entire GPU through to a single VM, so the device cannot be shared with other tenants.

The VM's GPU driver needs no modification; it accesses the real hardware directly.

Full GPU passthrough to the VM yields minimal performance loss.

Because resource sharing is impossible, it is generally not considered GPU virtualization.

Source: Architecture Technologists Alliance
