How SR‑IOV Powers High‑Performance PCIe Passthrough in KVM/QEMU
This article explains the background, hardware and software principles of SR‑IOV, the roles of Physical and Virtual Functions, IOMMU address and interrupt remapping, VFIO interfaces, and the complete QEMU/KVM PCI‑passthrough workflow that enables data‑plane acceleration for virtual machines.
Background
SR‑IOV (Single Root I/O Virtualization) is defined by the PCI‑SIG in the Single Root I/O Virtualization and Sharing Specification (revision 1.1 being the latest standalone release). It lets a single physical PCIe device expose multiple Virtual Functions (VFs), so that each virtual machine (VM) can be given a function with its own memory space, interrupts, and DMA streams.
SR‑IOV Principle
Hardware Implementation
SR‑IOV introduces two function types: Physical Function (PF) and Virtual Function (VF). A PF is a full‑featured PCIe function that carries the SR‑IOV configuration structures and can be discovered and controlled like any other PCIe device. A VF is a lightweight function that shares physical resources with its PF and has its own, reduced configuration space.
VF BARs are described by the VF BAR registers in the PF's SR‑IOV Extended Capability rather than in each VF's own configuration space, and they must be memory BARs because VFs do not support I/O space. The same capability defines the fields that configure the number and placement of VFs: SR‑IOV Control (bit 0, VF Enable, turns the VFs on), TotalVFs, NumVFs, First VF Offset, and VF Stride.
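To make First VF Offset and VF Stride concrete, the sketch below computes where VFs appear on the bus. It assumes the usual rule that VF n (counting from 1) has routing ID = PF routing ID + First VF Offset + (n - 1) x VF Stride, with 16-bit wraparound. The function name and the sample values (PF at 06:00.0, offset 0x80, stride 2) are illustrative assumptions, not taken from any particular device.

```python
def vf_routing_ids(pf_bus, pf_dev, pf_fn, first_vf_offset, vf_stride, num_vfs):
    """Return bus:dev.fn strings for VF 1..num_vfs of the given PF."""
    # A routing ID packs bus (8 bits), device (5 bits), function (3 bits).
    pf_rid = (pf_bus << 8) | (pf_dev << 3) | pf_fn
    result = []
    for n in range(1, num_vfs + 1):
        # Routing ID arithmetic is 16-bit and ignores any carry.
        rid = (pf_rid + first_vf_offset + (n - 1) * vf_stride) & 0xFFFF
        bus, dev, fn = rid >> 8, (rid >> 3) & 0x1F, rid & 0x7
        result.append(f"{bus:02x}:{dev:02x}.{fn}")
    return result


if __name__ == "__main__":
    # Hypothetical PF at 06:00.0 with First VF Offset 0x80 and VF Stride 2.
    print(vf_routing_ids(0x06, 0, 0, 0x80, 2, 3))
```

With these sample values the VFs land at device 0x10 on the PF's bus, on every second function number.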
Software Support
In Linux, the PCI core code in drivers/pci/iov.c provides the interfaces used to configure the SR‑IOV Extended Capability. An SR‑IOV device needs both a PF driver and a VF driver: the PF driver configures SR‑IOV, while the VF driver implements the device’s functional behavior (e.g., NIC or GPU). For passthrough, VFIO’s vfio‑pci driver serves as a minimal VFIO‑compliant PCIe driver that can be bound to a VF on the host.
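On the host, the PF driver's SR‑IOV support is usually exercised through the sriov_totalvfs and sriov_numvfs attributes that the PCI core exposes in sysfs. The following is a minimal sketch of that step, not the article's own tooling: the function name and the sysfs_root parameter are assumptions added so the logic can be tried against a scratch directory, and running it against the real /sys needs root and a PF driver with SR‑IOV support.

```python
from pathlib import Path


def enable_vfs(pf_bdf, num_vfs, sysfs_root="/sys"):
    """Ask the PF driver to create num_vfs VFs; return the device's TotalVFs."""
    dev = Path(sysfs_root) / "bus/pci/devices" / pf_bdf
    total = int((dev / "sriov_totalvfs").read_text())
    if not 0 <= num_vfs <= total:
        raise ValueError(f"{pf_bdf} supports at most {total} VFs")

    numvfs = dev / "sriov_numvfs"
    current = int(numvfs.read_text())
    if current == num_vfs:
        return total
    # The kernel refuses to go from one non-zero count to another,
    # so disable the existing VFs first.
    if current != 0 and num_vfs != 0:
        numvfs.write_text("0")
    numvfs.write_text(str(num_vfs))
    return total
```

A call such as enable_vfs("0000:06:00.0", 4) would correspond to writing 4 into the PF's sriov_numvfs file; the address is a placeholder.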
IO Virtualization Based on SR‑IOV
QEMU/KVM PCI Passthrough Framework
The virtualization stack consists of the following layers (from the hardware up to the guest):
PCIe Device (supports SR‑IOV)
IOMMU
VFIO
Hypervisor (QEMU/KVM)
VF Driver running inside the GuestOS
This stack enables a physical PCIe device to be directly accessed by a VM.
IOMMU Address Remapping
The IOMMU provides address remapping and interrupt remapping. Address remapping isolates DMA address spaces so that a device can only access the memory regions assigned to it. It also translates Guest Physical Addresses (GPA) to Host Physical Addresses (HPA) during DMA operations.
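The two jobs of address remapping, isolation and GPA to HPA translation, can be illustrated with a toy model. This is only a sketch: a real IOMMU walks multi-level page tables in hardware, whereas the class below (ToyIommu, a made-up name) keeps one flat dictionary per device and assumes page-aligned mappings of 4 KiB pages.

```python
PAGE = 4096  # assumed page size for the toy model


class ToyIommu:
    """Per-device GPA -> HPA page maps; unmapped accesses fault."""

    def __init__(self):
        self.domains = {}  # device name -> {GPA page frame: HPA page frame}

    def map(self, device, gpa, hpa, size):
        # gpa, hpa and size are assumed to be multiples of PAGE.
        table = self.domains.setdefault(device, {})
        for off in range(0, size, PAGE):
            table[(gpa + off) // PAGE] = (hpa + off) // PAGE

    def translate(self, device, gpa):
        frame = self.domains.get(device, {}).get(gpa // PAGE)
        if frame is None:
            # Isolation: a device cannot reach memory it was not given.
            raise PermissionError(f"DMA fault: {device} has no mapping for {gpa:#x}")
        return frame * PAGE + gpa % PAGE


if __name__ == "__main__":
    iommu = ToyIommu()
    iommu.map("vf0", gpa=0x1000, hpa=0x7F000, size=2 * PAGE)
    print(hex(iommu.translate("vf0", 0x1234)))
```

A DMA from "vf0" to a mapped GPA resolves to host memory, while the same GPA issued by a device with no mapping raises a fault instead of touching another VM's pages.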
Interrupt remapping intercepts PCIe device interrupts, checks whether they are remappable, and uses an Interrupt Remapping Table Entry (IRTE) to route the interrupt to the correct CPU.
VFIO Interface
VFIO exposes PCIe devices to user space via:
Container file descriptor (opened from /dev/vfio/vfio)
IOMMU group file descriptor (opened from /dev/vfio/N, where N is the IOMMU group number)
Device file descriptor obtained through the VFIO_GROUP_GET_DEVICE_FD ioctl on the group descriptor
The vfio_iommu_type1 driver implements IOMMU remapping for VFIO, and vfio-pci binds PCI devices to the VFIO framework.
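The order in which the three descriptors are obtained can be sketched as follows. The ioctl numbers are derived the way <linux/vfio.h> defines them (type ';', base 100); the function name, the group number, and the device address are hypothetical, and the ioctl and opener parameters are an assumption added so the sequence can be exercised without VFIO hardware. Error handling is reduced to the two checks that matter for the flow.

```python
import fcntl
import os
import struct

VFIO_TYPE = ord(";")  # ioctl type used by <linux/vfio.h>
VFIO_BASE = 100


def _io(nr):
    # _IO(type, nr) has no direction or size bits: (type << 8) | nr.
    return (VFIO_TYPE << 8) | (VFIO_BASE + nr)


VFIO_GET_API_VERSION = _io(0)
VFIO_SET_IOMMU = _io(2)
VFIO_GROUP_GET_STATUS = _io(3)
VFIO_GROUP_SET_CONTAINER = _io(4)
VFIO_GROUP_GET_DEVICE_FD = _io(6)

VFIO_API_VERSION = 0
VFIO_TYPE1_IOMMU = 1
VFIO_GROUP_FLAGS_VIABLE = 1 << 0


def open_vfio_device(group_id, bdf, ioctl=fcntl.ioctl, opener=os.open):
    """Return (container_fd, group_fd, device_fd) for a device bound to vfio-pci."""
    container = opener("/dev/vfio/vfio", os.O_RDWR)
    if ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION:
        raise RuntimeError("unexpected VFIO API version")

    group = opener(f"/dev/vfio/{group_id}", os.O_RDWR)
    # struct vfio_group_status { __u32 argsz; __u32 flags; }
    status = bytearray(struct.pack("II", 8, 0))
    ioctl(group, VFIO_GROUP_GET_STATUS, status)
    _, flags = struct.unpack("II", status)
    if not flags & VFIO_GROUP_FLAGS_VIABLE:
        raise RuntimeError("group not viable: bind all its devices to vfio-pci")

    # Attach the group to the container, then pick the type1 IOMMU backend.
    ioctl(group, VFIO_GROUP_SET_CONTAINER, struct.pack("i", container))
    ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU)

    # A mutable buffer makes fcntl.ioctl return the syscall's result,
    # which here is the new device file descriptor.
    name = bytearray(bdf.encode() + b"\0")
    device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, name)
    return container, group, device
```

On a real host this would be called as, for example, open_vfio_device(26, "0000:06:10.0") with root privileges, after the device has been bound to vfio-pci.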
QEMU/KVM PCI Passthrough Process
QEMU performs two main tasks:
Read PCIe device information via VFIO and obtain configuration and DMA details.
Create a virtual PCIe device for the VM, mapping the physical device’s registers and DMA resources into the guest.
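From the user's side, both tasks are triggered by attaching a vfio-pci device on the QEMU command line. The helper below is a hedged sketch that only assembles such a command line; the function name, the disk image, the memory size, and the VF address are placeholders, and a real invocation typically needs more options (CPU model, networking, display).

```python
import shlex


def qemu_passthrough_args(vf_bdf, memory_mb=4096, disk="guest.qcow2"):
    """Build a QEMU argv that passes the host PCI function vf_bdf to the guest."""
    return [
        "qemu-system-x86_64",
        "-enable-kvm",                        # use KVM for CPU virtualization
        "-m", str(memory_mb),                 # guest RAM in MiB
        "-drive", f"file={disk},if=virtio",   # placeholder guest disk
        "-device", f"vfio-pci,host={vf_bdf}", # the passed-through VF
    ]


if __name__ == "__main__":
    print(shlex.join(qemu_passthrough_args("0000:06:10.0")))
```

The host= value is the VF's address on the host; the guest sees the device at whatever address QEMU assigns on its virtual PCI bus.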
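During VM creation, QEMU registers guest memory with KVM so that EPT maps GPA→HPA for CPU accesses, while the guest's own page tables handle GVA→GPA. QEMU also passes the HVA and GPA of each guest memory region to the IOMMU through VFIO, so that DMA issued by the device with guest addresses is translated to the host's physical memory. This is what achieves data‑plane acceleration.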
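With the type1 backend, the HVA and GPA information is handed over in a VFIO_IOMMU_MAP_DMA request, where the iova field carries the GPA and the vaddr field carries the HVA. The sketch below only builds the request structure as laid out in <linux/vfio.h>; the function name and the sample addresses are assumptions, and actually issuing the ioctl requires a configured VFIO container.

```python
import struct

VFIO_IOMMU_MAP_DMA = (ord(";") << 8) | (100 + 13)  # _IO(VFIO_TYPE, VFIO_BASE + 13)
VFIO_DMA_MAP_FLAG_READ = 1 << 0   # device may read from this memory
VFIO_DMA_MAP_FLAG_WRITE = 1 << 1  # device may write to this memory

# struct vfio_iommu_type1_dma_map { __u32 argsz, flags; __u64 vaddr, iova, size; }
_DMA_MAP = struct.Struct("IIQQQ")


def dma_map_request(hva, gpa, size):
    """Pack a map request: DMA to [gpa, gpa+size) hits the pages behind hva."""
    flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE
    return _DMA_MAP.pack(_DMA_MAP.size, flags, hva, gpa, size)


if __name__ == "__main__":
    # Hypothetical 1 GiB of guest RAM at GPA 0, backed by a QEMU mapping.
    req = dma_map_request(hva=0x7F0000000000, gpa=0x0, size=1 << 30)
    print(len(req), hex(VFIO_IOMMU_MAP_DMA))
    # On a real host: fcntl.ioctl(container_fd, VFIO_IOMMU_MAP_DMA, req)
```

The kernel pins the pages behind the HVA and installs IOMMU entries from the GPA to the resulting HPAs, which is why the guest driver can hand GPAs straight to the device.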
Data‑Plane Acceleration (GPA→HPA Mapping)
When a guest driver programs a DMA transfer using a GPA, the device issues its request with that address and the IOMMU translates it to the corresponding HPA. QEMU is not involved in this path: the mappings were installed through VFIO ahead of time, so the device’s DMA engine reaches host memory without host mediation, which provides near‑native performance.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)