Understanding ARMv8 Virtualization Architecture: Exception Levels, Stage‑2 Translation, and Hypervisor Features
This article explains the ARMv8 virtualization architecture, covering its core characteristics, exception levels, Stage‑2 address translation, MMIO emulation, SMMU handling, trap‑and‑emulate mechanisms, virtual interrupts, generic timer virtualization, host extensions, nested virtualization, and the associated performance overheads.
1. Virtualization Characteristics
Virtualization provides isolation, high availability, workload balancing, and sandboxing for embedded smart‑cabin systems. Isolation lets a safety‑critical RTOS and a RichOS (e.g., Android) run side‑by‑side on a single SoC, while high‑availability enables seamless VM migration and workload balancing maximizes physical CPU utilization.
2. Exception Levels (EL)
ARMv8 defines four exception levels: EL0 (unprivileged applications), EL1 (privileged OS kernel), EL2 (hypervisor), and EL3 (secure monitor). In a typical setup, user applications run at EL0, the kernel at EL1, and the hypervisor at EL2. KVM without the Virtualization Host Extensions is split across two levels: a small part runs at EL2, while the rest runs at EL1 as part of the host kernel.
3. Stage‑2 Translation
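KVM uses Stage‑2 translation to enforce memory isolation between VMs. The guest OS controls Stage‑1 translation (VA → IPA), and the hypervisor controls Stage‑2 translation (IPA → PA), so a guest can only reach physical memory that the hypervisor has mapped for it. Each VM is assigned a VMID, and TLB entries are tagged with it, so entries belonging to several VMs can coexist in the TLB. Exceptions are never taken to EL0; they are always handled at EL1 or above.
In the split (non‑VHE) design, a round trip from a VM to the host kernel and back crosses exception levels four times (VM → EL2 → host EL1 → EL2 → VM), which is costly and a common target for optimization.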
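The two translation stages described in this section can be sketched as a pair of table lookups. This is a toy model, not how KVM walks page tables: the flat dictionaries, the 4 KiB page size, and every name and address below are assumptions made for illustration.

```python
PAGE_SHIFT = 12                       # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class TranslationFault(Exception):
    """Raised when a page has no mapping at the given stage."""


def translate(table, addr, stage):
    """Look up one address in a flat {page_number: page_number} table."""
    page, offset = addr >> PAGE_SHIFT, addr & PAGE_MASK
    if page not in table:
        raise TranslationFault(f"stage {stage} fault at {addr:#x}")
    return (table[page] << PAGE_SHIFT) | offset


def guest_va_to_pa(stage1, stage2, va):
    """Apply the guest's Stage-1 table, then the hypervisor's Stage-2 table."""
    ipa = translate(stage1, va, stage=1)     # guest-controlled: VA -> IPA
    return translate(stage2, ipa, stage=2)   # hypervisor-controlled: IPA -> PA


# Hypothetical VM: the guest maps VA page 0x400 to IPA page 0x80, and the
# hypervisor backs IPA page 0x80 with physical page 0x9A000.
stage1 = {0x400: 0x80}
stage2 = {0x80: 0x9A000}
pa = guest_va_to_pa(stage1, stage2, 0x400123)   # 0x9A000123
```

Because the guest can only edit the first table, an IPA that is missing from the second table always ends in a Stage‑2 fault that the hypervisor handles.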
4. MMIO Emulation
A VM's IPA space contains both memory and peripheral regions. For passthrough devices, the hypervisor maps the physical device address directly into the IPA space. For virtual devices, the Stage‑2 entries are marked as faulting, so every access triggers a Stage‑2 fault that is taken to EL2. The hypervisor then emulates the device in the exception handler: ESR_EL2 describes the access (read or write, size, and the register involved), and HPFAR_EL2 reports the IPA that faulted.
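As a rough sketch of the decoding step, the function below pulls the fields an emulator needs out of ESR_EL2 and HPFAR_EL2. The bit positions follow the architecture's register layouts as I understand them; the function name, the returned dictionary, the use of FAR_EL2 for the page offset, and the 48‑bit IPA assumption are mine and are not taken from the article.

```python
ESR_EC_SHIFT = 26
ESR_EC_MASK = 0x3F
ESR_EC_DABT_LOWER = 0x24      # Data Abort taken from a lower Exception level
ESR_ISS_ISV = 1 << 24         # remaining syndrome fields are valid
ESR_ISS_SAS_SHIFT = 22        # access size, log2 of the byte count
ESR_ISS_SRT_SHIFT = 16        # general-purpose register used by the access
ESR_ISS_WNR = 1 << 6          # 1 = write, 0 = read


def decode_mmio_fault(esr, hpfar, far):
    """Return the emulation-relevant fields of a Stage-2 data abort.

    Returns None if the exception is not a data abort from the guest or
    if the hardware did not provide a valid instruction syndrome.
    """
    ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK
    if ec != ESR_EC_DABT_LOWER or not (esr & ESR_ISS_ISV):
        return None
    # HPFAR_EL2 bits [39:4] hold IPA bits [47:12] (assumes a 48-bit IPA).
    ipa_page = (hpfar >> 4) & ((1 << 36) - 1)
    # The offset within the page comes from the faulting virtual address.
    ipa = (ipa_page << 12) | (far & 0xFFF)
    return {
        "ipa": ipa,
        "is_write": bool(esr & ESR_ISS_WNR),
        "size": 1 << ((esr >> ESR_ISS_SAS_SHIFT) & 0x3),
        "reg": (esr >> ESR_ISS_SRT_SHIFT) & 0x1F,
    }
```

A real handler would also check the fault status code in the low bits of ESR_EL2 and advance the guest's return address after emulating the access; both steps are left out here.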
5. System Memory Management Units (SMMU)
An SMMU translates the addresses of device‑initiated DMA accesses (VA → PA), much as an MMU does for CPU accesses. Like the CPU's MMU, the SMMU supports two stages of translation, so a device assigned to a VM can be subject to nested translation when both stages are enabled.
6. Trapping and Emulation
Hypervisors trap sensitive instructions to enforce isolation. For example, configuring HCR_EL2.TWI makes a WFI executed at EL0/EL1 trap to EL2, where the hypervisor can schedule another vCPU. Traps also allow presenting virtual register values (e.g., ID_AA64MMFR0_EL1) to the guest.
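A minimal sketch of the WFI case follows. The HCR_EL2.TWI bit position and the WFx exception class are architectural values as I understand them; the scheduler is deliberately trivial, and its names and data structures are hypothetical.

```python
HCR_TWI = 1 << 13          # trap WFI executed at EL0/EL1 to EL2
ESR_EC_SHIFT = 26
ESR_EC_WFX = 0x01          # exception class for a trapped WFI/WFE


def enable_wfi_trap(hcr):
    """Return an HCR_EL2 value with WFI trapping switched on."""
    return hcr | HCR_TWI


def on_trap(esr, current_vcpu, runnable):
    """Pick the vCPU to run after a trap (hypothetical scheduler policy).

    If the guest executed WFI and another vCPU is waiting, run that one;
    otherwise resume the vCPU that trapped.
    """
    ec = (esr >> ESR_EC_SHIFT) & 0x3F
    is_wfi = ec == ESR_EC_WFX and (esr & 0x3) == 0   # low ISS bits 0b00 = WFI
    if is_wfi and runnable:
        return runnable.pop(0)
    return current_vcpu
```

Without the trap, an idle vCPU would simply hold the physical core in a low‑power wait while other vCPUs have work to do.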
6.1 Virtual Register Values
By enabling a trap on a register read, the hypervisor intercepts the access, substitutes a virtual value, and returns it to the guest.
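The sketch below shows one way the intercept‑and‑substitute step could look for a trapped system register read. The syndrome field positions and the ID_AA64MMFR0_EL1 encoding reflect my reading of the architecture; the `virtual_values` table, the register list, and the function names are assumptions for illustration only.

```python
ESR_EC_SYS64 = 0x18    # exception class for a trapped MSR/MRS (AArch64)

# (op0, op1, CRn, CRm, op2) encoding of ID_AA64MMFR0_EL1
ID_AA64MMFR0_EL1 = (3, 0, 0, 7, 0)


def decode_sysreg_iss(esr):
    """Split the syndrome of a trapped system register access into fields."""
    iss = esr & 0x1FFFFFF
    return {
        "op0": (iss >> 20) & 0x3,
        "op2": (iss >> 17) & 0x7,
        "op1": (iss >> 14) & 0x7,
        "crn": (iss >> 10) & 0xF,
        "rt": (iss >> 5) & 0x1F,      # guest register that receives the value
        "crm": (iss >> 1) & 0xF,
        "is_read": bool(iss & 1),     # 1 = MRS (read), 0 = MSR (write)
    }


def emulate_register_read(esr, guest_regs, virtual_values):
    """Write the virtual value into the guest's destination register.

    Returns True if the access was emulated. guest_regs is a list holding
    x0..x30; virtual_values maps register encodings to values chosen by
    the hypervisor (both hypothetical structures).
    """
    if (esr >> 26) & 0x3F != ESR_EC_SYS64:
        return False
    f = decode_sysreg_iss(esr)
    key = (f["op0"], f["op1"], f["crn"], f["crm"], f["op2"])
    if not f["is_read"] or key not in virtual_values:
        return False
    if f["rt"] != 31:                 # register 31 is the zero register
        guest_regs[f["rt"]] = virtual_values[key]
    return True
```

After a successful emulation the hypervisor would step the guest's return address past the trapped instruction before resuming it.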
6.2 Avoiding Traps
Frequent traps incur high overhead; for rarely accessed registers (e.g., feature registers) the cost is acceptable, but performance‑critical paths should minimize trapping.
7. Virtual Interrupts
ARMv8 provides virtual IRQs, FIQs, and SErrors. A hypervisor routes physical interrupts to EL2 (via HCR_EL2.IMO/FMO) and then injects virtual interrupts into the appropriate vCPU. There are two ways to inject them: setting the pending bits in HCR_EL2 (VI, VF, VSE) to raise a virtual interrupt directly, or using the GICv2/v3 virtual CPU interface, which lets the guest acknowledge and complete interrupts without trapping to the hypervisor.
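The first method amounts to setting and clearing a few HCR_EL2 bits. The positions below are the architectural ones as I understand them; the helper names are hypothetical.

```python
HCR_FMO = 1 << 3    # route physical FIQs to EL2
HCR_IMO = 1 << 4    # route physical IRQs to EL2
HCR_VF = 1 << 6     # virtual FIQ pending
HCR_VI = 1 << 7     # virtual IRQ pending


def route_physical_interrupts_to_el2(hcr):
    """Make physical IRQs and FIQs arrive at the hypervisor."""
    return hcr | HCR_IMO | HCR_FMO


def set_virtual_irq(hcr, pending):
    """Assert or deassert the virtual IRQ seen by the current vCPU."""
    return hcr | HCR_VI if pending else hcr & ~HCR_VI
```

With this method the hypervisor must also emulate the interrupt controller that the guest talks to, which is why the GIC virtual CPU interface is usually preferred.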
7.1 Example of Interrupt Forwarding
A physical device raises an interrupt, GIC forwards it to EL2, the hypervisor identifies the target VM, configures the GIC to deliver a virtual interrupt to the vCPU, and finally returns control to the vCPU.
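The forwarding sequence above can be modelled in a few lines. With GICv3, "configuring the GIC" means writing a list register (ICH_LR<n>_EL2); the field positions used here reflect my understanding of that register and should be checked against the GIC specification. The routing table, default priority, and function names are hypothetical.

```python
LR_STATE_PENDING = 1 << 62   # State field [63:62] = 0b01 (pending)
LR_HW = 1 << 61              # virtual interrupt is backed by a physical one
LR_GROUP1 = 1 << 60
LR_PRIORITY_SHIFT = 48
LR_PINTID_SHIFT = 32


def make_list_register(vintid, pintid=None, priority=0xA0):
    """Encode a pending Group 1 virtual interrupt as a list register value."""
    lr = LR_STATE_PENDING | LR_GROUP1
    lr |= (priority & 0xFF) << LR_PRIORITY_SHIFT
    lr |= vintid & 0xFFFFFFFF
    if pintid is not None:
        # Link to the physical interrupt so that the guest's deactivation
        # also deactivates it.
        lr |= LR_HW | ((pintid & 0x1FFF) << LR_PINTID_SHIFT)
    return lr


def forward_interrupt(pintid, routing, list_regs):
    """Find the owning vCPU and queue a virtual interrupt for it.

    routing maps a physical INTID to (vcpu_name, virtual INTID);
    list_regs maps a vcpu_name to its queued list register values.
    """
    vcpu, vintid = routing[pintid]
    list_regs[vcpu].append(make_list_register(vintid, pintid))
    return vcpu
```

In this model, resuming the returned vCPU corresponds to the final step, where control goes back to the guest and the virtual interrupt is taken at EL1.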
8. Generic Timer Virtualization
ARM provides a set of generic timers. Two of them are visible to a VM: the EL1 physical timer, which follows the physical count (wall‑clock time), and the EL1 virtual timer, which follows the virtual count. The virtual count is the physical count minus an offset held in CNTVOFF_EL2, which only the hypervisor can set. By increasing the offset, the hypervisor can hide the time during which a vCPU was not scheduled.
The physical timer suits I/O‑heavy embedded workloads, because time spent waiting on devices is still accounted for, while the virtual timer is better for compute‑bound workloads.
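The offset arithmetic is simple enough to show directly. The relationship between the two counts is architectural; the function names and tick values below are made up for the example.

```python
MASK64 = (1 << 64) - 1    # the counters are 64 bits wide


def virtual_count(physical_count, cntvoff):
    """Virtual count seen by the guest: physical count minus CNTVOFF_EL2."""
    return (physical_count - cntvoff) & MASK64


def offset_after_pause(cntvoff, descheduled_at, rescheduled_at):
    """Grow the offset by the time the vCPU spent descheduled."""
    return (cntvoff + (rescheduled_at - descheduled_at)) & MASK64


cntvoff = 0
before = virtual_count(1_000, cntvoff)               # vCPU stops at tick 1000
cntvoff = offset_after_pause(cntvoff, 1_000, 1_600)  # resumes at tick 1600
after = virtual_count(1_600, cntvoff)                # same value as `before`
```

From the guest's point of view the virtual count never jumped, even though 600 physical ticks passed while it was off the CPU.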
9. Virtualization Host Extensions (VHE)
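VHE, introduced in ARMv8.1‑A, allows the host OS kernel to run directly at EL2, eliminating the extra EL1↔EL2 context switch. It is controlled by two bits: HCR_EL2.E2H enables VHE, and HCR_EL2.TGE selects whether the code running at EL0 belongs to the host (TGE = 1) or to a guest (TGE = 0).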
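A small sketch of how the two bits combine. The bit positions are the architectural ones as I understand them; the labels and helper names are my own shorthand.

```python
HCR_TGE = 1 << 27
HCR_E2H = 1 << 34


def vhe_context(hcr):
    """Classify an HCR_EL2 value by which software the core is set up for."""
    if not hcr & HCR_E2H:
        return "non-VHE"
    return "host" if hcr & HCR_TGE else "guest"


def enter_guest(hcr):
    """Clear TGE before running a guest at EL1/EL0."""
    return hcr & ~HCR_TGE


def return_to_host(hcr):
    """Set TGE so that EL0 again belongs to the host."""
    return hcr | HCR_TGE
```

On a VHE host, switching between host and guest therefore includes a TGE change, while E2H stays set the whole time.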
VHE raises security concerns because the host OS now runs at a higher privilege level, potentially exposing guest memory.
10. Nested Virtualization
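ARMv8.3‑A introduces support for running a guest hypervisor at EL1, which reduces the overhead of nesting compared with the older approach of running it at EL0 and emulating its privileged operations in software. The HCR_EL2.NV and NV1 control bits let the host hypervisor trap the guest hypervisor's EL2 operations. ARMv8.4‑A adds HCR_EL2.NV2 and VNCR_EL2, which redirect many of the guest hypervisor's EL2 register accesses to a memory structure instead of trapping them, minimizing traps.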
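The effect of redirecting register accesses to memory can be illustrated with a toy model that only counts traps. Nothing here reflects the real VNCR_EL2 page layout: the class, the dictionary standing in for memory, and the register names used as keys are assumptions, and in practice not every EL2 register is redirected.

```python
class GuestHypervisorModel:
    """Counts traps taken while a guest hypervisor writes EL2 registers."""

    def __init__(self, nv2_enabled):
        self.nv2_enabled = nv2_enabled
        self.state = {}    # stands in for the host's copy or the memory page
        self.traps = 0

    def write_el2_register(self, name, value):
        if not self.nv2_enabled:
            # Without the redirect, each access traps to the host
            # hypervisor, which emulates it.
            self.traps += 1
        # With the redirect, the access becomes a plain memory write.
        self.state[name] = value


def traps_for_switch(nv2_enabled, registers):
    """Trap count for a guest hypervisor programming a list of registers."""
    model = GuestHypervisorModel(nv2_enabled)
    for name in registers:
        model.write_el2_register(name, 0)
    return model.traps
```

A guest hypervisor typically touches many EL2 registers on each switch between its own guests, so saving one trap per register adds up quickly.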
11. Virtualization Overhead
The overhead of switching between a VM and the hypervisor includes saving and restoring 31 general‑purpose registers, 32 SIMD registers, two stack pointers, and additional system register state. Because LDP/STP move two registers at a time, about 33 load or store instructions are needed in each direction for these registers alone. The exact cost depends on the platform and on the hypervisor design.
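One way to arrive at the figure of 33 is to count register pairs. The grouping below (general‑purpose registers and stack pointers paired together as 64‑bit values, SIMD registers paired as 128‑bit values) is my assumption about how the number is derived.

```python
import math


def pair_instructions(register_count):
    """LDP/STP instructions needed to move this many registers."""
    return math.ceil(register_count / 2)


GPRS, STACK_POINTERS, SIMD_REGS = 31, 2, 32

instructions = (pair_instructions(GPRS + STACK_POINTERS)   # 17
                + pair_instructions(SIMD_REGS))            # 16
bytes_moved = (GPRS + STACK_POINTERS) * 8 + SIMD_REGS * 16  # 264 + 512
```

That gives 33 instructions and 776 bytes per save or restore, before any system registers are counted.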
Reference: Armv8‑A Virtualization, Arm Architecture Reference Manual.
