Analyzing KVM Mode Switching: How the Initialization Infrastructure Works
The article provides a detailed technical analysis of Linux 5.9 on arm64 KVM, explaining how Host and Guest OSes are distinguished at EL1, how HCR_EL2 flags control mode switching, and how the EL2 exception vector and related registers are initialized during Linux boot to enable KVM virtualization.
1. What is mode switching
In non-VHE KVM on arm64 (a type-2 hypervisor), the Host OS and the Guest OS both run at EL1. What distinguishes them is whether the EL2 virtualization controls (stage-2 translation, instruction traps, interrupt routing) are active. These controls live in hypervisor-owned registers such as HCR_EL2 and VTCR_EL2, and in the GIC virtual interface register GICH_HCR. Writing HCR_HOST_NVHE_FLAGS to HCR_EL2 selects Host OS mode, while writing HCR_GUEST_FLAGS selects Guest OS mode.
#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
                         HCR_BSU_IS | HCR_FB | HCR_TAC | \
                         HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
                         HCR_FMO | HCR_IMO | HCR_PTW )
#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK)
These flag sets determine whether EL1 executes as a non-virtualized Host OS or as a virtualized Guest OS.
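To make the difference between the two flag sets concrete, the following sketch rebuilds both masks from individual HCR_EL2 bit positions and checks which mode a given value selects. The bit positions follow the architectural HCR_EL2 layout as I understand it; the helper names hcr_is_guest and hcr_selfcheck are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* HCR_EL2 bit positions (architectural layout; only the bits used here). */
#define HCR_VM     (UINT64_C(1) << 0)   /* enable stage-2 translation */
#define HCR_SWIO   (UINT64_C(1) << 1)
#define HCR_PTW    (UINT64_C(1) << 2)
#define HCR_FMO    (UINT64_C(1) << 3)   /* route physical FIQs to EL2 */
#define HCR_IMO    (UINT64_C(1) << 4)   /* route physical IRQs to EL2 */
#define HCR_AMO    (UINT64_C(1) << 5)   /* route SErrors to EL2 */
#define HCR_FB     (UINT64_C(1) << 9)
#define HCR_BSU_IS (UINT64_C(1) << 10)
#define HCR_TWI    (UINT64_C(1) << 13)  /* trap WFI */
#define HCR_TWE    (UINT64_C(1) << 14)  /* trap WFE */
#define HCR_TSC    (UINT64_C(1) << 19)  /* trap SMC */
#define HCR_TIDCP  (UINT64_C(1) << 20)
#define HCR_TAC    (UINT64_C(1) << 21)
#define HCR_TSW    (UINT64_C(1) << 22)
#define HCR_RW     (UINT64_C(1) << 31)  /* EL1 is AArch64 */
#define HCR_TLOR   (UINT64_C(1) << 35)
#define HCR_APK    (UINT64_C(1) << 40)  /* do not trap pointer-auth keys */
#define HCR_API    (UINT64_C(1) << 41)  /* do not trap pointer-auth insns */

#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
                         HCR_BSU_IS | HCR_FB | HCR_TAC | \
                         HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
                         HCR_FMO | HCR_IMO | HCR_PTW)
#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK)

/* Hypothetical helper: a value selects Guest mode when stage-2 is enabled. */
int hcr_is_guest(uint64_t hcr)
{
    return (hcr & HCR_VM) != 0;
}

/* Hypothetical self-check: the only bit the two sets share is HCR_RW. */
void hcr_selfcheck(void)
{
    assert((HCR_GUEST_FLAGS & HCR_HOST_NVHE_FLAGS) == HCR_RW);
}
```

Calling hcr_is_guest(HCR_GUEST_FLAGS) yields a non-zero result and hcr_is_guest(HCR_HOST_NVHE_FLAGS) yields zero, which mirrors the point of the section: HCR_VM and the trap and routing bits are what turn plain EL1 execution into guest execution.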
2. Mode switching infrastructure
The virtualization-related registers are accessible only from EL2. Because the Host OS kernel itself runs at EL1, switching between Host and Guest modes means trapping into EL2 and configuring the registers there. This relies on the exception vector base register vbar_el2, the saved program status register spsr_el2, and the EL2 vector table itself. This article refers to this collection of registers and tables as the "mode-switching infrastructure".
3. Initialization of the infrastructure
Linux must be booted at EL2 for KVM to work. Under QEMU, the virt machine emulates EL2 only when it is started with virtualization=on, so the command line must include that option:
qemu-system-aarch64 \
    -machine virt,gic-version=2,virtualization=on,type=virt \
    -cpu cortex-a57 \
    -smp 1 \
    -m 512M \
    -nographic \
    -kernel linux/arch/arm64/boot/Image \
    -initrd busybox-1.36.1/_install/rootfs.cpio \
    -append "rdinit=/linuxrc console=ttyAMA0"
During early boot the primary entry point in arch/arm64/kernel/head.S calls el2_setup. The function performs several tasks:
Detects the current exception level; if it is EL1 it returns after setting sctlr_el1, otherwise it proceeds with EL2 initialization.
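Grants EL1 access to the physical timer and counter through cnthctl_el2, so the Host OS can use them, and clears the virtual counter offset cntvoff_el2; the virtual timer is left for the Guest OS.
Copies the EL1 identification registers midr_el1 and mpidr_el1 into their virtualized EL2 counterparts, vpidr_el2 and vmpidr_el2.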
Clears vttbr_el2 because the Host OS does not use stage‑2 translation.
Sets vbar_el2 and spsr_el2, stores the return address in elr_el2, writes BOOT_CPU_MODE_EL2 to w0, and finally executes eret to drop back to EL1.
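The steps above can be sketched as a small C model. This is not kernel code: the struct el2_regs type and the functions model_el2_setup and spsr_target_el are hypothetical, the timer setup is left out for brevity, and the constants are the values I believe the arm64 headers define (asm/virt.h and asm/ptrace.h).

```c
#include <assert.h>
#include <stdint.h>

#define BOOT_CPU_MODE_EL1 0xe11u
#define BOOT_CPU_MODE_EL2 0xe12u
#define CURRENTEL_EL1     (1u << 2)   /* CurrentEL holds the EL in bits [3:2] */
#define CURRENTEL_EL2     (2u << 2)
#define PSR_MODE_EL1h     0x005u
#define PSR_F_BIT         0x040u
#define PSR_I_BIT         0x080u
#define PSR_A_BIT         0x100u
#define PSR_D_BIT         0x200u

/* Hypothetical model of the EL2 registers that el2_setup writes. */
struct el2_regs {
    uint64_t vpidr_el2, vmpidr_el2;
    uint64_t vttbr_el2;
    uint64_t vbar_el2, spsr_el2, elr_el2;
};

/* Returns the boot mode that el2_setup would leave in w0. */
uint32_t model_el2_setup(uint32_t current_el, uint64_t midr, uint64_t mpidr,
                         uint64_t stub_vectors, uint64_t lr,
                         struct el2_regs *r)
{
    assert(current_el == CURRENTEL_EL1 || current_el == CURRENTEL_EL2);

    if (current_el != CURRENTEL_EL2)
        return BOOT_CPU_MODE_EL1;     /* booted at EL1: no EL2 state to set */

    r->vpidr_el2  = midr;             /* expose the real IDs to EL1 */
    r->vmpidr_el2 = mpidr;
    r->vttbr_el2  = 0;                /* host does not use stage-2 */
    r->vbar_el2   = stub_vectors;     /* install the hypervisor stub table */
    r->spsr_el2   = PSR_F_BIT | PSR_I_BIT | PSR_A_BIT | PSR_D_BIT |
                    PSR_MODE_EL1h;    /* eret target: EL1h, exceptions masked */
    r->elr_el2    = lr;               /* eret resumes at the caller */
    return BOOT_CPU_MODE_EL2;
}

/* Exception level encoded in an SPSR value (mode field bits [3:2]). */
unsigned spsr_target_el(uint64_t spsr)
{
    return (unsigned)((spsr >> 2) & 0x3);
}
```

With these constants the value written to spsr_el2 is 0x3c5, and spsr_target_el reports 1 for it, which is why the final eret lands in EL1 rather than staying in EL2.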
/* arch/arm64/kernel/head.S */
SYM_FUNC_START(el2_setup)
    ...
    /* Hypervisor stub */
7:  adr_l   x0, __hyp_stub_vectors
    msr     vbar_el2, x0
    /* spsr */
    mov     x0, #(PSR_F_BIT | PSR_I_BIT | PSR_A_BIT | PSR_D_BIT |\
                  PSR_MODE_EL1h)
    msr     spsr_el2, x0
    msr     elr_el2, lr
    mov     w0, #BOOT_CPU_MODE_EL2      // This CPU booted in EL2
    eret
SYM_FUNC_END(el2_setup)
The vector table __hyp_stub_vectors is defined in arch/arm64/kernel/hyp-stub.S. Its only functional entry is el1_sync, which handles synchronous exceptions taken from 64-bit EL1 into EL2; every other slot points to an *_invalid handler.
/* arch/arm64/kernel/hyp-stub.S */
    .text
    .pushsection .hyp.text, "ax"
    .align 11
SYM_CODE_START(__hyp_stub_vectors)
    ventry  el2_sync_invalid        // Synchronous EL2t
    ventry  el2_irq_invalid         // IRQ EL2t
    ...
    ventry  el1_sync                // Synchronous 64-bit EL1
    ventry  el1_irq_invalid         // IRQ 64-bit EL1
    ...
SYM_CODE_END(__hyp_stub_vectors)
During KVM initialization the vector base is updated twice: first by __hyp_set_vectors, which installs __kvm_hyp_init, and later by __kvm_call_hyp, which installs __kvm_hyp_vector. The synchronous 64-bit EL1 entry of __kvm_hyp_init is __do_hyp_init, which finalises the EL2 environment by writing the new vbar_el2 value and returning with eret.
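The ventry listings in this section follow the AArch64 vector table layout: 16 entries of 0x80 bytes each, grouped by exception source, in a table aligned to 2 KB (the .align 11 directive). The sketch below computes the address of an entry from a vbar_el2 value; the enum and function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Order of the four groups in an AArch64 vector table. */
enum vec_source {
    CUR_EL_SP0,     /* current EL using SP_EL0 (EL2t) */
    CUR_EL_SPX,     /* current EL using SP_ELx (EL2h) */
    LOWER_EL_A64,   /* lower EL running AArch64 (64-bit EL1) */
    LOWER_EL_A32    /* lower EL running AArch32 (32-bit EL1) */
};

/* Order of the four entries inside each group. */
enum vec_kind { VEC_SYNC, VEC_IRQ, VEC_FIQ, VEC_SERROR };

#define VECTOR_ENTRY_SIZE  0x80u        /* each ventry slot is 128 bytes */
#define VECTOR_TABLE_ALIGN (1u << 11)   /* matches ".align 11" */

/* Address the CPU branches to for a given source and exception kind. */
uint64_t vector_entry_addr(uint64_t vbar, enum vec_source src,
                           enum vec_kind kind)
{
    /* The low 11 bits of the vector base must be zero. */
    assert((vbar & (VECTOR_TABLE_ALIGN - 1)) == 0);
    return vbar + ((uint64_t)src * 4 + (uint64_t)kind) * VECTOR_ENTRY_SIZE;
}
```

Under this layout the "Synchronous 64-bit EL1" slot sits at offset 0x400 from the table base, so an hvc executed by the EL1 kernel reaches el1_sync in the stub table, or whichever handler occupies that slot in the table currently installed.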
/* arch/arm64/kvm/hyp/nvhe/hyp-init.S */
    .text
    .pushsection .hyp.idmap.text, "ax"
    .align 11
SYM_CODE_START(__kvm_hyp_init)
    ventry  __invalid               // Synchronous EL2t
    ...
    ventry  __do_hyp_init           // Synchronous 64-bit EL1
    ...
SYM_CODE_END(__kvm_hyp_init)
After these steps, vbar_el2 points to __kvm_hyp_vector, the fully initialised KVM vector table, and the mode-switching setup is complete.
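The sequence of vbar_el2 values described in this section (stub table, then init table, then runtime table) can be modelled as a small state machine. This is a simplified sketch under stated assumptions, not the kernel's code: the types and functions are hypothetical, tables are represented by enum values instead of addresses, and HVC_SET_VECTORS is assumed to be stub hypercall number 0 as in asm/virt.h.

```c
#include <assert.h>
#include <stdint.h>

#define HVC_SET_VECTORS 0u   /* assumed stub hypercall number */

/* Which table vbar_el2 currently points to. */
enum el2_table {
    TABLE_HYP_STUB,          /* __hyp_stub_vectors, installed by el2_setup */
    TABLE_KVM_HYP_INIT,      /* __kvm_hyp_init, installed via the stub */
    TABLE_KVM_HYP_VECTOR     /* __kvm_hyp_vector, installed by __do_hyp_init */
};

struct el2_state {
    enum el2_table vbar_el2;
};

/* Stage 1: models __hyp_set_vectors. The stub's el1_sync handler accepts
 * HVC_SET_VECTORS and writes the requested table to vbar_el2.
 * Returns 0 on success, -1 if the call is not valid in this state. */
int stub_el1_sync(struct el2_state *s, uint64_t hcall, enum el2_table target)
{
    assert(s);
    if (s->vbar_el2 != TABLE_HYP_STUB || hcall != HVC_SET_VECTORS)
        return -1;
    s->vbar_el2 = target;
    return 0;
}

/* Stage 2: models __kvm_call_hyp trapping into __do_hyp_init, which
 * installs the runtime vectors and returns to EL1 with eret. */
int do_hyp_init(struct el2_state *s, enum el2_table runtime_vectors)
{
    assert(s);
    if (s->vbar_el2 != TABLE_KVM_HYP_INIT)
        return -1;           /* only reachable through the init table */
    s->vbar_el2 = runtime_vectors;
    return 0;
}
```

Starting from TABLE_HYP_STUB, a successful stub_el1_sync call followed by do_hyp_init leaves the model in TABLE_KVM_HYP_VECTOR, while calling do_hyp_init first is rejected. The ordering constraint is the point of the sketch: __do_hyp_init only exists in the init table, so the stub has to install that table before the runtime vectors can be set.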