How to Debug Windows I/O Performance Issues in Virtualized Cloud Environments
This article walks through diagnosing and fixing severe Windows I/O performance degradation on a virtualized cloud host using perf, systemtap, QEMU tracing, and kernel tweaks. The investigation shows that excessive ACPI timer and APIC accesses cause costly VM exits, and that enabling the Hyper‑V timer restores the expected IOPS.
Introduction
As cloud computing services mature, more customers deploy workloads to the cloud. The added virtualization layer often introduces I/O performance problems that are hard to debug. This article shows how to use perf, systemtap and other tracing tools to investigate Windows I/O performance in a virtualized environment.
Problem Scenario
A hosted‑cloud customer built a virtual environment with a Windows 2008 R2 VM and a CentOS 6.5 VM on the same host. Using fio, the Windows VM achieved only ~18 K IOPS while the Linux VM reached ~100 K IOPS.
fio Configuration Used
[global]
ioengine=windowsaio
direct=1
iodepth=64
thread=1
size=20g
numjobs=1
[4k]
bs=4k
filename=d:test.img
rw=randread
Cloud Host I/O Stack
The I/O stack in a cloud host spans the guest OS application, file system, block layer and drivers, then the virtualization layer, and finally the host OS file system, block layer and drivers. A bottleneck in any of these layers can degrade performance, and the number of layers makes tracing difficult.
Initial Diagnosis
Since the Linux VM performed well, problems in the host file system, block layer or drivers were ruled out. The focus shifted to the guest OS and the virtualization components.
QEMU Block‑IO Timing
Stefan Hajnoczi added tracing to QEMU, allowing measurement of the time from I/O request receipt to completion. The average I/O completion time was about 130 µs, indicating that QEMU itself was not the main latency source.
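A quick back-of-the-envelope check supports that conclusion. By Little's law, sustained throughput is roughly the number of outstanding requests divided by the mean per-request latency. The sketch below assumes the queue stays full at the fio iodepth of 64 and treats the 130 µs figure as the mean time spent inside QEMU; both are simplifying assumptions, and the function name is illustrative.

```python
def iops_ceiling(queue_depth: int, mean_latency_us: float) -> float:
    """Little's law: throughput = outstanding requests / mean latency."""
    if mean_latency_us <= 0:
        raise ValueError("latency must be positive")
    return queue_depth / (mean_latency_us * 1e-6)


if __name__ == "__main__":
    # iodepth=64 comes from the fio job; 130 us is the measured QEMU average.
    ceiling = iops_ceiling(queue_depth=64, mean_latency_us=130.0)
    print(f"QEMU-side ceiling: about {ceiling / 1000:.0f} K IOPS")
```

Under these assumptions the QEMU block layer could sustain several hundred thousand IOPS, far above the ~18 K observed in the Windows VM, so the time must be going elsewhere.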
Investigating VirtIO Block Driver and Windows File System
Updating to the latest stable VirtIO‑Win driver did not resolve the issue, and no special configuration changes were made to the native Windows file system.
CPU Utilization Analysis
Running top -H -p 36256 showed the QEMU main thread consuming more than 90% CPU, which pointed to an on‑CPU problem rather than time spent waiting. A perf recording (perf record -a -g -p 36256 sleep 20) produced a flame graph in which most CPU time was spent in KVM exit handling (vmx_handle_exit).
KVM Trace Events
Trace‑cmd revealed many KVM trace events; the most frequent were kvm:kvm_pio and kvm:kvm_mmio, indicating heavy I/O port and MMIO activity.
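One way to rank events by frequency is to post-process the text form of the trace. The following is a minimal sketch, not the tooling used in the original investigation. It assumes each line of the text report contains a timestamp followed by `: <event_name>:`. The sample lines are hand-written placeholders shaped like such a report, and real field layouts vary by kernel version.

```python
import re
from collections import Counter

# Captures the event name that follows the timestamp, e.g. "...: kvm_pio: ...".
EVENT_RE = re.compile(r":\s+(kvm_\w+):")


def count_kvm_events(lines):
    """Count occurrences of each kvm_* trace event in text report lines."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample input for illustration only; not real captured output.
SAMPLE = [
    " qemu-kvm-36256 [003] 1000.000001: kvm_exit: reason IO_INSTRUCTION",
    " qemu-kvm-36256 [003] 1000.000002: kvm_pio: pio_read at 0x608 size 4 count 1",
    " qemu-kvm-36256 [003] 1000.000003: kvm_entry: vcpu 0",
    " qemu-kvm-36256 [003] 1000.000004: kvm_pio: pio_read at 0x608 size 4 count 1",
    " qemu-kvm-36256 [003] 1000.000005: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x0",
]

if __name__ == "__main__":
    for name, n in count_kvm_events(SAMPLE).most_common():
        print(f"{name:12s} {n}")
```

On a real trace, the same function can be fed the lines of a saved text report, and the top entries show which exit causes dominate.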
Identifying Hot I/O Ports and MMIO Ranges
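Counting those events by address showed frequent accesses to I/O ports 0x608 and 0xc050 and to the MMIO range 0xFEE003xx. The 0xc050 accesses are generated by the VirtIO block device and are expected under I/O load. Port 0x608 corresponds to the ACPI PM Timer, and 0xFEE003xx falls within the local APIC register page, which starts at 0xFEE00000 by default.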
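To interpret addresses in the 0xFEE003xx range, it helps to translate them into local APIC register names. The sketch below assumes the default APIC base of 0xFEE00000 and uses the register offsets defined in the Intel SDM. The source only reports the range, not which individual registers were hit, so the lookup table is a decoding aid rather than a statement of what Windows accessed.

```python
APIC_BASE = 0xFEE00000  # default local APIC MMIO base on x86
APIC_PAGE_SIZE = 0x1000

# Selected local APIC register offsets in the 0x300 range (Intel SDM).
APIC_REGS = {
    0x300: "ICR low (send IPI)",
    0x310: "ICR high (IPI destination)",
    0x320: "LVT Timer",
    0x380: "Timer Initial Count",
    0x390: "Timer Current Count",
    0x3E0: "Timer Divide Configuration",
}


def describe_mmio(gpa: int) -> str:
    """Name the local APIC register for a guest physical address, if any."""
    offset = gpa - APIC_BASE
    if 0 <= offset < APIC_PAGE_SIZE:
        aligned = offset & ~0xF  # APIC registers are 16-byte aligned
        return APIC_REGS.get(aligned, f"APIC register 0x{aligned:03x}")
    return "non-APIC MMIO"


if __name__ == "__main__":
    for gpa in (0xFEE00300, 0xFEE00380, 0xFEE00390):
        print(f"0x{gpa:08x}: {describe_mmio(gpa)}")
```

Registers in this range relate to inter-processor interrupts and the APIC timer, which is consistent with timer-heavy guest behavior.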
Root Cause
Windows reads the ACPI Power‑Management Timer very frequently and also accesses APIC registers. Each of these accesses traps to the hypervisor, so the guest incurs a large number of VM exits that consume CPU cycles.
Mitigation Strategies
1. Reduce VM exits caused by ACPI PM Timer reads by enabling the Hyper‑V timer (paravirtualized clock) in the guest.
2. Reduce APIC MMIO exits by enabling the apic‑v feature in the host kernel.
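For the second mitigation, it is worth confirming whether APIC virtualization is actually active on the host. The sketch below is one possible check, assuming an Intel host where the kvm_intel module exposes an `enable_apicv` parameter under /sys/module. The parameter name can differ between kernel versions, and the feature also requires hardware support, so treat the path as an assumption to verify on your system.

```python
from pathlib import Path

# Assumed location of the kvm_intel module parameter; may differ by kernel.
DEFAULT_PARAM = Path("/sys/module/kvm_intel/parameters/enable_apicv")


def apicv_enabled(param_path: Path = DEFAULT_PARAM):
    """Return True or False from the module parameter, or None if unreadable."""
    try:
        value = param_path.read_text().strip()
    except OSError:
        return None  # module not loaded, non-Intel host, or parameter absent
    return value in ("Y", "y", "1")


if __name__ == "__main__":
    state = apicv_enabled()
    if state is None:
        print("enable_apicv parameter not found on this host")
    else:
        print(f"APICv enabled: {state}")
```

A result of None means the question is still open and needs checking by other means; it does not mean the feature is disabled.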
Enabling the Hyper‑V timer in libvirt XML:
<clock ...>
<timer name='hypervclock' present='yes'/>
</clock>
Result
After applying the Hyper‑V timer and kernel updates, the Windows VM’s fio test reached performance comparable to the Linux VM.
Conclusion
In virtualized environments, poor Windows I/O performance is often not due to the storage path itself but to virtualization‑induced overhead such as frequent VM exits caused by ACPI timer and APIC accesses. Proper paravirtualization features and kernel tuning can restore expected I/O throughput.
UCloud Tech
UCloud is a leading neutral cloud provider in China, developing its own IaaS, PaaS, AI service platform, and big data exchange platform, and delivering comprehensive industry solutions for public, private, hybrid, and dedicated clouds.