Cut Migration Time by 60%: Baidu Cloud Deploys Intel Xeon 6 QAT‑Accelerated Live VM Migration

The article analyzes the challenges of large‑scale live VM migration, introduces Intel Xeon 6 CPU‑integrated QAT hardware acceleration, compares pre‑ and post‑QAT workflows, and reports a 60% reduction in migration time, over 20% CPU savings, and downtime on the order of ten milliseconds in Baidu Smart Cloud production.

Baidu Intelligent Cloud Tech Hub

Background and Challenges

Live VM migration is a core capability of cloud platforms, used for host maintenance, load balancing, and fault avoidance. As VM memory size and dirty‑page rate increase, migration takes longer, and the migration itself competes with running workloads for CPU, memory bandwidth, and network resources, which can cause noticeable downtime.
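The way memory size and dirty‑page rate drive migration time can be illustrated with an idealized pre‑copy model. This is a minimal sketch, not Baidu's or any hypervisor's implementation: the function name, the constant dirty rate, the constant link rate, and the stop threshold are all assumptions made for illustration.

```python
def precopy_rounds(mem_gb, dirty_gbps, link_gbps,
                   stop_threshold_gb=0.05, max_rounds=30):
    """Idealized pre-copy loop (hypothetical model, constant rates assumed).

    Returns (rounds, total_seconds, remaining_gb), where remaining_gb is
    what would still have to be sent while the VM is paused.
    """
    remaining = mem_gb          # the first round sends all guest memory
    total_time = 0.0
    for rounds in range(1, max_rounds + 1):
        send_time = remaining / link_gbps
        total_time += send_time
        # Pages dirtied by the guest while this round was being sent
        remaining = dirty_gbps * send_time
        if remaining <= stop_threshold_gb:
            return rounds, total_time, remaining
    # Did not converge: the guest dirties memory as fast as it is sent
    return max_rounds, total_time, remaining


if __name__ == "__main__":
    # 64 GB guest, 0.5 GB/s dirty rate, 2 GB/s effective link rate
    print(precopy_rounds(64, 0.5, 2.0))
    # Same guest if compression doubled the effective link rate (assumed 2:1)
    print(precopy_rounds(64, 0.5, 4.0))
```

In this model, raising the effective transfer rate shortens every round and also shrinks the dirty set carried into the next round, so the gain compounds. When the dirty rate reaches the transfer rate, the loop never converges, which is the large‑memory, high‑dirty‑rate case the paragraph above describes.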

Innovation: CPU‑Integrated QAT Acceleration

Traditional live migration compresses memory pages in software, which consumes CPU cycles. Baidu Cloud partnered with Intel to use the QAT (QuickAssist Technology) engine integrated into Intel Xeon 6 processors with performance cores (Granite Rapids, GNR). QAT offloads compression and decompression to dedicated hardware, supports the lz4 and zlib algorithms, and processes multiple streams in parallel, reducing CPU load.

CPU load offload: compression and decompression move from CPU cores to dedicated hardware.

High parallelism: multiple concurrent streams achieve higher throughput than software compression.

Algorithm compatibility: supports the mainstream lz4 and zlib algorithms.

Full‑path coverage: hardware handles both compression on the source and decompression on the destination.

Workflow Comparison

Before QAT

During pre‑copy, the source host CPU performs dirty‑page detection and software compression (lz4/zlib). Compressed pages are sent over the network, then the destination host CPU decompresses and writes pages. CPU cycles for compression compete with guest workloads, especially for large‑memory VMs.
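The software path described above can be sketched with Python's standard `zlib` module. This is only an illustration of where the CPU time goes: the helper names and the synthetic page contents are assumptions, and a real hypervisor performs this work in native code on actual guest pages.

```python
import time
import zlib

PAGE_SIZE = 4096  # 4 KiB pages, assumed for illustration


def make_pages(count):
    """Build synthetic, compressible pages (illustrative data only)."""
    pages = []
    for i in range(count):
        chunk = b"key=%08d;value=" % i
        page = (chunk * (PAGE_SIZE // len(chunk) + 1))[:PAGE_SIZE]
        pages.append(page)
    return pages


def compress_pages(pages, level=1):
    """Source side: compress each page on the CPU and report CPU seconds used."""
    start = time.process_time()
    blobs = [zlib.compress(page, level) for page in pages]
    cpu_seconds = time.process_time() - start
    return blobs, cpu_seconds


def decompress_pages(blobs):
    """Destination side: decompress each page on the CPU."""
    return [zlib.decompress(blob) for blob in blobs]


if __name__ == "__main__":
    pages = make_pages(10_000)
    blobs, cpu_seconds = compress_pages(pages)
    raw = sum(len(p) for p in pages)
    packed = sum(len(b) for b in blobs)
    print(f"raw={raw} B, compressed={packed} B, cpu={cpu_seconds:.3f} s")
```

The CPU seconds reported by `compress_pages` are cycles that, on a production host, would otherwise be available to guest workloads.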

After QAT

In pre‑copy, the CPU only detects dirty pages; the QAT user‑space library (QATzip) submits the pages to the on‑chip QAT engine for parallel hardware compression. The compressed payload is smaller, so it crosses the network faster. On the destination, QAT hardware decompresses and writes the pages, while the CPU is left with VM pause/start control.
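The multi‑stream offload pattern can be sketched in software. The example below does not call QAT hardware or the QATzip library: worker threads running `zlib` stand in for the hardware engine, and the function names and stream count are hypothetical. It shows only the shape of the flow, in which the caller splits pages across streams, hands them off, and reassembles the results in their original order.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_batch(batch):
    """Stand-in for one hardware stream: compress a batch of pages."""
    return [zlib.compress(page, 1) for page in batch]


def offload_compress(pages, streams=4):
    """Split pages across streams, compress in parallel, keep page order."""
    # Stream s receives pages s, s + streams, s + 2 * streams, ...
    batches = [pages[s::streams] for s in range(streams)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        results = list(pool.map(compress_batch, batches))
    # Re-interleave so output index i corresponds to input page i
    out = [None] * len(pages)
    for s, blobs in enumerate(results):
        out[s::streams] = blobs
    return out


if __name__ == "__main__":
    pages = [bytes([i % 251]) * 4096 for i in range(1000)]
    blobs = offload_compress(pages, streams=8)
    assert [zlib.decompress(b) for b in blobs] == pages
    print(f"compressed {len(pages)} pages across 8 streams")
```

With real QAT hardware the submitting thread would hand buffers to the accelerator instead of to other CPU threads, which is what frees the host cores; the ordering and batching concerns stay the same.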

Production Deployment

Through joint engineering with Intel, Baidu Cloud addressed stability under high concurrency, improved migration efficiency for large‑memory VMs, and integrated QAT acceleration into the cloud scheduler.

Results in Baidu Cloud

Migration time for a 64 GB VM dropped from 33 s to 12 s (about 64 % shorter, reported as a ≈60 % reduction).

Host CPU utilization during migration decreased by over 20 %.

Downtime was shortened to the ten‑millisecond range, meeting high‑SLA requirements.

Network bandwidth usage fell due to higher compression efficiency, improving link stability.

Large‑memory workloads (caches, CDN, offline jobs) benefited most, achieving fast, non‑disruptive migration.
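The headline figure can be checked from the two reported timings. The helper names below are hypothetical, and the "effective rate" simply divides guest memory by wall‑clock time; it ignores re‑sent dirty pages and is an assumption for illustration, not a measured throughput.

```python
def reduction_pct(before_s, after_s):
    """Percentage reduction in migration time."""
    return (before_s - after_s) / before_s * 100


def effective_rate_gbps(mem_gb, seconds):
    """Guest memory divided by wall-clock time (ignores re-sent dirty pages)."""
    return mem_gb / seconds


if __name__ == "__main__":
    before_s, after_s, mem_gb = 33, 12, 64  # figures reported in the article
    print(f"reduction: {reduction_pct(before_s, after_s):.1f} %")
    print(f"speedup:   {before_s / after_s:.2f}x")
    print(f"effective rate before: {effective_rate_gbps(mem_gb, before_s):.2f} GB/s")
    print(f"effective rate after:  {effective_rate_gbps(mem_gb, after_s):.2f} GB/s")
```

The reduction works out to roughly 63.6 %, a 2.75x speedup, which is consistent with the rounded 60 % in the title.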

Conclusion

QAT‑accelerated live migration has become a foundational capability in Baidu Smart Cloud, delivering high‑concurrency migration that production workloads barely notice and paving the way for deeper co‑optimization of CPU silicon and cloud services.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
