
Cut Migration Time by 60%: How Baidu Cloud Scaled Intel Xeon 6 QAT‑Accelerated VM Live Migration

VM live migration in large cloud clusters suffers from high CPU load and long downtime. Baidu Cloud integrated Intel Xeon 6 processors with built‑in QuickAssist Technology (QAT) to offload memory compression, cutting migration time by roughly 60%, reducing host CPU utilization during migration by more than 20%, and bringing the final pause window down to the ten‑millisecond level.

Baidu Geek Talk

Live migration is a core capability of cloud platforms and virtualization infrastructure, used for host maintenance, load balancing, and fault avoidance. As workloads grow, migration time lengthens due to repeated memory page copying and high CPU, memory‑bandwidth, and network overhead, often causing noticeable downtime.

Innovation: CPU‑Integrated QAT Hardware Acceleration

To address the CPU‑bound compression bottleneck, Baidu Cloud partnered with Intel to leverage the QuickAssist Technology (QAT) engine built into Intel® Xeon® 6 Granite Rapids (GNR) processors. The QAT engine provides:

CPU load offload – compression/decompression moves from CPU cores to dedicated hardware.

High parallelism – multiple compression/decompression streams run concurrently, giving higher throughput than software compression on CPU cores.

Algorithm compatibility – supports mainstream algorithms such as lz4 and zlib.

Full‑process coverage – handles both compression and decompression for the entire migration pipeline.

By offloading memory‑intensive compression to QAT, migration time shortens, CPU peak usage drops, and impact on tenant workloads is reduced.
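The "multiple streams" idea can be illustrated in software. The following is a minimal sketch, assuming 4 KiB pages and using Python's `zlib` in a thread pool purely as a stand-in for the hardware engine; the page contents, worker count, and function names are illustrative and are not taken from Baidu's implementation. Each page is compressed as an independent stream, which is what allows several to be processed at once.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 4096  # assumed guest page size in bytes


def make_pages(count):
    """Build synthetic pages: even ones are highly compressible, odd ones random."""
    pages = []
    for i in range(count):
        if i % 2 == 0:
            pages.append(bytes([i % 251]) * PAGE_SIZE)
        else:
            pages.append(os.urandom(PAGE_SIZE))
    return pages


def compress_pages(pages, workers=4):
    """Compress every page as its own stream, several pages at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, pages))


def decompress_pages(blobs, workers=4):
    """Inverse of compress_pages; result order matches input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.decompress, blobs))


pages = make_pages(64)
blobs = compress_pages(pages)
restored = decompress_pages(blobs)

raw_bytes = sum(len(p) for p in pages)
sent_bytes = sum(len(b) for b in blobs)
print(f"raw={raw_bytes} bytes, compressed={sent_bytes} bytes")
```

In this stand-in the worker threads still consume CPU cores; the point of the hardware path described above is that the same per-page work is handed to a dedicated engine instead.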

Solution Comparison

3.1 Pre‑QAT Approach

Before QAT, all memory page processing relied on the host CPU:

Pre‑copy phase: The source host CPU performed dirty‑page detection and software compression (e.g., lz4/zlib).

Data transfer phase: Compressed dirty pages were sent over the network, limited by the compressed size and bandwidth.

Decompression phase: The destination host CPU decompressed the data and wrote the pages into the target VM memory.

The main bottleneck was CPU resource contention: compression/decompression competed with tenant VMs for CPU cycles, especially for large‑memory VMs (>128 GB), extending migration time and increasing the downtime window.
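To see why a slow compression stage lengthens both total time and the downtime window, the following is an idealized pre-copy model, offered as a sketch under stated assumptions rather than a description of Baidu's system. It assumes a constant guest dirty rate and a pipeline that runs at the speed of its slowest stage; all parameter names and numbers are hypothetical.

```python
def simulate_precopy(mem_pages, dirty_pages_per_s, compress_pages_per_s,
                     link_pages_per_s, ratio, stop_pages, max_rounds=50):
    """Return (total_seconds, downtime_seconds, rounds) for an idealized migration.

    ratio is compressed size / original size (0 < ratio <= 1), so a link that
    carries link_pages_per_s raw pages carries link_pages_per_s / ratio
    compressed ones.
    """
    # The pipeline is limited by whichever stage is slower.
    effective_rate = min(compress_pages_per_s, link_pages_per_s / ratio)
    to_send = mem_pages  # the first round copies all of guest memory
    total = 0.0
    for rounds in range(1, max_rounds + 1):
        round_time = to_send / effective_rate
        total += round_time
        # Pages the guest dirtied while this round was being sent.
        to_send = min(mem_pages, dirty_pages_per_s * round_time)
        if to_send <= stop_pages:
            break  # small enough to send while the VM is paused
    # Final stop-and-copy: the VM is paused while the remainder is sent.
    # If the loop never converged, this remainder (and the pause) is large.
    downtime = to_send / effective_rate
    return total + downtime, downtime, rounds


# Hypothetical numbers: same VM and link, different compression throughput.
slow = simulate_precopy(1_000_000, 50_000, 200_000, 500_000, 0.5, 1_000)
fast = simulate_precopy(1_000_000, 50_000, 800_000, 500_000, 0.5, 1_000)
print("contended CPU compression:", slow)
print("faster compression stage: ", fast)
```

In this model a slower compression stage means each round takes longer, more pages are dirtied in the meantime, and more rounds are needed before the remainder is small enough to pause the VM.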

3.2 Post‑QAT Approach

With QAT enabled, the workflow changes:

Pre‑copy phase: The source CPU only detects dirty pages; the QAT user‑mode library (qatzip) submits them to the on‑chip QAT engine for parallel hardware compression, so compression no longer competes with tenant VMs for CPU cores.

Data transfer phase: QAT‑compressed pages are smaller, improving network efficiency; transmission uses TCP/IP or RDMA.

Decompression phase: The destination host’s QAT hardware performs parallel decompression, writing the raw pages directly into the target VM memory. The final dirty‑page batch is also handled by QAT, while the CPU only controls VM pause/start.
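The three phases above keep the same data flow as before and only change where compression runs. A minimal sketch of that separation, assuming a hypothetical `PageCodec` interface: the `SoftwareZlibCodec` backend below runs on CPU cores and is used only so the example is runnable; a hardware-backed codec would expose the same two methods. None of these names come from the source or from the qatzip API.

```python
import zlib
from typing import Dict, Iterable, Iterator, Protocol, Set, Tuple

PAGE_SIZE = 4096  # assumed guest page size in bytes


class PageCodec(Protocol):
    """Hypothetical interface shared by software and hardware backends."""
    def compress(self, page: bytes) -> bytes: ...
    def decompress(self, blob: bytes) -> bytes: ...


class SoftwareZlibCodec:
    """CPU-based stand-in backend; a hardware backend would replace this."""
    def compress(self, page: bytes) -> bytes:
        return zlib.compress(page, 1)

    def decompress(self, blob: bytes) -> bytes:
        return zlib.decompress(blob)


def send_dirty_pages(memory: Dict[int, bytes], dirty: Set[int],
                     codec: PageCodec) -> Iterator[Tuple[int, bytes]]:
    """Source side: compress each dirty page and emit (page_no, blob)."""
    for page_no in sorted(dirty):
        yield page_no, codec.compress(memory[page_no])


def receive_pages(stream: Iterable[Tuple[int, bytes]],
                  target_memory: Dict[int, bytes], codec: PageCodec) -> None:
    """Destination side: decompress and write pages into target memory."""
    for page_no, blob in stream:
        target_memory[page_no] = codec.decompress(blob)


codec = SoftwareZlibCodec()
source = {n: bytes([n]) * PAGE_SIZE for n in range(8)}
target = {n: bytes(PAGE_SIZE) for n in range(8)}  # stale destination copy
dirty = set(source)  # first round: every page counts as dirty
receive_pages(send_dirty_pages(source, dirty, codec), target, codec)
print("memory matches:", target == source)
```

Keeping the codec behind one interface means the migration loop itself does not need to know whether a page was compressed by a CPU core or by an accelerator.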

Practical Deployment

During development, Baidu Cloud worked closely with Intel engineers to resolve key challenges for large‑scale cloud deployment:

Keeping migrations stable and free of resource contention when many VMs migrate concurrently.

Optimizing migration efficiency for memory‑heavy VMs with uneven write rates.

Integrating QAT acceleration seamlessly into the existing cloud scheduler and migration framework.

The joint effort moved the solution from a demo to production, satisfying stability, high‑concurrency, and operability requirements.

Benefits Observed in Baidu Cloud

Migration performance: For a 64 GB VM, total migration time dropped from 33 s to 12 s (≈60% reduction). Network bandwidth usage also decreased, stabilizing the cluster link.

CPU resource savings: Offloading compression to QAT reduced host CPU utilization by over 20% during migration, lessening impact on co‑located VMs.

Downtime window: Faster dirty‑page convergence shrank the final pause to the “ten‑millisecond” level, meeting stringent SLA requirements of finance and e‑commerce workloads.

High‑load scenario coverage: Large‑memory VMs (e.g., distributed caches, CDN nodes) benefited markedly, enabling rapid migration without service interruption.
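The reported 64 GB figures can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted above; the helper names are illustrative. Note that the last two values are guest memory covered per second of wall-clock time, not bytes on the wire, since compression and re-sent dirty pages both change what is actually transmitted.

```python
def reduction_percent(before_s, after_s):
    """Percentage by which the duration shrank."""
    return (before_s - after_s) / before_s * 100.0


def speedup(before_s, after_s):
    """How many times faster the new duration is."""
    return before_s / after_s


BEFORE_S, AFTER_S, VM_GB = 33.0, 12.0, 64.0

time_cut = reduction_percent(BEFORE_S, AFTER_S)  # about 63.6
factor = speedup(BEFORE_S, AFTER_S)              # 2.75
gb_per_s_before = VM_GB / BEFORE_S               # about 1.94
gb_per_s_after = VM_GB / AFTER_S                 # about 5.33

print(f"time reduction: {time_cut:.1f}%  speedup: {factor:.2f}x")
print(f"guest memory per second: {gb_per_s_before:.2f} -> {gb_per_s_after:.2f} GB/s")
```

So the 33 s to 12 s result corresponds to a reduction of about 64%, consistent with the rounded figure of roughly 60%, and to a 2.75x speedup.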

Conclusion

Baidu Cloud’s large‑scale adoption of Intel Xeon 6 QAT hardware acceleration demonstrates that offloading memory compression can dramatically improve live‑migration efficiency, lower CPU interference, and meet ultra‑low‑downtime targets. Ongoing collaboration will continue to deepen CPU‑software co‑optimization and boost AI and cloud service performance.

Pre‑QAT migration flow diagram
Post‑QAT migration flow diagram