Baidu Intelligent Cloud Tech Hub
Author

We share the cloud tech topics you care about. Feel free to leave a message and tell us what you'd like to learn.

137 articles · 0 likes · 302 views · 0 comments
Recent Articles

Baidu Intelligent Cloud Tech Hub
Jun 10, 2026 · Artificial Intelligence

LU‑KV Sets New SOTA at ICML 2026 by Redefining KV Cache Eviction

A joint effort by Baidu Baige and Fudan University introduces the LU‑KV framework, which treats KV‑cache budget allocation as a global combinatorial optimization problem, achieving only 0.52% relative performance loss at 80% compression and establishing a new efficiency‑accuracy SOTA on LongBench.

Cache Eviction · ICML 2026 · KV cache
0 likes · 5 min read
Baidu Intelligent Cloud Tech Hub
Jun 2, 2026 · Artificial Intelligence

Halving Training Time: LoongForge Full‑Stack Optimizations Boost GR00T N1.6 Throughput 2.3×

LoongForge applies system‑level optimizations—async data prefetch, fine‑grained communication‑compute overlap via a Megatron distributed optimizer, and per‑microbatch CUDA Graph scheduling—to the GR00T N1.6 Vision‑Language‑Action model, delivering up to 2.3× higher training throughput and a 56.6% reduction in overall training time on an 8×A800 cluster.

CUDA Graph · Distributed Training · GR00T N1.6
0 likes · 14 min read
Baidu Intelligent Cloud Tech Hub
Jun 1, 2026 · Cloud Computing

Cut Migration Time by 60%: Baidu Cloud Deploys Intel Xeon 6 QAT‑Accelerated Live VM Migration

The article analyzes the challenges of large‑scale live VM migration, introduces Intel Xeon 6 CPU‑integrated QAT hardware acceleration, compares pre‑ and post‑QAT workflows, and reports a 60% reduction in migration time, 20% CPU savings, and sub‑10 ms downtime in Baidu Intelligent Cloud production.

Cloud Computing · Intel QAT · Performance Optimization
0 likes · 10 min read
Baidu Intelligent Cloud Tech Hub
May 29, 2026 · Industry Insights

How Baidu’s Hanhai U Series Cuts 3 Million Yuan Cost for 10 MW High‑Density AI Data Centers

The article analyzes the power‑supply challenges of high‑density AI data centers, compares traditional UPS and 800 V DC architectures, and shows how Baidu’s Hanhai U series redesign delivers precise capacity matching, up to 2.5× higher power density, 55% space reduction and up to 15% cost savings.

AI compute · Baidu · Hanhai U series
0 likes · 11 min read
Baidu Intelligent Cloud Tech Hub
May 27, 2026 · Artificial Intelligence

Optimizing Large Model Inference Architecture for the Agent Era: Engineering Practices and Challenges

The article analyzes the architectural challenges of large‑model inference in the Agent era—such as memory‑intensive MLA structures, MoE communication overhead, exploding KV‑Cache size, and tool‑call accuracy—and presents a series of engineering solutions including hierarchical KV‑Cache pooling, sequence parallelism, offloading strategies, and chip‑level adaptations to achieve higher throughput and lower token costs.

AI Infra · Agent · DeepSeek
0 likes · 15 min read
Baidu Intelligent Cloud Tech Hub
May 26, 2026 · Operations

When CPUs Hide GPU Bottlenecks: How Btune 2.0 Automates Latency Analysis to Uncover Performance Issues

The article presents a real‑world migration case where a CPU‑XPU bottleneck limited inference QPS, explains how Btune 2.0’s new latency‑focused diagnostics pinpointed kernel lock contention in the halolet component, and shows how the AI Agent’s automated, cross‑process analysis restored performance and reduced cost.

AI Infrastructure · CPU-GPU bottleneck · Cross-process analysis
0 likes · 11 min read
Baidu Intelligent Cloud Tech Hub
May 22, 2026 · Artificial Intelligence

How Baidu Baige’s Full‑Stack AI Infra Accelerates Embodied Model Iteration

The article details Baidu Baige’s end‑to‑end AI infrastructure for embodied intelligence, covering VLA and world‑model architectures, scaling challenges for medium‑sized models, cloud‑based motion‑control pipelines, open‑source integration, hardware‑aware training optimizations, and simulation‑engine improvements that together speed up model development and deployment.

AI Infra · Baidu Baige · Embodied AI
0 likes · 13 min read
Baidu Intelligent Cloud Tech Hub
Apr 24, 2026 · Artificial Intelligence

LoongForge: Open‑Source Multimodal Training Framework Runs on GPU and Kunlun XPU with 45% Speedup

LoongForge is an open‑source, Megatron‑based multimodal training framework that unifies LLM, VLM, VLA and diffusion models, runs seamlessly on NVIDIA GPUs and Baidu Kunlun XPU, and delivers 15%‑45% end‑to‑end training acceleration while scaling linearly on thousands of cards.

GPU · Kunlun XPU · LoongForge
0 likes · 23 min read
Baidu Intelligent Cloud Tech Hub
Apr 8, 2026 · Artificial Intelligence

Unlocking 8‑Hour Autonomous Coding: GLM‑5.1’s Leap with Kunlun XPU

The open‑source GLM‑5.1 model, adapted to Baidu Baige's Kunlun XPU via the vLLM‑Kunlun Plugin, delivers record‑breaking SWE‑bench scores, eight‑hour autonomous coding, long‑context handling up to 64K tokens, and scalable deployment across tens of thousands of chips, showcasing end‑to‑end AI acceleration.

GLM-5.1 · Kunlun XPU · Model Deployment
0 likes · 8 min read
Baidu Intelligent Cloud Tech Hub
Apr 7, 2026 · Artificial Intelligence

How Baidu’s 7th‑Gen AI Confidential VM Achieves Full‑Stack Secure Compute

Baidu Intelligent Cloud’s seventh‑generation AI confidential virtual machine combines Intel TDX, NVIDIA GPUs, and BlueField DPUs to deliver end‑to‑end encrypted data paths, elastic multi‑GPU scaling, and near‑native performance, proving that high‑sensitivity AI workloads can run securely in the cloud without sacrificing speed.

AI · Cloud · Confidential Computing
0 likes · 17 min read