Author

Baidu Intelligent Cloud Tech Hub

We share the cloud tech topics you care about. Feel free to leave a message and tell us what you'd like to learn.

130 articles

Latest from Baidu Intelligent Cloud Tech Hub

Baidu Intelligent Cloud Tech Hub

Nov 20, 2025 · Artificial Intelligence

Boost Multimodal Model Training Efficiency with Offline Sequence Packing and Mixed‑Modality Data

Baidu's Baige team introduces an extended multimodal data loader, automated ShareGPT format conversion, and offline sequence packing. Together these double token throughput, speed up SFT training by up to 6x, and improve GPU utilization and stability for large vision‑language models.

AI infrastructure · AIAK · GPU efficiency

0 likes · 7 min read

Baidu Intelligent Cloud Tech Hub

Nov 19, 2025 · Artificial Intelligence

Boost LLM Inference Speed with Token‑Level Two‑Chunk Overlap

Token‑level Two‑Chunk Overlap replaces traditional batch‑level Two‑Batch Overlap by dynamically splitting sequences into balanced token chunks, so that compute and communication take nearly equal time. This improves GPU utilization and yields up to 30% higher throughput on heterogeneous request workloads, with zero accuracy loss.

Batch scheduling · GPU utilization · LLM inference

0 likes · 9 min read

Baidu Intelligent Cloud Tech Hub

Nov 10, 2025 · Cloud Computing

How Polar‑TCP Breaks Kernel Network Bottlenecks for Million‑IOPS Cloud Services

This article explains how traditional kernel network stacks struggle with modern cloud data‑center workloads and introduces Baidu Intelligent Cloud's Polar solution—Polar‑TCP and Polar‑RDMA—which combine user‑space DPDK drivers, a lightweight TCP stack, and an industrial‑grade RPC framework to achieve near‑RDMA performance while preserving ecosystem compatibility.

DPDK · High‑Performance Networking · Network Stack

0 likes · 24 min read

Baidu Intelligent Cloud Tech Hub

Nov 7, 2025 · Artificial Intelligence

From Big Data to 30,000‑GPU Clusters: The Evolution of China’s AI Infrastructure

In an in‑depth interview, Baidu AI Computing chief scientist Wang Yanpeng and host Koji trace China's internet infrastructure from the early big‑data era through cloud computing to today's AI boom, highlighting the pivotal role of compute power, GPU acceleration, data scaling, and Baidu's Baige platform in shaping the AI arms race.

AI infrastructure · Baidu Baige · GPU computing

0 likes · 26 min read

Baidu Intelligent Cloud Tech Hub

Nov 4, 2025 · Artificial Intelligence

How Baidu’s Baige Accelerates Multimodal Video Training with Context Parallelism

Baidu Baige’s enhanced veRL framework raises the video frame rate and resolution limits it can train on, cuts training time, reduces memory usage, and improves model accuracy by applying context parallelism and optimized attention on Ampere GPUs in multimodal mixed‑training scenarios.

AI acceleration · Context Parallelism · Multimodal Training

0 likes · 6 min read

Baidu Intelligent Cloud Tech Hub

Oct 29, 2025 · Operations

How to Prevent Avalanche Failures in Large‑Scale Microservice Systems

This article explains how Baidu's SRE team identified the root causes of avalanche failures in massive microservice architectures, modeled system limits with Little’s Law, and implemented engineering practices such as retry budgets, queue throttling, and global TTL controls to achieve self‑healing and eliminate avalanche incidents.

SRE · avalanche failure · microservices

0 likes · 9 min read

Baidu Intelligent Cloud Tech Hub

Oct 28, 2025 · Artificial Intelligence

How Baidu’s New MTP Inference Code Doubles DeepSeek‑V3.2 Throughput

Baidu Baige and the SGLang community have open‑sourced a production‑tested MTP inference engine that more than doubles DeepSeek‑V3.2 decoding speed while remaining stable, thanks to a DSA‑optimized architecture that predicts multiple tokens in a single forward pass.

AI · DSA · DeepSeek

0 likes · 4 min read

Baidu Intelligent Cloud Tech Hub

Sep 22, 2025 · Cloud Computing

How Mantle Breaks the Hierarchical Namespace Bottleneck in Cloud Object Storage

The Mantle system, presented in a SOSP'25 paper by Baidu's storage team and collaborators, delivers a distributed hierarchical namespace for cloud object storage that overcomes traditional scalability and performance limits, enabling massive data lake workloads with dramatically reduced latency and vastly increased throughput.

SOSP · cloud storage · distributed-systems

0 likes · 8 min read

Baidu Intelligent Cloud Tech Hub

Sep 9, 2025 · Artificial Intelligence

How Baidu Built a 32,000‑Card AI Super‑Compute Cluster and Boosted Efficiency by 50%

This article details Baidu Intelligent Cloud's journey in designing, constructing, and operating a 32,000‑card hybrid AI compute cluster. It covers challenges in power, cooling, networking, multi‑cluster scheduling, and security, and explains how hardware, software, and operational strategies together improved MFU (model FLOPs utilization) by more than 50% and set industry‑first performance records.

AI infrastructure · GPU clusters · Hybrid Cloud

0 likes · 15 min read

Baidu Intelligent Cloud Tech Hub

Sep 4, 2025 · Artificial Intelligence

Unlocking MoE Model Power: Baidu’s Baige 5.0 AI Platform’s FP8 and Distributed Innovations

Baidu’s Baige 5.0 AI Computing Platform introduces FP8 mixed‑precision training, MoE‑aware distributed strategies, adaptive parallelism, and a three‑tier KV‑Cache, delivering over 30% training speedup and 50% inference throughput gains while keeping token latency under half a second for large‑scale models.

AI · FP8 · Inference

0 likes · 16 min read
