
Baidu Intelligent Cloud Launches Cloud-native AI 2.0 to Accelerate AI Engineering

Baidu Intelligent Cloud’s new Cloud-native AI 2.0 platform tackles AI engineering bottlenecks with hybrid-parallel large-model training, flexible GPU virtualization, and an AI Accelerate Kit that improves training efficiency by more than 50%, cuts inference latency by up to 63%, and raises GPU utilization from roughly 13% to about 50%.

Baidu Geek Talk

The rapid growth of data and model sizes in AI development has increased demands for faster training, while engineering challenges such as slow training/inference, low resource utilization, and high costs hinder AI adoption.

Gartner predicts that by 2023, 70% of AI applications will be based on container and Serverless technologies, and IDC’s report describes cloud-native AI as a bridge connecting AI applications to IaaS, essential for accelerating AI engineering implementation.

Baidu Intelligent Cloud has officially released Cloud-native AI 2.0, which distills the company’s extensive experience running large-scale AI workloads to address these engineering bottlenecks.

For large-model pretraining, the solution supports the PaddlePaddle framework’s 4D hybrid parallelism (data, tensor model, parameter slicing, and pipeline parallelism) as well as Microsoft’s DeepSpeed library for PyTorch, plus Mixture-of-Experts (MoE) techniques that scale parameter counts without a proportional increase in compute.
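The MoE idea can be seen in a toy top-1 gating sketch: a small gating network routes each token to exactly one of N experts, so per-token compute stays roughly constant as experts (and parameters) are added. This is an illustrative example under simplified assumptions, not Baidu’s or PaddlePaddle’s actual API; the expert here is a plain weight vector standing in for a real feed-forward network.

```python
# Toy Mixture-of-Experts with top-1 gating. Parameters grow with
# NUM_EXPERTS, but each token activates only one expert, so per-token
# FLOPs do not grow proportionally. Illustrative only.
import random

NUM_EXPERTS = 8
DIM = 4

random.seed(0)
# Each "expert" is just a weight vector here; a real expert is an FFN.
experts = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
gate = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def moe_forward(token):
    # The gating network scores every expert, then routes to the argmax.
    scores = [dot(g, token) for g in gate]
    chosen = max(range(NUM_EXPERTS), key=lambda i: scores[i])
    # Only the chosen expert runs: compute is independent of NUM_EXPERTS.
    return chosen, dot(experts[chosen], token)

token = [0.5, -1.0, 0.25, 2.0]
expert_id, output = moe_forward(token)
print(expert_id, output)
```

Real MoE layers typically use top-2 routing with load-balancing losses, but the compute-vs-parameter decoupling is the same.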

GPU virtualization is provided via user-mode and kernel-mode approaches: user-mode offers process fusion, memory swapping, and high-performance features; kernel-mode ensures strong isolation with minimal performance loss, allowing flexible selection based on workload needs.
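The utilization gain from GPU sharing can be sketched with a toy first-fit packer: workloads request fractional GPU quotas, and the scheduler packs several onto one physical card instead of giving each an exclusive GPU. This is a hypothetical illustration of the scheduling idea, not Baidu’s virtualization implementation; the job names and quotas are made up.

```python
# Toy fractional-GPU packing: several low-utilization workloads share
# one physical GPU, which is how per-job utilization of ~13% can be
# pooled toward ~50%. Illustrative sketch only.

def pack(jobs, gpu_capacity=1.0):
    """First-fit packing of fractional GPU requests onto physical GPUs."""
    gpus = []        # remaining free capacity per physical GPU
    placement = {}   # job name -> GPU index
    for name, frac in jobs:
        for i, free in enumerate(gpus):
            if frac <= free + 1e-9:
                gpus[i] = free - frac
                placement[name] = i
                break
        else:
            # No existing GPU has room: allocate a new one.
            gpus.append(gpu_capacity - frac)
            placement[name] = len(gpus) - 1
    return placement, len(gpus)

# Hypothetical workloads with fractional compute quotas:
jobs = [("infer-a", 0.25), ("infer-b", 0.25), ("train-c", 0.5), ("infer-d", 0.3)]
placement, num_gpus = pack(jobs)
print(num_gpus)  # prints 2: two shared GPUs instead of four exclusive ones
```

User-mode sharing of this kind trades isolation for density; the kernel-mode path described above exists precisely for workloads that cannot accept that trade.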

The AI Accelerate Kit (AIAK) spans four areas. AIAK-Training is a Horovod-based distributed training framework whose communication optimizations improve training efficiency on classic models by more than 50%. AIAK-Inference reduces the latency of models such as ResNet and BERT by 40%-63% through graph optimization, operator fusion, and a high-performance operator library. I/O acceleration uses parallel file systems and distributed caching to deliver SSD-level performance. Image acceleration loads container images on demand, cutting cold-start time by 6-20x.
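One standard communication optimization in Horovod-style data-parallel training is gradient bucketing (tensor fusion): many small gradient tensors are packed into a few large buffers before all-reduce, amortizing per-message latency. The sketch below shows the bucketing logic only; it is a generic illustration, not AIAK’s actual implementation, and the sizes are made up.

```python
# Gradient "bucketing" (tensor fusion) sketch: group many small
# gradients into large buckets so one all-reduce call replaces many.
# Generic illustration of the technique, not AIAK's implementation.

def make_buckets(grad_sizes, bucket_bytes, elem_bytes=4):
    """Greedily group gradients (given as float32 element counts) into
    buckets whose payload stays under bucket_bytes."""
    buckets, current, current_bytes = [], [], 0
    for idx, n in enumerate(grad_sizes):
        size = n * elem_bytes
        if current and current_bytes + size > bucket_bytes:
            buckets.append(current)
            current, current_bytes = [], 0
        current.append(idx)
        current_bytes += size
    if current:
        buckets.append(current)
    return buckets

# 100 small gradients of 10k floats each (40 KB), fused into 1 MB buckets:
sizes = [10_000] * 100
buckets = make_buckets(sizes, bucket_bytes=1 * 1024 * 1024)
print(len(buckets))  # prints 4: 100 all-reduce calls collapse to 4
```

Larger buckets mean fewer, bigger messages but delay the first all-reduce; production frameworks tune this threshold (and overlap communication with backprop) for the same reason.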

Workflow orchestration combines AI task visualization (supporting PyTorch, TensorFlow, and PaddlePaddle operators) with PaddleFlow and Kubeflow integration. Pipelines can be defined via YAML, DSL, or UI, with snapshot-based artifact management, complex DAG scheduling (loops, conditionals, and more), and cross-platform compatibility with engines such as Argo and Airflow, improving experiment reproducibility and lowering the learning barrier.
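The core idea behind all of these engines is DAG scheduling: each step declares its upstream dependencies, and the scheduler runs steps in topological order. A minimal sketch using Python’s standard-library `graphlib` (the step names are hypothetical):

```python
# Minimal DAG-scheduling sketch, the core idea behind pipeline engines
# like PaddleFlow, Kubeflow Pipelines, Argo, or Airflow: steps declare
# upstream dependencies and execute in topological order.
from graphlib import TopologicalSorter

# step -> set of steps it depends on (hypothetical pipeline)
pipeline = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}

order = list(TopologicalSorter(pipeline).static_order())
print(order)  # dependencies always precede dependents
```

Real engines add the features listed above on top of this core: conditional edges, loop constructs, retries, and artifact snapshots keyed to each step’s outputs.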

Practice shows these capabilities raise GPU utilization from ~13% to about 50%, deliver linear scaling on thousand‑card clusters, and achieve over 90% cluster utilization for large‑model pretraining, demonstrating significant performance and cost benefits. Cloud-native AI 2.0 is now integrated into Baidu’s Baihe heterogeneous computing platform, accessible via the Baidu Intelligent Cloud website.

Tags: cloud-native, AI, large models, AI acceleration, GPU virtualization, workflow orchestration
Written by Baidu Geek Talk