Baidu Geek Talk
May 19, 2025 · Artificial Intelligence
How Baidu Cloud Achieved 4µs Low-Latency PD Inference with HPN Network Optimizations
To meet the demanding network requirements of large‑scale PD‑separated inference, Baidu Cloud built an HPN cluster with 4 µs end‑to‑end latency and layered on traffic‑management optimizations, adaptive routing, and custom Alltoall operators, achieving up to 20 % higher throughput and lower latency in both the Prefill and Decode stages.
AI inference · Alltoall optimization · Distributed Training
