Tag

high-performance networking


Architects' Tech Alliance
Sep 8, 2024 · Artificial Intelligence

Design and Architecture of Multi‑Million GPU Clusters for Large‑Scale AI Model Training

This article surveys the network architectures and congestion-control techniques used in massive GPU clusters, such as ByteDance's MegaScale, Baidu HPN, Alibaba HPN 7.0, and Tencent Xingmai 2.0, highlighting how high-bandwidth, low-latency designs and advanced RDMA technologies enable training of trillion-parameter multimodal AI models.

AI infrastructure · GPU clusters · HPN
11 min read
Architects' Tech Alliance
May 23, 2024 · Cloud Computing

Design and Comparison of High‑Performance Cloud Data Center Networks for AI Computing

This article analyzes the limitations of traditional cloud data center networks for AI workloads and compares high-bandwidth, low-latency architectures, including two-layer and three-layer fat-tree designs, InfiniBand, and RoCE, providing best-practice recommendations for building scalable, non-blocking AI-Pool networks.

AI computing · Fat Tree · GPU clusters
12 min read
ByteDance SYS Tech
Apr 26, 2024 · Backend Development

How io_uring Integration Boosts Netpoll Throughput and Slashes Latency

This article examines the integration of Linux io_uring into ByteDance's high‑performance Netpoll NIO library, detailing architectural changes, receive/send workflows, benchmarking methodology, and results that show over 10% higher throughput and 20‑40% lower latency while eliminating system calls.

Go · Netpoll · benchmark
18 min read
360 Smart Cloud
Apr 25, 2024 · Cloud Native

Building High‑Performance RoCE v2 and InfiniBand Networks in a Cloud‑Native Environment for Large‑Model Training

This article explains how to construct high‑performance RoCE v2 and InfiniBand networks within a cloud‑native Kubernetes environment, detailing the underlying technologies, required components, configuration steps, and performance test results that demonstrate significant communication speed improvements for large‑scale AI model training.

AI training · InfiniBand · Kubernetes
12 min read
NetEase LeiHuo UX Big Data Technology
Jan 17, 2024 · Backend Development

Understanding DPDK: Background, Architecture, High‑Performance Techniques, and Real‑World Applications

This article explains the origins of DPDK, describes its modular architecture and performance‑enhancing mechanisms such as UIO, hugepages, and CPU affinity, and reviews popular user‑space networking frameworks like F‑Stack and Seastar that leverage DPDK for high‑throughput cloud services.

DPDK · F-Stack · Seastar
9 min read
Alibaba Cloud Infrastructure
Nov 11, 2023 · Cloud Computing

Alibaba Cloud Executive Discusses IPv6 Deployment, Global Collaboration, and AI‑Driven Network Evolution at the 2023 Wuzhen Internet Forum

In a detailed interview at the 2023 Wuzhen Internet Forum, Alibaba Cloud's infrastructure lead Cai Dezhong outlines the three-phase IPv6 rollout, highlights organizational and technical innovations, stresses the need for global cooperation, and explains how IPv6 underpins next-generation AI infrastructure and predictable high-performance networking.

AI infrastructure · Cloud Computing · Global Collaboration
9 min read
Alibaba Cloud Infrastructure
Sep 19, 2023 · Cloud Computing

AI‑Era Cloud Infrastructure: High Compute Density, Linear Scalability and Intelligent Operations – Highlights from the 2023 Open Data Center Conference

The 2023 Open Data Center Conference in Beijing showcased Alibaba Cloud's AI‑era infrastructure innovations—including high‑density compute clusters, predictable high‑performance networking, intelligent power‑simulation systems, battery diagnostics, liquid‑cooling solutions, and modular server standards—demonstrating how cloud platforms are being rebuilt to meet the demands of large AI models and sustainable operation.

AI · Intelligent Operations · Modular Servers
10 min read
Tencent Cloud Developer
Mar 22, 2023 · Artificial Intelligence

Tencent Star Network: High‑Performance GPU Cluster Architecture for Large‑Scale AI Model Training

Tencent's Star Network delivers a 1.6 Tbps Ethernet-RDMA fabric with a fat-tree topology supporting up to 4K GPUs, multi-track traffic aggregation, adaptive heterogeneous links, and a custom TCCL library, cutting AllReduce overhead from 35% to 3.7% and speeding AI training iterations by 32%, while automating deployment and providing sub-second self-healing.

AI training · GPU clusters · RDMA
19 min read
Tencent Cloud Developer
Dec 20, 2022 · Cloud Computing

HARP – Tencent Cloud's High‑Performance, Highly Available Network Transmission Protocol

HARP is Tencent Cloud's high-performance, highly available network transmission protocol; it reroutes around switch failures within 100 µs and offers zero packet loss, low latency, high bandwidth, scalable connections, and custom congestion control for storage, HPC, AI, and big data workloads.

Cloud Computing · HARP · congestion control
15 min read
Tencent Cloud Developer
Jun 6, 2022 · Cloud Computing

High‑Performance Network Solutions: RDMA, RoCE, iWARP and io_uring – Principles, Implementation and Benchmark Analysis

This article reviews high-performance networking options, namely RDMA (including RoCE v2 and iWARP) and Linux's io_uring, explaining their principles, hardware requirements, and benchmark results, and concludes that while RDMA delivers ultra-low latency for specialized workloads, io_uring offers only modest network benefits, leaving TCP as the default for most services.

Linux kernel · RDMA · benchmark
10 min read
Architects' Tech Alliance
May 19, 2022 · Fundamentals

An Introduction to RDMA: Concepts, Advantages, Protocols, and Programming Basics

This article explains the fundamentals of Remote Direct Memory Access (RDMA), comparing it with traditional networking and outlining its core advantages, suitable use cases, the three main RDMA protocols (InfiniBand, RoCE, iWARP), deployment requirements, the communication flow, and essential programming concepts.

InfiniBand · RDMA · RoCE
9 min read
Architects' Tech Alliance
Mar 7, 2021 · Fundamentals

Understanding RDMA: InfiniBand, iWARP, and RoCE Technologies and Their Differences

This article explains Remote Direct Memory Access (RDMA), its origins in InfiniBand, and the Ethernet-based variants iWARP and RoCE (including RoCEv1 and RoCEv2), comparing their architectures, performance characteristics, and deployment requirements for high-performance computing and data-center networks.

InfiniBand · RDMA · RoCE
11 min read
Architects' Tech Alliance
Nov 11, 2020 · Fundamentals

Understanding DPDK Memory Management: Large Pages, NUMA, DMA, and IOMMU

This article explains the core principles of DPDK memory management, covering huge pages, NUMA node binding, direct memory access, IOMMU and IOVA addressing, custom allocators, and memory pools, and shows how these mechanisms together enable high-performance packet processing on Linux systems.

DMA · DPDK · IOMMU
14 min read
Alibaba Cloud Infrastructure
Oct 9, 2019 · Cloud Computing

The Next Decade of Cloud Networking: Highlights from Alibaba Cloud Network Forum at Yunqi Conference 2019

The 2019 Yunqi Conference Cloud Network Forum gathered over two hundred network enthusiasts to review a decade of Alibaba data‑center networking evolution, explore emerging technologies such as AI, big data, and programmable chips, and outline the next ten years of high‑performance, data‑centric cloud networking.

Artificial Intelligence · Big Data · Cloud Networking
9 min read
Architects' Tech Alliance
Jul 5, 2019 · Backend Development

A Comprehensive Overview of DPDK and SPDK Technologies

This article provides an in‑depth technical overview of DPDK and SPDK, covering their background, the evolution of network I/O, Linux bottlenecks, user‑space I/O via UIO, poll‑mode drivers, performance‑optimizing techniques such as huge pages, SIMD, cache management, and the surrounding ecosystem and adoption.

DPDK · SPDK · User-space I/O
15 min read
Architects' Tech Alliance
Apr 8, 2019 · Fundamentals

Understanding RDMA: Principles, Advantages, and Implementation Details

This article explains how RDMA (Remote Direct Memory Access) technology, originating from InfiniBand and extended to Ethernet (RoCE) and TCP/IP (iWARP), provides ultra‑low latency, high throughput, and minimal CPU usage for high‑performance computing and big‑data applications by bypassing traditional OS and protocol stack processing.

InfiniBand · RDMA · RoCE
8 min read
Architects' Tech Alliance
Feb 14, 2019 · Fundamentals

Understanding RDMA (Remote Direct Memory Access): Background, Related Work, and Technical Details

This article provides a comprehensive overview of Remote Direct Memory Access (RDMA), covering its background, the limitations of traditional TCP/IP, related technologies such as TOE, U-Net, and VIA, and detailed explanations of RDMA concepts, hardware implementations, verbs, and communication workflows.

RDMA · Remote Direct Memory Access · high-performance networking
17 min read
Architects' Tech Alliance
Dec 4, 2018 · Fundamentals

Understanding RDMA High‑Performance Networks: Principles, Benefits, and Applications in Machine Learning

This article explains the background, architecture, and performance advantages of RDMA high-performance networking, compares it with traditional TCP/IP, describes its deployment at Baidu for machine-learning workloads, and outlines future use cases such as storage acceleration, GPU communication, and core services.

InfiniBand · RDMA · RoCE
12 min read