Comparison of InfiniBand and RoCEv2 Architectures for AI Compute Networks
This article examines the two mainstream architectures for AI compute networks, InfiniBand and RoCEv2. It details their designs, flow-control mechanisms, and performance, cost, and scalability characteristics, then weighs their respective strengths and limitations to guide network selection for AI data centers.
1 InfiniBand Network Architecture
InfiniBand networks are centrally managed by a Subnet Manager (SM), typically deployed on a server in the fabric. The SM discovers the topology, assigns a unique Local ID (LID) to every NIC port and switch, computes routes, and programs the switch forwarding tables. Each device also runs a Subnet Management Agent (SMA) that responds to the SM's management packets, so much of this configuration happens automatically.
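As a rough illustration, the SM's LID assignment can be pictured as a sweep that hands every discovered port a unique identifier. This is a minimal sketch, not real SM logic; the port names are hypothetical:

```python
def assign_lids(ports):
    """Sketch of a Subnet Manager sweep: assign each discovered
    port a unique Local ID. LIDs start at 1 (0 is reserved)."""
    return {port: lid for lid, port in enumerate(ports, start=1)}

# hypothetical fabric with two HCA ports and one switch
lid_table = assign_lids(["hca1/port1", "hca2/port1", "switch1"])
# every port now has a distinct LID that routing can refer to
```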
1.1 InfiniBand Flow‑Control Mechanism
InfiniBand uses credit-based, link-level flow control: the receiver advertises credits corresponding to pre-allocated buffer space, and the sender transmits a packet only while it holds credits, ensuring continuous, loss-free data transfer.
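The credit mechanism can be sketched in a few lines. This toy model (class and method names are invented for illustration) shows why packets are never dropped for lack of buffer space: a send without a credit simply does not happen.

```python
class CreditLink:
    """Toy model of InfiniBand-style credit-based flow control.

    The receiver advertises credits equal to its free buffer slots;
    the sender transmits only while it holds credits, so a packet
    always has guaranteed buffer space waiting for it.
    """

    def __init__(self, buffer_slots):
        self.credits = buffer_slots   # credits advertised by the receiver
        self.receiver_buffer = []

    def send(self, packet):
        if self.credits == 0:
            return False              # sender must wait: no buffer guaranteed
        self.credits -= 1
        self.receiver_buffer.append(packet)
        return True

    def receiver_consume(self):
        if self.receiver_buffer:
            self.receiver_buffer.pop(0)
            self.credits += 1         # credit returned to the sender
```

With a two-slot buffer, the third send blocks until the receiver drains a packet and returns a credit.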
1.2 InfiniBand Characteristics
The architecture provides link‑level flow control to prevent buffer overflow and employs adaptive routing to dynamically select optimal paths, achieving real‑time resource optimization and load balancing in large‑scale deployments.
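Adaptive routing boils down to choosing, per packet or per flow, the least-congested of several equal-cost paths. A minimal sketch of that decision (the path names and queue-depth metric are assumptions for illustration):

```python
def pick_path(equal_cost_paths, queue_depth):
    """Toy adaptive-routing decision: among equal-cost output
    ports, forward on the one with the shallowest queue."""
    return min(equal_cost_paths, key=lambda p: queue_depth[p])

# three equal-cost ports; port "p2" is currently least loaded
best = pick_path(["p1", "p2", "p3"], {"p1": 7, "p2": 2, "p3": 5})
```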
2 RoCEv2 Network Architecture
RoCE (RDMA over Converged Ethernet) carries RDMA traffic over Ethernet. RoCEv2 encapsulates RDMA in UDP/IP, making it routable at the network layer and giving it better scalability than InfiniBand across subnets. Unlike InfiniBand's centrally managed fabric, a RoCEv2 network is distributed, typically built as a two-tier (leaf-spine) Ethernet fabric, which enhances deployment flexibility.
2.1 RoCEv2 Flow‑Control Mechanisms
Priority Flow Control (PFC) provides hop-by-hop, per-priority lossless Ethernet: when a downstream switch queue crosses a buffer threshold, the switch signals upstream devices to pause transmission for that priority, and traffic resumes once the queue drains below a lower threshold. Explicit Congestion Notification (ECN) works end to end: congested switches mark packets in flight, and the receiver echoes the congestion signal back so the sender reduces its rate.
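The PFC pause/resume decision is a simple hysteresis on queue occupancy. A sketch, with hypothetical threshold values (real switches express these in buffer cells per priority):

```python
XOFF_THRESHOLD = 8   # occupancy at which the switch pauses upstream (assumed value)
XON_THRESHOLD = 4    # occupancy at which upstream may resume (assumed value)

def pfc_state(occupancy, paused):
    """Return the new pause state for one priority queue.

    Hysteresis sketch: pause above XOFF, resume below XON,
    otherwise keep the current state to avoid flapping.
    """
    if occupancy >= XOFF_THRESHOLD:
        return True
    if occupancy <= XON_THRESHOLD:
        return False
    return paused
```

The gap between the two thresholds keeps the switch from oscillating between pause and resume on every packet.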
Data Center Quantized Congestion Notification (DCQCN) combines ECN and PFC: ECN marking throttles senders before queues fill, so PFC fires only as a last resort. This prevents buffer overflow while avoiding the blocking that frequent PFC pauses would otherwise cause, maintaining high efficiency.
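The sender-side reaction in DCQCN can be sketched as a multiplicative rate cut driven by a smoothed congestion estimate. This simplified model (the gain `g` and the exact update schedule are assumptions; it omits the rate-recovery phases) shows the shape of one decrease step when congestion feedback arrives:

```python
def dcqcn_decrease(rate, alpha, g=1 / 16):
    """One simplified DCQCN rate-decrease step at the sender.

    alpha is a running estimate of how congested the path is
    (raised when feedback arrives); the current rate is then cut
    in proportion to alpha / 2 (multiplicative decrease).
    """
    alpha = (1 - g) * alpha + g       # congestion seen: raise the estimate
    rate = rate * (1 - alpha / 2)     # cut rate more when congestion persists
    return rate, alpha
```

A sender under sustained congestion (alpha near 1) halves its rate; a sender seeing its first mark (alpha near 0) backs off only slightly.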
2.2 RoCEv2 Characteristics
RoCE uses RDMA to move data directly between host memories, bypassing the remote CPU; this reduces latency and frees CPU cycles for computation. Because it runs over standard Ethernet, it can reuse much of the existing switching infrastructure, offering a cost-effective performance upgrade for AI compute centers.
3 Technical Differences Between InfiniBand and RoCEv2
InfiniBand excels in high-performance routing, fast fault recovery, and scalability, making it well suited to large-scale AI workloads that demand the highest throughput. RoCEv2, with its lower cost and broad Ethernet compatibility, offers a flexible and economical alternative. Designers should weigh these trade-offs against the requirements of their specific AI compute scenario.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.