
Understanding InfiniBand: Architecture, Protocols, and Performance

InfiniBand is a high‑performance network protocol that uses credit‑based flow control and switched fabric architecture to provide low latency, high bandwidth, and reliable data transfer, offering advantages over TCP/IP such as reduced packet loss, efficient RDMA, and support for various upper‑layer protocols.

Architects' Tech Alliance

Networks built on TCP/IP can drop packets under congestion and then pay for retransmission, which degrades performance. InfiniBand (IB) instead uses credit-based, link-level flow control, which prevents packets from being dropped because a receive buffer is full.

An IB sender transmits only when the receiver has advertised enough free buffer space (credits). As the receiver drains its buffer, it returns credits to the sender. Because a buffer can never overflow, congestion-induced retransmissions are avoided and overall efficiency improves.
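The credit exchange described above can be sketched as a toy simulation. This is only an illustration of the idea, not the IB link protocol: the class names, the one-credit-per-slot granularity, and the drain schedule are all assumptions made for the example.

```python
from collections import deque


class Receiver:
    """Toy receiver with a fixed number of buffer slots (hypothetical model)."""

    def __init__(self, slots):
        self.free_slots = slots
        self.buffer = deque()

    def initial_credits(self):
        # Advertise one credit per free buffer slot.
        return self.free_slots

    def accept(self, packet):
        # A sender that respects its credits can never overflow the buffer.
        assert self.free_slots > 0, "sender transmitted without a credit"
        self.free_slots -= 1
        self.buffer.append(packet)

    def drain(self, count):
        # Consume up to `count` packets and report how many slots were freed.
        freed = min(count, len(self.buffer))
        for _ in range(freed):
            self.buffer.popleft()
        self.free_slots += freed
        return freed


class Sender:
    """Toy sender that transmits only while it holds credits."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.credits = receiver.initial_credits()
        self.sent = 0
        self.stalled = 0

    def try_send(self, packet):
        if self.credits == 0:
            self.stalled += 1  # wait for credits instead of dropping
            return False
        self.credits -= 1
        self.receiver.accept(packet)
        self.sent += 1
        return True

    def add_credits(self, count):
        self.credits += count


def simulate(packets=10, slots=4, drain_every=3, drain_count=2):
    """Send `packets` packets; return (packets sent, ticks spent stalled)."""
    rx = Receiver(slots)
    tx = Sender(rx)
    pending = deque(range(packets))
    tick = 0
    while pending:
        tick += 1
        if tx.try_send(pending[0]):
            pending.popleft()
        if tick % drain_every == 0:
            # The receiver frees buffer space and returns credits.
            tx.add_credits(rx.drain(drain_count))
    return tx.sent, tx.stalled
```

In this model a slow receiver makes the sender wait, but no packet is ever discarded, so nothing has to be retransmitted.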

What Is an InfiniBand Network

InfiniBand is a network communication protocol built on a switched architecture, forming point‑to‑point bidirectional serial links between processor nodes and I/O nodes (e.g., disks or storage). Each link terminates in a device that controls transmission and reception at both ends.

InfiniBand creates private, protected channels between nodes via switches, enabling remote direct memory access (RDMA): data moves directly between the memory of two nodes without the CPU copying it, because the InfiniBand adapter handles the payload transfer.

The adapter connects to the CPU via PCI Express on one side and to the InfiniBand subnet on the other, offering higher bandwidth, lower latency, and better scalability compared to traditional protocols.
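To make the RDMA idea more concrete, here is a conceptual sketch in which an adapter object writes straight into a registered buffer and rejects requests that do not carry the right key. It is not the verbs API: `ToyAdapter`, `register`, and `remote_write` are hypothetical names, and the random key only stands in for the remote key that real adapters associate with a registered memory region.

```python
import secrets


class ToyAdapter:
    """Conceptual stand-in for a channel adapter (hypothetical, not a real API)."""

    def __init__(self):
        self._regions = {}

    def register(self, buf):
        # Registering memory yields a key that a remote peer must present.
        rkey = secrets.randbits(32)
        while rkey in self._regions:
            rkey = secrets.randbits(32)
        self._regions[rkey] = buf
        return rkey

    def remote_write(self, rkey, offset, data):
        # The adapter validates the key and bounds, then places the data
        # directly in the registered buffer; no application code on the
        # target side runs to receive it.
        buf = self._regions.get(rkey)
        if buf is None:
            raise PermissionError("unknown remote key")
        if offset < 0 or offset + len(data) > len(buf):
            raise ValueError("write outside registered region")
        buf[offset:offset + len(data)] = data


target_memory = bytearray(16)
adapter = ToyAdapter()
key = adapter.register(target_memory)
adapter.remote_write(key, 4, b"RDMA")
```

The key check is the toy equivalent of the "protected channel": a peer without the key cannot touch the buffer.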

What Is the InfiniBand Architecture

The InfiniBand Architecture (IBA) is hardware‑oriented, whereas TCP is software‑oriented; consequently, IB is a lighter transport service that does not require packet reordering because the link layer already delivers ordered packets.

Because IB uses credit‑based flow control, the transport layer does not need TCP‑like window algorithms to determine the optimal number of in‑flight packets, allowing delivery of 56 Gb/s or 100 Gb/s data rates with minimal latency and negligible CPU usage.

IB is a channel-based interconnect that uses bidirectional serial links in a switched-fabric topology; switches (IBA switches) and repeaters extend the fabric when needed.

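Within a subnet, ports are addressed by a 16-bit local identifier (LID), which gives 65,536 possible values; 49,151 of them (0x0001 to 0xBFFF) are available as unicast addresses, and the rest are reserved or used for multicast. To span multiple subnets, routers or gateways are required.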
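The subnet size limit follows from the 16-bit local identifier (LID) used for addressing inside a subnet. The short calculation below shows how the address space divides up; the range boundaries are taken from the InfiniBand specification rather than from this article, so treat them as stated assumptions.

```python
LID_BITS = 16
TOTAL_LIDS = 2 ** LID_BITS  # 65,536 possible values

# Assumed ranges (from the IBA specification, not from this article):
UNICAST_FIRST, UNICAST_LAST = 0x0001, 0xBFFF
MULTICAST_FIRST, MULTICAST_LAST = 0xC000, 0xFFFE
SPECIAL_LIDS = 2  # 0x0000 is reserved, 0xFFFF is the permissive LID

unicast_lids = UNICAST_LAST - UNICAST_FIRST + 1
multicast_lids = MULTICAST_LAST - MULTICAST_FIRST + 1

print(f"total LID values: {TOTAL_LIDS}")
print(f"unicast LIDs:     {unicast_lids}")
print(f"multicast LIDs:   {multicast_lids}")
```

Under these assumptions, the number of individually addressable ports in one subnet is somewhat below the raw 65,536 figure.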

Nodes connect to the subnet via channel adapters: a Host Channel Adapter (HCA) on the processor/memory side and a Target Channel Adapter (TCA) on the storage or I/O side. The connection between an adapter and a switch, or between two switches, is called a link.

InfiniBand Data Rate Evolution

InfiniBand serial links operate at various signaling rates, and multiple lanes can be aggregated into one link (e.g., 4X) to achieve higher throughput. The signaling rate and the encoding scheme together determine the effective data rate, because encoding adds overhead: with 8b/10b encoding, 10 bits are transmitted for every 8 bits of data, so only 80% of the signaling rate carries payload.

Typical implementations bundle four lanes (4X), so a 4X link carries four times the per-lane rate; the usable throughput then depends on the generation's signaling rate and line encoding.
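As a worked example of the relationship between signaling rate, encoding, and lane count, the sketch below computes effective 4X data rates. The per-lane signaling rates and the use of 64b/66b encoding for FDR and EDR are taken from published InfiniBand specifications rather than from this article, so they are assumptions here; newer generations are left out.

```python
from fractions import Fraction

# generation -> (per-lane signaling rate in Gb/s, encoding efficiency)
# Values assumed from published InfiniBand specifications.
GENERATIONS = {
    "SDR": (Fraction("2.5"), Fraction(8, 10)),        # 8b/10b
    "DDR": (Fraction("5"), Fraction(8, 10)),          # 8b/10b
    "QDR": (Fraction("10"), Fraction(8, 10)),         # 8b/10b
    "FDR": (Fraction("14.0625"), Fraction(64, 66)),   # 64b/66b
    "EDR": (Fraction("25.78125"), Fraction(64, 66)),  # 64b/66b
}


def effective_gbps(generation, lanes=4):
    """Effective data rate = lanes x signaling rate x encoding efficiency."""
    signaling, efficiency = GENERATIONS[generation]
    return float(lanes * signaling * efficiency)


for name in GENERATIONS:
    print(f"{name} 4X: {effective_gbps(name):.2f} Gb/s")
```

With these inputs, a 4X SDR link signals at 10 Gb/s but carries 8 Gb/s of data, while the lighter 64b/66b encoding lets 4X EDR deliver 100 Gb/s from a 103.125 Gb/s signaling rate.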

InfiniBand Upper‑Layer Protocols

InfiniBand supports several upper‑layer protocols and defines messages for management functions, including SDP, SRP, iSER, RDS, IPoIB, and uDAPL.

SDP (Sockets Direct Protocol): Defined by the InfiniBand Trade Association, it lets existing TCP socket applications run over high-speed InfiniBand.

SRP (SCSI RDMA Protocol): Packages SCSI commands for RDMA transmission over InfiniBand, enabling storage sharing and RDMA communication.

iSER (iSCSI Extensions for RDMA): Transports iSCSI commands and data over RDMA; standardised by the IETF and used in IB SAN environments.

RDS (Reliable Datagram Sockets): A connectionless, datagram-oriented socket interface similar to UDP, but with reliable delivery; developed by Oracle for socket-based data transfer over InfiniBand.

IPoIB (IP over InfiniBand): Carries IP packets over an InfiniBand link, so unmodified TCP/IP applications can use InfiniBand bandwidth.

uDAPL (User Direct Access Programming Library): A standard API that leverages RDMA to improve data-center application performance, scalability, and reliability.
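The point about IPoIB is that ordinary socket code needs no changes. The sketch below is a plain TCP echo round trip using only the Python standard library; it runs over loopback so it works anywhere. On a host with IPoIB configured, the assumption is that passing the IP address assigned to the IPoIB interface as `host` would send the same traffic over the fabric, with no InfiniBand-specific calls in the application.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and echo back the first chunk received."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def round_trip(host="127.0.0.1", payload=b"hello over IP"):
    """Send `payload` to a local echo server and return the reply."""
    # Port 0 asks the OS for any free port.
    with socket.create_server((host, 0)) as server:
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        with socket.create_connection((host, port)) as client:
            client.sendall(payload)
            reply = client.recv(1024)
        worker.join()
    return reply
```

Nothing in this code knows which link layer sits underneath, which is exactly the compatibility IPoIB is meant to provide.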

Among these, the storage-oriented protocols use RDMA to share storage across the fabric: iSER and SRP carry SCSI commands for block access, while NFSoRDMA (NFS over RDMA) does the same for file access.

InfiniBand Management Software

OpenSM is an InfiniBand subnet manager that runs on the Mellanox OFED stack, handling in‑band management of the fabric, including device discovery, monitoring, and health analysis.

OpenSM comprises a subnet manager, a baseboard manager, and a performance manager, offering management capabilities such as automatic device discovery, fabric visualization, intelligent analysis, and health monitoring.
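Device discovery by a subnet manager can be pictured as a sweep of the fabric that assigns each discovered node an address. The following is a toy breadth-first sketch under that assumption; it is not how OpenSM is implemented, and the function name, the adjacency-dictionary input, and the sequential LID numbering are all hypothetical.

```python
from collections import deque


def discover_and_assign(fabric, start, first_lid=1):
    """Walk the fabric breadth-first from `start`, assigning sequential LIDs.

    `fabric` maps each node name to the names of its directly linked neighbors.
    """
    lids = {}
    seen = {start}
    queue = deque([start])
    next_lid = first_lid
    while queue:
        node = queue.popleft()
        lids[node] = next_lid
        next_lid += 1
        for neighbor in fabric.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return lids


# Hypothetical five-node fabric: two hosts, one storage target, two switches.
fabric = {
    "hca0": ["sw0"],
    "hca1": ["sw0"],
    "sw0": ["hca0", "hca1", "sw1"],
    "sw1": ["sw0", "tca0"],
    "tca0": ["sw1"],
}
lids = discover_and_assign(fabric, "hca0")
```

A node that is not reachable from the starting point never receives a LID in this model, which mirrors why a fabric needs a running subnet manager before its ports can communicate.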

For more detailed InfiniBand technical information, refer to the e‑book “InfiniBand Architecture and Technical Practice Summary”.


