
Understanding OpenFabrics Enterprise Distribution (OFED) and the InfiniBand Software Architecture

This article explains the OpenFabrics Enterprise Distribution (OFED) ecosystem, its history, the InfiniBand hardware and software stack, key protocols such as IPoIB, SDP and iSER, and how these technologies enable high‑performance, low‑latency networking across Linux, Windows and virtualized environments.

Architects' Tech Alliance

OpenFabrics Enterprise Distribution (OFED) is a collection of open‑source drivers, core kernel code, middleware and user‑level interfaces that support InfiniBand fabrics.

OFED was first released in 2005 by the OpenFabrics Alliance (OFA). Mellanox's distribution of it, Mellanox OFED, provides Linux and Windows (WinOF) drivers together with diagnostic and performance tools for monitoring bandwidth and congestion within InfiniBand networks.

The OpenFabrics Alliance, founded in June 2004 (originally as the OpenIB Alliance), develops, tests and supports the OFED stack, aiming to deliver high‑efficiency messaging, low latency and maximum bandwidth with minimal CPU overhead.

The alliance expanded its charter in 2006 to include iWARP support, added RoCE (RDMA over Converged Ethernet) in 2010, and later broadened support to other high‑performance networks through the OpenFabrics Interfaces working group.

Mellanox OFED is a unified software stack that includes drivers, middleware, user interfaces and standard protocols such as IPoIB, SDP, SRP, iSER, RDS, and DAPL, supporting MPI, Lustre/NFS over RDMA and exposing the Verbs programming interface.

If the logical diagram of the software stack looks complex, refer to the simplified illustration shown above. MLNX_OFED_LINUX (Mellanox OFED for Linux) is distributed as an ISO image containing source code, binary RPMs, firmware, utilities, installation scripts and documentation.

InfiniBand serial links can operate at various signaling rates, and multiple lanes can be aggregated (e.g., 4X) to achieve higher throughput. The raw signaling rate is coupled with an encoding scheme, which adds overhead (e.g., 8 bits of data transmitted as 10 line bits) to maintain signal integrity and keep error rates low.

Typical implementations aggregate four link lanes, and current InfiniBand systems offer a range of throughput rates (see the accompanying table).
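The relationship between signaling rate, encoding overhead and lane aggregation can be sketched numerically. The per-lane rates and encodings below follow the published InfiniBand roadmap (8b/10b for SDR/DDR/QDR, 64b/66b for FDR/EDR); the function itself is illustrative, not part of any OFED API.

```python
# Sketch: effective InfiniBand throughput from per-lane signaling rate,
# encoding efficiency, and lane count. 8b/10b encoding carries 8 data
# bits in 10 line bits (80% efficient); 64b/66b carries 64 in 66.
RATES = {
    # name: (signaling Gbit/s per lane, data bits, encoded line bits)
    "SDR": (2.5, 8, 10),
    "DDR": (5.0, 8, 10),
    "QDR": (10.0, 8, 10),
    "FDR": (14.0625, 64, 66),
    "EDR": (25.78125, 64, 66),
}

def effective_gbps(name: str, lanes: int = 4) -> float:
    """Usable data rate for an aggregated link (default: a 4X link)."""
    signal, data_bits, line_bits = RATES[name]
    return signal * data_bits / line_bits * lanes

for name in RATES:
    print(f"4X {name}: {effective_gbps(name):g} Gbit/s of data")
```

For example, a 4X SDR link signals at 10 Gbit/s but carries 8 Gbit/s of data after 8b/10b overhead, while 4X EDR's 64b/66b encoding yields an even 100 Gbit/s.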

From the perspective of an application architect or developer, the following sections analyse and interpret the InfiniBand architecture and service capabilities.

InfiniBand Software Architecture

InfiniBand’s software stack is designed to simplify application deployment. IP and TCP socket applications can leverage InfiniBand performance without any changes to existing Ethernet‑based code, and the same applies to SCSI, iSCSI and file‑system applications.

The stack sits above the low‑level HCA driver and the device‑independent API (verbs). Upper‑layer protocols expose industry‑standard interfaces, allowing seamless deployment of existing applications.

The Linux InfiniBand architecture consists of a set of kernel modules and protocols, plus user‑mode shared libraries (not shown). The underlying interconnect technology remains transparent to user‑space applications.

The kernel code is logically divided into three layers: the HCA driver, the core InfiniBand module, and the upper‑layer protocols.

Middle‑layer main functions

Communication Manager (CM) – provides services required for establishing connections.

SA client – enables communication with the Subnet Administrator (SA), which supplies essential connection information such as path records.

SMA – Subnet Management Agent that processes management packets for configuring devices.

PMA – Performance Management Agent that retrieves hardware performance counters.

MAD service – Management Datagram service offering interfaces to special InfiniBand Queue Pairs (QP0 and QP1).

GSI – General Services Interface allowing management packets on QP1.

QP redirection – redirects high‑level management protocols from the special QPs to dedicated QPs for bandwidth‑intensive operations.

SMI – Subnet Management Interface for sending/receiving packets on QP0.

Verbs – exposes the verb API provided by the HCA driver; the InfiniBand specification defines the required verb semantics, which the middle layer maps to Linux kernel APIs.

The middle layer also tracks resource allocation, reference counting and cleanup after abnormal termination or client shutdown.
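One concrete piece of the verbs semantics the middle layer enforces is the queue-pair state machine: before a QP can carry traffic it must be stepped from RESET through INIT and RTR (ready to receive) to RTS (ready to send). The toy model below illustrates the legal transitions only; it is not a real RDMA API, and actual transitions (via `ibv_modify_qp` in real code) also carry attributes such as path MTU, destination QP number and timeouts.

```python
# Toy model of the InfiniBand queue-pair state machine that the verbs
# layer drives. Illustrative only -- real transitions require attribute
# masks and can also enter an ERR state from anywhere.
LEGAL = {
    "RESET": {"INIT"},
    "INIT": {"RTR", "RESET"},   # RTR: ready to receive
    "RTR": {"RTS", "RESET"},    # RTS: ready to send
    "RTS": {"SQD", "RESET"},    # SQD: send queue drained
    "SQD": {"RTS", "RESET"},
}

class QueuePair:
    def __init__(self):
        self.state = "RESET"

    def modify(self, new_state: str) -> None:
        """Apply one state transition, rejecting illegal ones."""
        if new_state not in LEGAL.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

qp = QueuePair()
for s in ("INIT", "RTR", "RTS"):  # typical connection-establishment sequence
    qp.modify(s)
print(qp.state)  # the QP can now post sends
```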

The lowest layer consists of HCA drivers; each HCA requires a vendor‑specific driver that registers with the middle layer and provides InfiniBand verbs.
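The registration pattern can be sketched as follows. In the real kernel a vendor driver fills in a device structure with its verb callbacks and calls `ib_register_device()`; here a plain dictionary of callables stands in, and the device name `mlx_toy0` is invented for illustration.

```python
# Toy sketch of how a vendor HCA driver plugs into the core InfiniBand
# layer: the driver supplies implementations for the required verbs,
# and the core records them so upper layers can dispatch through them.
registry = {}

def register_hca(name, verbs):
    """Core-layer registration: reject drivers missing required verbs."""
    required = {"create_qp", "post_send", "post_recv"}
    missing = required - verbs.keys()
    if missing:
        raise ValueError(f"driver {name} missing verbs: {sorted(missing)}")
    registry[name] = verbs

# A hypothetical vendor driver registering its verb implementations.
register_hca("mlx_toy0", {
    "create_qp": lambda: "qp handle",
    "post_send": lambda wr: f"sent {wr}",
    "post_recv": lambda wr: f"posted {wr}",
})
print(sorted(registry))
```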

Higher‑level protocols such as IPoIB, SRP, SDP, iSER, etc., run over standard data‑network, storage and file‑system stacks. Apart from IPoIB’s simple encapsulation of TCP/IP over InfiniBand, these protocols transparently deliver higher bandwidth, lower latency, reduced CPU utilization and end‑to‑end services using RDMA.

IB Support for IP‑Based Applications

The simplest way to evaluate any IP‑based application on InfiniBand is to use IP over IB (IPoIB). Running IPoIB on a high‑bandwidth InfiniBand adapter can immediately improve performance for any IP application, as it tunnels IP packets over InfiniBand.

In Linux, the protocol is implemented as a standard network driver, allowing any application or kernel driver that uses the Linux networking stack to operate over InfiniBand without modification. Linux kernel 2.6.11 and later support IPoIB, along with the core InfiniBand layer and Mellanox HCA drivers.
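Because IPoIB registers as a standard Linux network device (typically named `ib0`, an assumption of the default naming), ordinary socket code runs over InfiniBand unchanged; the only difference is which local IP address you bind. The sketch below uses loopback so it runs anywhere, but on an IPoIB fabric you would bind the address assigned to `ib0` instead.

```python
# Unmodified TCP echo over the standard socket API: nothing here is
# InfiniBand-specific, which is exactly IPoIB's point.
import socket
import threading

HOST, PORT = "127.0.0.1", 0  # port 0: let the kernel pick a free port

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((HOST, PORT))    # on an IPoIB fabric: bind the ib0 address
server.listen(1)
port = server.getsockname()[1]

def echo_once():
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))

t = threading.Thread(target=echo_once)
t.start()

with socket.create_connection((HOST, port)) as client:
    client.sendall(b"ping over IPoIB-or-loopback")
    reply = client.recv(1024)
t.join()
server.close()
print(reply.decode())
```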

This approach is effective for management, configuration or control‑plane traffic where bandwidth and latency are not critical. For full performance and to exploit advanced InfiniBand features, developers can also use the Socket Direct Protocol (SDP) and related socket APIs.

InfiniBand also supports socket, SCSI, iSCSI and NFS applications. For example, the iSER protocol inserts an iSCSI‑over‑RDMA transport layer into Linux, operating over the Connection Manager Abstraction (CMA) to provide transparent RDMA access over both InfiniBand and iWARP.

Thus, user‑space applications using the libc interface and kernel‑space applications using Linux file‑system interfaces can operate transparently without awareness of the underlying interconnect.

For detailed technical information, refer to the e‑book “InfiniBand Architecture and Technology Practical Summary”.

Today, InfiniBand software and protocols are supported on major Linux and Windows distributions as well as hypervisor platforms, including Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Microsoft Windows Server, Windows CCS, and VMware virtual infrastructure.


Tags: High Performance Computing, Linux, Networking, RDMA, InfiniBand, OFED
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
