What Is InfiniBand RDMA and How to Configure It on RHEL 8?

This guide explains the fundamentals of InfiniBand and RDMA, details the InfiniBand Verbs API, outlines the steps required for kernel data handling, and provides practical configuration instructions for RoCE, IPoIB, and the subnet manager on Red Hat Enterprise Linux 8.

Architects' Tech Alliance

Overview

InfiniBand is a high‑performance network technology that enables Remote Direct Memory Access (RDMA): one host can read or write another host’s memory without involving the remote host’s operating system or CPU, which reduces latency and CPU usage.

Key Components

InfiniBand physical link protocol – defines the low‑level wire protocol.

InfiniBand Verbs API – the programming interface that implements RDMA operations.

How RDMA Works

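In a conventional IP transfer, when an application on one host sends data to an application on another host, the receiving side goes through these steps:

1. The kernel receives the incoming data.

2. The kernel determines which application the data belongs to.

3. The kernel wakes that application.

4. The kernel waits for the application to make a system call to fetch the data.

5. The data is copied from the kernel’s internal buffers into the user‑space buffer provided by the application.

As a result, most traffic is copied across system memory once if the host adapter uses DMA, and at least twice otherwise. The host also performs context switches between the kernel and the application, which raises CPU load at high traffic rates and slows other tasks.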
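
The conventional receive path described above can be observed with ordinary sockets. The following is a minimal sketch, not an RDMA example: it uses a local socket pair from Python's standard library, and the helper name `receive_via_kernel` is made up for illustration. Each `recv_into` call is a system call in which the kernel copies data from its own buffers into the buffer the application supplies, which is exactly the copy that RDMA avoids.

```python
import socket


def receive_via_kernel(payload: bytes) -> bytes:
    """Send payload over a local socket pair and read it back.

    Keep the payload small: the sender writes everything before the
    receiver starts reading, so it must fit in the socket buffer.
    """
    sender, receiver = socket.socketpair()
    try:
        # The kernel takes a copy of the payload into its socket buffer.
        sender.sendall(payload)
        sender.shutdown(socket.SHUT_WR)

        # The application provides its own buffer ...
        user_buffer = bytearray(len(payload))
        view = memoryview(user_buffer)
        received = 0
        while received < len(payload):
            # ... and each recv_into is a system call that copies data
            # from kernel memory into that user-space buffer.
            count = receiver.recv_into(view[received:])
            if count == 0:
                break
            received += count
        return bytes(user_buffer[:received])
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(receive_via_kernel(b"hello over the kernel path"))
```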

RDMA Communication Model

RDMA bypasses the kernel for data transfer, placing packets directly into the application’s memory. For InfiniBand, the host adapter does not forward packets to the kernel; instead, it writes them directly into the user buffer.

RHEL 8 Support

Red Hat Enterprise Linux 8 supports InfiniBand hardware and the InfiniBand Verbs API, as well as the following technologies for non‑InfiniBand hardware:

iWARP (RDMA over TCP/IP)

RoCE (RDMA over Converged Ethernet, also called InfiniBand over Ethernet)

RoCE Versions

RoCE v1 uses Ethernet ethertype 0x8915 and allows communication between two hosts in the same broadcast domain.

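RoCE v2 runs on top of UDP over IPv4 or IPv6 and uses UDP destination port 4791, so its packets can be routed across subnets. Mellanox ConnectX‑3 Pro, ConnectX‑4 Lx, and ConnectX‑5 adapters support it. Both ends of a connection should use the same RoCE version: pairing a RoCE v2 client with a RoCE v1 server is not supported, so if one side can only use RoCE v1, configure both sides for RoCE v1.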
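
The two identifiers mentioned above (ethertype 0x8915 for RoCE v1, UDP port 4791 for RoCE v2) are enough to tell the versions apart on the wire. The sketch below is illustrative only: `classify_roce` is a hypothetical helper, it assumes untagged Ethernet frames (no VLAN header), and it only looks inside IPv4, not IPv6.

```python
import struct

ROCE_V1_ETHERTYPE = 0x8915   # RoCE v1 is identified by its ethertype
ROCE_V2_UDP_PORT = 4791      # RoCE v2 is identified by its UDP destination port
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
ETH_HEADER_LEN = 14


def classify_roce(frame: bytes) -> str:
    """Return 'RoCE v1', 'RoCE v2', or 'other' for an untagged Ethernet frame."""
    if len(frame) < ETH_HEADER_LEN:
        return "other"
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ROCE_V1_ETHERTYPE:
        return "RoCE v1"
    if ethertype == ETHERTYPE_IPV4 and len(frame) >= ETH_HEADER_LEN + 20:
        # IPv4 header length is the low nibble of the first byte, in 32-bit words.
        ip_header_len = (frame[ETH_HEADER_LEN] & 0x0F) * 4
        protocol = frame[ETH_HEADER_LEN + 9]
        udp_start = ETH_HEADER_LEN + ip_header_len
        if protocol == IPPROTO_UDP and len(frame) >= udp_start + 4:
            # UDP destination port is the second 16-bit field of the UDP header.
            (dst_port,) = struct.unpack("!H", frame[udp_start + 2:udp_start + 4])
            if dst_port == ROCE_V2_UDP_PORT:
                return "RoCE v2"
    return "other"


if __name__ == "__main__":
    v1_frame = bytes(12) + struct.pack("!H", ROCE_V1_ETHERTYPE) + bytes(20)
    print(classify_roce(v1_frame))
```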

RDMA Connection Manager (RDMA_CM)

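RDMA_CM sets up a reliable connection between a client and a server for data transfer. It offers a transport‑neutral interface for establishing connections; the communication itself runs over a specific RDMA device and uses message‑based transfers.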

IP over InfiniBand (IPoIB)

IPoIB creates an IP network layer on top of InfiniBand. It can operate in two modes:

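Datagram mode – unreliable and connection‑less. The IPoIB MTU is 4 bytes smaller than the InfiniBand link‑layer MTU, so a common 2048‑byte link MTU yields an IPoIB MTU of 2044 bytes.

Connected mode – reliable and connection‑oriented. Messages may exceed the link‑layer MTU because the host adapter segments and reassembles them, but IP and TCP header limits cap the IPoIB MTU at 65520 bytes.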

When the system is configured for Connected mode, multicast traffic is still sent in Datagram mode because InfiniBand switches cannot forward multicast in Connected mode.
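
The MTU figures for the two modes can be written down as a small calculation. This is a sketch under stated assumptions: the function name `ipoib_mtu` is invented for illustration, and it assumes the IPoIB header takes 4 bytes of the link‑layer MTU, which is how 2044 follows from a 2048‑byte link MTU.

```python
IPOIB_HEADER_BYTES = 4       # assumed IPoIB encapsulation overhead
CONNECTED_MODE_MTU = 65520   # ceiling quoted for Connected mode


def ipoib_mtu(mode: str, link_mtu: int = 2048) -> int:
    """Return the largest IPoIB MTU for the given mode.

    link_mtu is the InfiniBand link-layer MTU and only matters in
    Datagram mode.
    """
    normalized = mode.strip().lower()
    if normalized == "datagram":
        return link_mtu - IPOIB_HEADER_BYTES
    if normalized == "connected":
        return CONNECTED_MODE_MTU
    raise ValueError(f"unknown IPoIB mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("datagram", "connected"):
        print(mode, ipoib_mtu(mode))
```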

Kernel Memory Considerations

RDMA requires pinned physical memory that the kernel cannot swap out. If a user pins too much memory, the system can run out of RAM and the kernel terminates processes to free memory, which is why pinning is a privileged operation. To run large RDMA applications as a non‑root user, raise that user’s locked‑memory (memlock) limit.
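
The amount of memory a process may pin is governed by its locked‑memory resource limit. The sketch below reads that limit with Python's standard `resource` module and formats the kind of entries one might place in a file under /etc/security/limits.d/ to raise it. The helper names and the `rdma` group in the usage example are assumptions for illustration, not names taken from this article.

```python
import resource


def memlock_limit():
    """Return the (soft, hard) locked-memory limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def describe(limit: int) -> str:
    """Render a limit value the way a person would read it."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} KiB"


def limits_conf_lines(group: str, value: str = "unlimited"):
    """Build limits.d entries raising memlock for members of a group.

    The group name is supplied by the caller; nothing is written to disk.
    """
    return [
        f"@{group} soft memlock {value}",
        f"@{group} hard memlock {value}",
    ]


if __name__ == "__main__":
    soft, hard = memlock_limit()
    print("soft:", describe(soft), "hard:", describe(hard))
    print("\n".join(limits_conf_lines("rdma")))  # "rdma" is a hypothetical group
```

Limits set this way normally apply to new login sessions, so a user would need to log in again before a changed limit takes effect.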

Subnet Manager Configuration

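Every InfiniBand fabric needs a running subnet manager (SM) to function. A fabric may have more than one: one acts as the master, and a standby takes over if the master fails. Most InfiniBand switches include an embedded SM; when you need a more up‑to‑date SM or finer control, use the OpenSM subnet manager that Red Hat provides.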
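
How one subnet manager ends up in charge when several are present can be modelled in a few lines. This is a simplified sketch, not OpenSM's implementation: it assumes each SM has a priority from 0 to 15 where the higher value wins, and that ties go to the lowest GUID. The class, function, and sample names are hypothetical.

```python
from typing import NamedTuple


class SubnetManager(NamedTuple):
    name: str
    priority: int  # assumed range 0..15, higher wins
    guid: int      # port GUID, lower wins on a priority tie


def elect_master(managers):
    """Pick the SM that should act as master among the running ones."""
    if not managers:
        raise ValueError("an InfiniBand fabric needs at least one subnet manager")
    return max(managers, key=lambda sm: (sm.priority, -sm.guid))


if __name__ == "__main__":
    running = [
        SubnetManager("switch-embedded", priority=0, guid=0x0002C90300317800),
        SubnetManager("opensm-host-a", priority=15, guid=0x0002C903003178F2),
        SubnetManager("opensm-host-b", priority=15, guid=0x0002C903003178F1),
    ]
    master = elect_master(running)
    print(master.name)
    # If the master fails, the same rule applied to the survivors picks the successor.
    survivors = [sm for sm in running if sm != master]
    print(elect_master(survivors).name)
```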

IPoIB Device Naming

By default, the kernel names IPoIB devices ib0, ib1, and so on. To avoid naming conflicts, create udev rules that assign persistent, meaningful names (e.g., mlx4_ib0).
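
A persistent name is typically tied to the port's hardware address. The sketch below builds one udev rule line from a 20‑byte IPoIB hardware address, matching only its last 8 bytes. The helper name, the sample address, and the exact rule layout are assumptions modelled on the form Red Hat's documentation shows for a file such as /etc/udev/rules.d/70-persistent-ipoib.rules; verify the result against your own system before using it.

```python
def ipoib_udev_rule(hw_address: str, name: str) -> str:
    """Build a udev rule that renames the IPoIB device with this address.

    hw_address is the 20-byte address shown by `ip link`, as colon-separated
    hex octets. Only the last 8 bytes are matched, behind a wildcard.
    """
    octets = hw_address.strip().lower().split(":")
    if len(octets) != 20:
        raise ValueError("expected 20 colon-separated octets")
    for octet in octets:
        if len(octet) != 2:
            raise ValueError(f"bad octet: {octet!r}")
        int(octet, 16)  # raises ValueError if the octet is not hexadecimal
    tail = ":".join(octets[-8:])
    return (
        'ACTION=="add", SUBSYSTEM=="net", DRIVERS=="?*", ATTR{type}=="32", '
        f'ATTR{{address}}=="?*{tail}", NAME="{name}"'
    )


if __name__ == "__main__":
    # Sample address for illustration only.
    sample = "80:00:02:00:fe:80:00:00:00:00:00:00:00:02:c9:03:00:31:78:f2"
    print(ipoib_udev_rule(sample, "mlx4_ib0"))
```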

Practical Configuration Steps

1. Install the rdma-core package, which provides the rdma service; systemd will start it when InfiniBand, iWARP, or RoCE hardware is detected.

2. Choose the appropriate RoCE version based on your hardware and configure the client and server accordingly.

3. Set the IPoIB MTU according to the chosen mode (Datagram: 2044, Connected: up to 65520).

4. Ensure the locked‑memory limit lets your RDMA applications pin as much memory as they need, especially when they run as a non‑root user.

5. Verify that a subnet manager (e.g., OpenSM) is running and that the InfiniBand devices have unique persistent names.
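
Part of this checklist can be verified from sysfs. The sketch below lists the RDMA devices the kernel has registered under /sys/class/infiniband and reads the IPoIB mode of an interface from /sys/class/net/<interface>/mode. The function names are made up for illustration, and both helpers return an empty result rather than failing when the hardware or interface is absent.

```python
from pathlib import Path


def list_rdma_devices(sysfs_root="/sys/class/infiniband"):
    """Return the names of registered RDMA devices, or [] if there are none."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


def ipoib_mode(interface, net_root="/sys/class/net"):
    """Return 'datagram' or 'connected' for an IPoIB interface, else None."""
    mode_file = Path(net_root) / interface / "mode"
    try:
        return mode_file.read_text().strip()
    except OSError:
        # Interface missing, or not an IPoIB device (no mode attribute).
        return None


if __name__ == "__main__":
    devices = list_rdma_devices()
    print("RDMA devices:", devices or "none found")
    print("ib0 mode:", ipoib_mode("ib0"))
```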
