
Understanding RDMA: How Direct Memory Access Boosts Data Center Performance

This article explains the principles of DMA and RDMA, compares RDMA protocols such as InfiniBand, RoCE, and iWARP, outlines their performance advantages, and reviews the key standards bodies, open‑source communities, hardware vendors, and real‑world adoption in high‑performance data centers.

Architects' Tech Alliance

Direct Memory Access (DMA)

DMA is a hardware mechanism that lets an I/O device transfer data directly to or from main memory without the CPU copying the data itself. Without DMA, the CPU must move outgoing data step by step, for example from a user buffer to a kernel buffer and then to the network card, which consumes CPU cycles. With a DMA controller, the CPU describes the transfer (source, destination, and length), the controller moves the data between memory and the device's buffers over the bus, and the CPU is involved again only when the controller signals completion.
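The division of labor described above can be sketched as a toy simulation. This is only an illustration in plain Python, not real driver code: the names DmaDescriptor and ToyDmaEngine are hypothetical, and the "hardware" is just a class that copies between two byte arrays.

```python
from dataclasses import dataclass


@dataclass
class DmaDescriptor:
    """What the CPU hands to the DMA engine: where to read, where to write, how much."""
    src_offset: int
    dst_offset: int
    length: int


class ToyDmaEngine:
    """Hypothetical stand-in for a DMA controller that copies between two buffers."""

    def __init__(self, memory: bytearray, device_buffer: bytearray):
        self.memory = memory
        self.device_buffer = device_buffer
        self.completed = []  # finished descriptors, playing the role of completion events

    def submit(self, desc: DmaDescriptor) -> None:
        end_src = desc.src_offset + desc.length
        end_dst = desc.dst_offset + desc.length
        if end_src > len(self.memory) or end_dst > len(self.device_buffer):
            raise ValueError("descriptor exceeds buffer bounds")
        # The engine moves the bytes; the submitting code never touches the payload.
        self.device_buffer[desc.dst_offset:end_dst] = memoryview(self.memory)[desc.src_offset:end_src]
        self.completed.append(desc)


main_memory = bytearray(b"payload-from-application")
nic_buffer = bytearray(32)
engine = ToyDmaEngine(main_memory, nic_buffer)

# CPU work, part 1: describe the transfer and hand it to the engine.
engine.submit(DmaDescriptor(src_offset=0, dst_offset=0, length=len(main_memory)))

# CPU work, part 2: react to the completion notification.
print(len(engine.completed), bytes(nic_buffer[:len(main_memory)]))
```

The CPU-side code only builds a descriptor and checks the completion list; everything between those two steps belongs to the engine, which is the point of DMA.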

Remote Direct Memory Access (RDMA)

RDMA extends the DMA concept across network nodes. A local NIC can read from or write to a memory region that has been registered on a remote host, bypassing the TCP/IP stack and most kernel processing. The data path is:

The application registers a memory region with its NIC and obtains a remote key (rkey), which it shares with the peer.

The sending RDMA NIC packages the data, adds minimal protocol headers, and sends it over the network.

The remote NIC strips the headers, checks the key, and uses DMA to place the payload directly into the registered user‑space buffer.

Only the control path (setup, key exchange) involves the CPU; the bulk data movement is performed by the NIC hardware.
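A minimal sketch of that flow, assuming a simulated receiving NIC rather than real hardware, might look like the following. ToyRnic, MemoryRegion, register, and handle_write are hypothetical names invented for this illustration; they are not the libibverbs API.

```python
import secrets
from dataclasses import dataclass


@dataclass
class MemoryRegion:
    buf: bytearray  # the application's registered buffer
    rkey: int       # key a remote peer must present to access it


class ToyRnic:
    """Hypothetical stand-in for the RDMA NIC on the receiving host."""

    def __init__(self):
        self._regions = {}  # rkey -> MemoryRegion

    def register(self, buf: bytearray) -> MemoryRegion:
        # Control path: the application registers memory and gets a key back.
        rkey = secrets.randbits(32)
        while rkey in self._regions:
            rkey = secrets.randbits(32)
        region = MemoryRegion(buf, rkey)
        self._regions[rkey] = region
        return region

    def handle_write(self, rkey: int, offset: int, payload: bytes) -> None:
        # Data path: validate key and bounds, then place the payload directly
        # into the registered buffer. No application code runs on this host.
        region = self._regions.get(rkey)
        if region is None:
            raise PermissionError("unknown rkey")
        if offset < 0 or offset + len(payload) > len(region.buf):
            raise ValueError("write outside the registered region")
        region.buf[offset:offset + len(payload)] = payload


remote_nic = ToyRnic()
app_buffer = bytearray(16)
region = remote_nic.register(app_buffer)       # setup on the remote host
remote_nic.handle_write(region.rkey, 4, b"hello")  # one-sided write from a peer
print(app_buffer)
```

The key exchange itself is left out; in practice the rkey and buffer address travel to the peer over some out-of-band or connection-setup channel before any write is issued.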

Advantages of RDMA

Zero‑copy : Data never moves between user space and kernel space, eliminating extra memory copies.

Kernel bypass : The data path stays in user space, avoiding system calls and context switches.

CPU offload : For one‑sided operations (RDMA read and write), the remote CPU does not participate in the transfer, freeing compute resources.

Higher bandwidth & lower latency : Direct hardware‑to‑hardware transfers achieve higher throughput and lower end‑to‑end latency than traditional TCP/IP sockets, because the per‑message copies, system calls, and protocol processing are removed from the data path.
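The zero‑copy idea can be illustrated with a loose analogy in plain Python. This is not RDMA: it only contrasts a path that copies the payload at each hop with one that hands out a view of the original buffer. The function names staged_path and zero_copy_path are made up for this sketch.

```python
def staged_path(user_buf):
    """Socket-style path: each hop makes its own copy of the payload."""
    kernel_buf = bytearray(user_buf)    # copy 1: user space -> kernel space
    device_buf = bytearray(kernel_buf)  # copy 2: kernel space -> device
    return device_buf


def zero_copy_path(user_buf):
    """RDMA-style path: the consumer works on the application buffer itself."""
    return memoryview(user_buf)


user_buf = bytearray(b"AAAA")
copied = staged_path(user_buf)
view = zero_copy_path(user_buf)

# Change the application buffer after both paths were set up.
user_buf[0] = ord("Z")

print(bytes(copied))  # the staged copy is a separate piece of memory
print(bytes(view))    # the view shares memory with the application buffer
```

The staged copy no longer matches the application buffer, while the view does, which shows that the second path never duplicated the data.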

RDMA Protocol Families

InfiniBand (IB) : A full‑stack, high‑performance protocol defined by the InfiniBand Trade Association (IBTA). Requires dedicated IB hardware and switches.

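RDMA over Converged Ethernet (RoCE) : Uses Ethernet at the link layer. RoCE v1 carries IB transport packets directly in Ethernet frames, so traffic stays within a single Layer 2 domain; RoCE v2 encapsulates them in UDP/IP (IPv4 or IPv6), so traffic can be routed across IP subnets by standard Ethernet switches and routers.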

iWARP : Implements RDMA over TCP. Provides reliable delivery on lossy networks but incurs additional processing and memory overhead from TCP's per‑connection state, flow control, and retransmission handling.
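To make the RoCE v2 encapsulation more concrete, the sketch below packs the headers of a single RDMA write as they would sit inside a UDP datagram. The field layout (a 12‑byte Base Transport Header followed by a 16‑byte RDMA Extended Transport Header) and the constants reflect my reading of the InfiniBand and RoCE specifications and should be checked against them; the function names are hypothetical, the queue pair, address, and key values are arbitrary, and the trailing ICRC checksum of a real packet is deliberately omitted.

```python
import struct

ROCE_V1_ETHERTYPE = 0x8915  # RoCE v1: IB packets directly inside an Ethernet frame
ROCE_V2_UDP_PORT = 4791     # RoCE v2: well-known UDP destination port
RC_RDMA_WRITE_ONLY = 0x0A   # opcode for a single-packet reliable-connection write


def pack_bth(opcode, dest_qp, psn, pkey=0xFFFF):
    """Base Transport Header (12 bytes), present in every IB transport packet."""
    flags = 0  # solicited event, migration, pad count, and version left at zero
    # The top byte of each 32-bit word below is a reserved/flag byte, kept at zero.
    return struct.pack("!BBHII", opcode, flags, pkey, dest_qp & 0xFFFFFF, psn & 0xFFFFFF)


def pack_rdma_write(dest_qp, psn, remote_addr, rkey, data):
    """BTH + RETH + payload, i.e. the IB part that RoCE v2 places inside UDP."""
    reth = struct.pack("!QII", remote_addr, rkey, len(data))  # address, key, length
    return pack_bth(RC_RDMA_WRITE_ONLY, dest_qp, psn) + reth + data


def wrap_in_udp(payload, src_port=49152):
    """Prepend a UDP header; checksum left at 0 for simplicity in this sketch."""
    header = struct.pack("!HHHH", src_port, ROCE_V2_UDP_PORT, 8 + len(payload), 0)
    return header + payload


ib_part = pack_rdma_write(dest_qp=0x12, psn=1, remote_addr=0x1000, rkey=0xABCD, data=b"hello")
datagram = wrap_in_udp(ib_part)
print(len(ib_part), len(datagram))
```

In RoCE v1 the same IB part would follow the Ethernet header directly, identified by the EtherType constant above, which is why v1 traffic cannot cross an IP router while v2 traffic can.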

Standards and Ecosystem Organizations

InfiniBand Trade Association (IBTA) : Defines and maintains the IB and RoCE specifications and runs compliance and interoperability testing.

OpenFabrics Alliance (OFA) : Maintains the open‑source OpenFabrics Enterprise Distribution (OFED) software stack that implements drivers, kernel modules, and user‑space libraries for all three RDMA protocols.

Open‑Source RDMA Stack

The Linux kernel contains an active RDMA subsystem under drivers/infiniband/. Source code is hosted at:

https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/

User‑space libraries and tools include:

libibverbs : Generic verbs API used by applications.

rdma-core : Core user‑space library, headers, and utilities (GitHub: https://github.com/linux-rdma/rdma-core).

perftest : Benchmark suite for measuring RDMA bandwidth and latency (GitHub: https://github.com/linux-rdma/perftest).

UCX : Higher‑level communication framework built on top of RDMA (GitHub: https://github.com/openucx/ucx).
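Before using any of these tools, it helps to confirm that the kernel RDMA subsystem has actually registered a device. One way to check, sketched below, is to read the sysfs tree that the subsystem exposes under /sys/class/infiniband. The helper names are hypothetical, and the sketch assumes the usual per‑port files state, link_layer, and rate are present; missing files are reported as None.

```python
from pathlib import Path


def _read(path):
    """Return the stripped contents of a sysfs file, or None if it is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def list_rdma_devices(sysfs_root="/sys/class/infiniband"):
    """Return {device: {port: {"state": ..., "link_layer": ..., "rate": ...}}}."""
    root = Path(sysfs_root)
    devices = {}
    if not root.is_dir():
        return devices  # no RDMA driver loaded, or not a Linux host
    for dev in sorted(root.iterdir()):
        ports = {}
        ports_dir = dev / "ports"
        if ports_dir.is_dir():
            for port in sorted(ports_dir.iterdir()):
                ports[port.name] = {
                    name: _read(port / name)
                    for name in ("state", "link_layer", "rate")
                }
        devices[dev.name] = ports
    return devices


for device, ports in list_rdma_devices().items():
    print(device, ports)
```

On a host without RDMA hardware the loop prints nothing. The link_layer value distinguishes InfiniBand ports from Ethernet (RoCE) ports on the same kind of adapter.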

Hardware Implementations

RDMA requires NICs that implement the protocol logic and DMA engines. Prominent vendors are:

Mellanox (NVIDIA) : ConnectX series (e.g., 100 Gb/s) and the newer ConnectX‑6 series supporting 200 Gb/s.

Huawei : Kunpeng 920 chips with RoCE‑capable 100 Gb/s adapters.

Adoption in Data Centers

Major cloud and enterprise operators, including Microsoft Azure, IBM Cloud, Alibaba Cloud, and JD.com, have deployed RDMA in high‑performance computing clusters and latency‑critical data‑center workloads in place of traditional TCP/IP socket communication.

References

RDMA mailing list archive: http://vger.kernel.org/vger-lists.html#linux-rdma

Linux kernel RDMA repository: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/

linux‑rdma GitHub organization: https://github.com/linux-rdma/

UCX project: https://github.com/openucx/ucx

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
