
An Introduction to RDMA: Principles, Protocols, Advantages, and Programming Basics

This article provides a comprehensive overview of Remote Direct Memory Access (RDMA), covering its definition, how it differs from traditional networking, core advantages such as zero‑copy and CPU offload, typical use cases, the three main RDMA protocols, deployment requirements, and essential programming concepts and terminology.

Architects' Tech Alliance

Remote Direct Memory Access (RDMA) allows a device to read or write the memory of a remote host without involving the remote CPU, enabling high‑performance data transfer.

1. RDMA Introduction – RDMA eliminates CPU involvement in data movement, allowing direct memory access across machines.

1.1 Traditional network forwarding – In conventional stacks, both sender and receiver rely on the CPU for NIC control, interrupt handling, and packet processing.

When RDMA is used, the CPUs on both ends are almost idle; the NIC performs DMA from user‑space memory to its internal buffers, assembles packets, and sends them over the physical link. The remote NIC strips headers and DMA‑copies the payload directly into user‑space memory.

1.2 Core advantages of RDMA over traditional networking

Zero copy – data moves directly between application buffers on the two hosts, with no intermediate copies through kernel or network-stack buffers.

Kernel bypass – applications issue transfers directly from user space, avoiding system calls and context switches on the data path.

CPU offload – remote memory is accessed without consuming remote CPU cycles.

1.3 Business scenarios requiring RDMA

Low latency – e.g., HPC, financial services, Web 3.0.

High bandwidth – e.g., HPC, medical imaging, storage/backup, cloud computing.

Low CPU utilization – e.g., HPC, cloud workloads.

1.4 Three RDMA protocols

InfiniBand – a network protocol designed from the ground up for RDMA, offering the highest performance but requiring dedicated InfiniBand NICs and switches.

RoCE (RDMA over Converged Ethernet) – maps RDMA onto Ethernet frames, requiring special NICs but allowing use of standard Ethernet switches.

iWARP – implements RDMA over TCP; can run on standard Ethernet hardware but sacrifices some performance.

1.5 How to use RDMA – Requires an RDMA‑capable network adapter (e.g., Mellanox ConnectX). The link layer can be Ethernet or InfiniBand.
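Once such an adapter is installed, an application can check what the driver exposes through the verbs library. The sketch below is a minimal probe, assuming libibverbs is installed (compile with `-libverbs`); it lists the RDMA devices visible to user space:

```c
/* Minimal sketch: enumerate RDMA-capable devices via libibverbs.
 * Assumes libibverbs is installed; compile with: gcc probe.c -libverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num = 0;
    struct ibv_device **list = ibv_get_device_list(&num);
    if (!list || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }
    for (int i = 0; i < num; i++)
        printf("device %d: %s\n", i, ibv_get_device_name(list[i]));
    ibv_free_device_list(list);
    return 0;
}
```

On a machine without RDMA hardware (or a soft-RoCE device), this prints the "no RDMA devices found" message and exits.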

2. RDMA terminology and basic flow

A typical exchange: the receiver posts receive work requests to its receive queue, the sender posts a SEND work request, the NICs transfer the data and exchange hardware-level ACKs (on reliable transports), and both sides learn the outcome through completion‑queue notifications.
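The flow above can be sketched with the verbs API. The fragment below is illustrative only: it assumes a connected queue pair `qp`, a completion queue `cq`, and a registered 4 KiB buffer `buf` with its memory region `mr` already exist, and it elides all error handling.

```c
/* Sketch of the SEND/RECV flow in verbs; assumes qp, cq, buf, and mr
 * were created earlier, and omits error handling. */
struct ibv_sge sge = {
    .addr   = (uintptr_t)buf,
    .length = 4096,
    .lkey   = mr->lkey,        /* local key from memory registration */
};

/* Receiver side: the RR must be posted before the peer's SEND arrives. */
struct ibv_recv_wr rr = { .wr_id = 1, .sg_list = &sge, .num_sge = 1 };
struct ibv_recv_wr *bad_rr;
ibv_post_recv(qp, &rr, &bad_rr);

/* Sender side: post the SR and request a completion entry. */
struct ibv_send_wr sr = { .wr_id = 2, .sg_list = &sge, .num_sge = 1,
                          .opcode = IBV_WR_SEND,
                          .send_flags = IBV_SEND_SIGNALED };
struct ibv_send_wr *bad_sr;
ibv_post_send(qp, &sr, &bad_sr);

/* Either side: busy-poll the CQ until the work request completes. */
struct ibv_wc wc;
while (ibv_poll_cq(cq, 1, &wc) == 0)
    ;
```

Busy-polling gives the lowest latency; an event channel (`ibv_req_notify_cq`) can be used instead when CPU efficiency matters more.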

3. RDMA programming concepts

Supported communication operations: SEND/RECV, WRITE/READ, ATOMIC, and SRQ_RECV (shared receive queue).

Transport modes:

RC – reliable connection (TCP‑like).

UC – unreliable connection (no retransmission).

UD – unreliable datagram (UDP‑like).
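The transport mode is fixed when the queue pair is created. A minimal resource-setup sketch, assuming an already-opened device context `ctx` (from `ibv_open_device`) and omitting error handling:

```c
/* Sketch: create the core verbs resources and pick the transport mode.
 * Assumes an open device context `ctx`; compile against libibverbs. */
struct ibv_pd *pd = ibv_alloc_pd(ctx);                      /* protection domain */
char *buf = calloc(1, 4096);
struct ibv_mr *mr = ibv_reg_mr(pd, buf, 4096,
                               IBV_ACCESS_LOCAL_WRITE);     /* memory registration */
struct ibv_cq *cq = ibv_create_cq(ctx, 16, NULL, NULL, 0);  /* completion queue */

struct ibv_qp_init_attr attr = {
    .send_cq = cq,
    .recv_cq = cq,
    .cap     = { .max_send_wr = 16, .max_recv_wr = 16,
                 .max_send_sge = 1, .max_recv_sge = 1 },
    .qp_type = IBV_QPT_RC,   /* or IBV_QPT_UC / IBV_QPT_UD for the other modes */
};
struct ibv_qp *qp = ibv_create_qp(pd, &attr);
```

After creation the QP must still be transitioned through the INIT, RTR, and RTS states (via `ibv_modify_qp`) before it can send or receive.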

Key concepts include Send Request (SR), Receive Request (RR), Completion Queue (CQ), Memory Registration (MR), Protection Domain (PD), Scatter‑Gather (SG) entries, and polling mechanisms.

The article is based on the “RDMA Technology Research” whitepaper and provides links to additional resources.

Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
