Understanding RDMA: Principles, Advantages, and Implementation Details
This article explains how RDMA (Remote Direct Memory Access) technology, originating from InfiniBand and extended to Ethernet (RoCE) and TCP/IP (iWARP), provides ultra‑low latency, high throughput, and minimal CPU usage for high‑performance computing and big‑data applications by bypassing traditional OS and protocol stack processing.
High‑performance computing, big‑data analytics and bursty I/O applications demand lower latency and CPU usage than traditional TCP/IP stacks can provide.
RDMA (Remote Direct Memory Access) enables direct memory transfers between endpoints over the network, bypassing the OS and protocol stack, thus achieving microsecond‑level latency, high throughput, and minimal CPU overhead.
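Originally part of InfiniBand, RDMA has been extended to Ethernet via RoCE (RDMA over Converged Ethernet) and to TCP/IP via iWARP, with the specifications and software stack developed by bodies such as the RDMA Consortium (RDMAC), the InfiniBand Trade Association (IBTA), and the OpenFabrics Alliance (OFA).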
InfiniBand achieves low latency through cut‑through switching, credit‑based flow control, hardware offload, and small buffers.
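Credit‑based flow control is the piece that lets small buffers coexist with a lossless fabric. The following is a minimal sketch of the idea only, not of InfiniBand's actual link protocol: the class `CreditLink` and its methods are hypothetical names chosen for illustration. The sender holds one credit per free receive buffer and stalls, rather than drops, when credits run out.

```python
from collections import deque


class CreditLink:
    """Toy model of one link direction with credit-based flow control."""

    def __init__(self, receiver_buffers: int):
        self.credits = receiver_buffers   # sender's view of free receive buffers
        self.rx_buffer = deque()          # packets currently held by the receiver

    def send(self, packet) -> bool:
        """Transmit only when a credit is available, so the receiver is never overrun."""
        if self.credits == 0:
            return False                  # sender stalls; nothing is dropped
        self.credits -= 1
        self.rx_buffer.append(packet)
        return True

    def drain(self, n: int = 1) -> int:
        """Receiver consumes up to n packets and returns one credit per freed buffer."""
        freed = 0
        while freed < n and self.rx_buffer:
            self.rx_buffer.popleft()
            freed += 1
        self.credits += freed
        return freed


link = CreditLink(receiver_buffers=2)
results = [link.send(p) for p in ("p0", "p1", "p2")]  # third send finds no credit
link.drain(1)                                         # receiver frees one buffer
retry = link.send("p2")                               # the stalled packet now goes through
```

Because the sender can never exceed the receiver's buffer space, the link needs no drop‑and‑retransmit path, which is one reason buffers can stay small.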
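RoCE provides InfiniBand‑like performance on Ethernet but requires Data Center Bridging (DCB) support to keep the Ethernet fabric lossless, while iWARP runs RDMA over TCP/IP, which requires the NIC to offload TCP and therefore raises hardware cost.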
The RDMA software stack (e.g., OFED) offers the Verbs API for applications written directly against RDMA, plus upper‑layer protocols (ULPs) that allow existing applications to use RDMA without code changes.
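RDMA communication is built on Queue Pairs (QPs), each consisting of a Send Queue and a Receive Queue, together with Completion Queues (CQs). An application posts Work Requests (WRs), which are transformed into Work Queue Elements (WQEs) that the NIC processes asynchronously, and it learns the outcome by reading the Completion Queue.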
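The queueing model can be sketched in a few lines. This is a toy simulation under stated assumptions, not the Verbs API: `WorkRequest`, `Completion`, `QueuePair`, `nic_process_sends`, and `poll_cq` are hypothetical names, and the "NIC" is an ordinary method call. It shows only the asynchronous shape: posting returns immediately, and the result appears later as a completion.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkRequest:
    wr_id: int            # caller-chosen identifier, echoed back in the completion
    opcode: str           # "SEND" or "RECV" in this toy model
    payload: bytes = b""


@dataclass
class Completion:
    wr_id: int
    opcode: str
    status: str


class QueuePair:
    """Toy Queue Pair: a Send Queue and a Receive Queue sharing one Completion Queue."""

    def __init__(self, completion_queue: deque):
        self.send_queue = deque()
        self.recv_queue = deque()
        self.cq = completion_queue

    def post_send(self, wr: WorkRequest) -> None:
        # Posting only enqueues a queue element and returns; no data moves yet.
        self.send_queue.append(wr)

    def post_recv(self, wr: WorkRequest) -> None:
        self.recv_queue.append(wr)

    def nic_process_sends(self) -> None:
        # Stand-in for the NIC: consume queue elements in order, one completion each.
        while self.send_queue:
            wqe = self.send_queue.popleft()
            self.cq.append(Completion(wqe.wr_id, wqe.opcode, "SUCCESS"))


def poll_cq(cq: deque, max_entries: int) -> list:
    """Non-blocking poll: return up to max_entries completions, possibly none."""
    out = []
    while cq and len(out) < max_entries:
        out.append(cq.popleft())
    return out


cq = deque()
qp = QueuePair(cq)
qp.post_send(WorkRequest(wr_id=1, opcode="SEND", payload=b"hello"))
qp.post_send(WorkRequest(wr_id=2, opcode="SEND", payload=b"world"))
before = poll_cq(cq, 8)   # empty: posting is asynchronous, nothing has completed
qp.nic_process_sends()    # the simulated NIC works through the Send Queue
after = poll_cq(cq, 8)    # two completions, in posting order
```

The `wr_id` field is what lets the application match a completion back to the request it posted, since the two events are decoupled in time.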
Two operation modes exist: two‑sided SEND/RECEIVE requiring remote participation, and one‑sided READ/WRITE allowing direct remote memory access without remote software involvement.
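The difference between the two modes can be illustrated with a toy model. Everything here is an assumption made for illustration: `RemoteNode`, the `nic_*` methods, the integer `rkey` check, and the `software_steps` counter are hypothetical and do not correspond to any real library. The counter tracks how often the remote application has to act.

```python
class RemoteNode:
    """Toy remote endpoint with one registered memory region."""

    def __init__(self, size: int, rkey: int):
        self.memory = bytearray(size)  # registered region exposed to peers
        self.rkey = rkey               # key a peer must present for one-sided access
        self.posted_recvs = []         # receive buffers posted by remote software
        self.software_steps = 0        # times the remote application had to act

    def post_recv(self, length: int) -> None:
        # Two-sided: the remote application must prepare a buffer in advance.
        self.software_steps += 1
        self.posted_recvs.append(bytearray(length))

    def nic_send(self, data: bytes) -> bytes:
        # SEND consumes a posted receive buffer; without one the receiver is not ready.
        if not self.posted_recvs:
            raise RuntimeError("receiver not ready: no receive buffer posted")
        buf = self.posted_recvs.pop(0)
        if len(data) > len(buf):
            raise ValueError("message larger than posted receive buffer")
        buf[:len(data)] = data
        return bytes(buf[:len(data)])

    def _check(self, rkey: int, offset: int, length: int) -> None:
        if rkey != self.rkey:
            raise PermissionError("invalid rkey")
        if offset < 0 or offset + length > len(self.memory):
            raise IndexError("access outside registered region")

    def nic_write(self, rkey: int, offset: int, data: bytes) -> None:
        # One-sided WRITE: handled entirely on the NIC side of this model.
        self._check(rkey, offset, len(data))
        self.memory[offset:offset + len(data)] = data

    def nic_read(self, rkey: int, offset: int, length: int) -> bytes:
        # One-sided READ: likewise needs no posted buffer and no software step.
        self._check(rkey, offset, length)
        return bytes(self.memory[offset:offset + length])


remote = RemoteNode(size=16, rkey=0x1234)

try:
    remote.nic_send(b"hi")             # no receive posted yet
    send_without_recv_failed = False
except RuntimeError:
    send_without_recv_failed = True

remote.post_recv(8)                    # remote software participates
received = remote.nic_send(b"hi")
steps_after_send = remote.software_steps

remote.nic_write(0x1234, 4, b"rdma")   # no remote software involvement
readback = remote.nic_read(0x1234, 4, 4)
steps_after_one_sided = remote.software_steps
```

In this sketch the SEND only succeeds after the remote side has posted a buffer, while WRITE and READ leave `software_steps` unchanged, which is the property that keeps the remote CPU out of the one‑sided data path.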
Typical data transfer flows for both two‑sided and one‑sided operations are described, highlighting zero‑copy and kernel bypass benefits.
In summary, RDMA reduces latency from tens of microseconds to a few microseconds, consumes little CPU, and, combined with high‑bandwidth, loss‑free networks (InfiniBand or modern Ethernet), drives the adoption of RoCE, iWARP, and InfiniBand in future high‑performance systems.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.