An Introduction to RDMA: Principles, Operation, and Integration with TCP/Ethernet

This article explains why data centers need more efficient networking, introduces Remote Direct Memory Access (RDMA), and describes its working principles, its operation types, and how it can be layered over TCP/Ethernet to reduce latency and CPU overhead in high-performance environments.

Architects' Tech Alliance

Network bandwidth has grown rapidly, and so has the demand for large-scale data movement. That growth far outpaces the processing capacity and memory bandwidth of the nodes at each end, which makes the data-center network architecture a bottleneck and calls for a more efficient communication solution.

Traditional TCP/IP processing passes each packet through several software layers and copies it repeatedly between NIC buffers, system memory, and CPU caches. This consumes significant CPU and memory resources, and the mismatch among network bandwidth, processor speed, and memory bandwidth adds substantial latency.

Figure 1.1 Typical data flow when a host receives traditional Ethernet packets

1 RDMA Overview

To fully exploit the performance advantages of 10‑Gb Ethernet, the CPU must be freed from handling Ethernet communication; the industry initially addressed this with TCP/IP Offload Engines (TOE), which improve throughput but still fall short of high‑performance requirements.

RDMA (Remote Direct Memory Access) allows a computer to directly access the memory of another computer without involving the processor, eliminating unnecessary data copies, reducing bus usage and CPU cycles, and significantly lowering latency.

Figure 1.2 RDMA data‑flow diagram

RDMA originated with InfiniBand and is supported by standard interfaces such as MPI and DAPL, the latter in kernel (kDAPL) and user-space (uDAPL) variants. Linux supports kDAPL, and other operating systems may follow. RDMA is already widely adopted in high-performance computing, and operating-system support is making it increasingly usable in business applications.

2 RDMA Working Principle

RDMA is a NIC technology that places data directly into remote memory, minimizing processing overhead and bandwidth requirements. It achieves this through hardware‑implemented reliable transport protocols, zero‑copy networking, and kernel‑bypass techniques.

Figure 2.1 Evolution of the RDMA model

Zero‑copy networking lets the NIC transfer data directly between application memory and the network, eliminating copies between application and kernel memory.
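As a rough illustration of what zero-copy saves, the following Python sketch models a receive path with and without an intermediate kernel buffer. It is a toy model rather than RDMA code: the function names and the single kernel buffer are assumptions made for illustration, and a real stack may copy more or fewer times.

```python
def _check_fits(wire_data: bytes, app_buffer: bytearray) -> None:
    """Reject payloads that would overflow the application's buffer."""
    if len(wire_data) > len(app_buffer):
        raise ValueError("payload larger than application buffer")


def receive_buffered(wire_data: bytes, app_buffer: bytearray) -> int:
    """Model a conventional receive; return the number of host-side copies."""
    _check_fits(wire_data, app_buffer)
    # The NIC first places the frame in a kernel-owned buffer (assumed).
    kernel_buffer = bytearray(wire_data)
    # The kernel then copies the payload into the application's buffer.
    app_buffer[:len(kernel_buffer)] = kernel_buffer
    return 1


def receive_zero_copy(wire_data: bytes, app_buffer: bytearray) -> int:
    """Model direct placement; return the number of host-side copies."""
    _check_fits(wire_data, app_buffer)
    # The NIC places the payload straight into application memory.
    memoryview(app_buffer)[:len(wire_data)] = wire_data
    return 0
```

Both functions leave the same bytes in `app_buffer`; the difference the model highlights is the extra kernel-to-application copy that the CPU performs on the buffered path.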

Kernel‑bypass allows user‑space applications to issue commands to the NIC without kernel involvement, reducing context switches.

When an application issues an RDMA read/write, no data copying occurs; the operation can be completed entirely in user space or with minimal kernel assistance.
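Kernel bypass is commonly realised with queues that the application and the NIC share: the application posts a request describing a buffer, the NIC processes it asynchronously, and the application later polls for a completion. The sketch below imitates that pattern in plain Python. All class, method, and field names are hypothetical, and the "NIC" is just a method call, so this only shows the shape of the interaction.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkRequest:
    wr_id: int           # application-chosen identifier
    buffer: memoryview   # reference to application memory, not a copy


@dataclass
class Completion:
    wr_id: int
    byte_len: int


class ToyQueuePair:
    """Toy send queue plus completion queue living in user space."""

    def __init__(self) -> None:
        self.send_queue = deque()
        self.completion_queue = deque()
        self.wire = []  # stands in for the network

    def post_send(self, wr_id: int, buffer: bytearray) -> None:
        """Application side: enqueue a descriptor, with no system call."""
        self.send_queue.append(WorkRequest(wr_id, memoryview(buffer)))

    def nic_process(self) -> None:
        """Simulated NIC: drain the send queue and report completions."""
        while self.send_queue:
            wr = self.send_queue.popleft()
            self.wire.append(wr.buffer.tobytes())  # the "DMA" read
            self.completion_queue.append(Completion(wr.wr_id, len(wr.buffer)))

    def poll_cq(self):
        """Application side: return the next completion, or None."""
        if self.completion_queue:
            return self.completion_queue.popleft()
        return None
```

Because the request holds a reference to the buffer rather than a copy, the application has to leave the buffer untouched until it sees the matching completion.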

RDMA operations read from or write to remote virtual memory addresses that have been registered with the remote NIC, so the remote CPU takes no part in the data transfer.

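Before a peer can access a buffer, the application must register it and advertise the correct identifier and memory-region information (in iWARP, a steering tag, an offset, and a length) so that remote peers can access the registered buffer safely.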
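The registration step can be pictured as a lookup table inside the NIC. The following sketch is a toy model under stated assumptions: `ToyRnic`, `MemoryRegion`, and their methods are hypothetical names, the identifier is modelled on the iWARP steering tag (STag), and tags are simply random 32-bit integers. A remote access succeeds only if the tag is known, the range lies inside the registered buffer, and the requested access right was granted.

```python
import secrets
from dataclasses import dataclass


@dataclass
class MemoryRegion:
    buffer: bytearray
    remote_read: bool
    remote_write: bool


class ToyRnic:
    """Toy model of an RNIC's table of registered memory regions."""

    def __init__(self) -> None:
        self._regions = {}  # maps steering tag -> MemoryRegion

    def register(self, buffer: bytearray, *, remote_read: bool = False,
                 remote_write: bool = False) -> int:
        """Register a buffer and return the tag a peer must present."""
        stag = secrets.randbits(32)
        while stag in self._regions:  # avoid reusing a live tag
            stag = secrets.randbits(32)
        self._regions[stag] = MemoryRegion(buffer, remote_read, remote_write)
        return stag

    def _lookup(self, stag: int, offset: int, length: int) -> MemoryRegion:
        region = self._regions.get(stag)
        if region is None:
            raise KeyError("unknown steering tag")
        if offset < 0 or length < 0 or offset + length > len(region.buffer):
            raise ValueError("access outside the registered region")
        return region

    def rdma_write(self, stag: int, offset: int, data: bytes) -> None:
        """Place data into a registered buffer on behalf of a remote peer."""
        region = self._lookup(stag, offset, len(data))
        if not region.remote_write:
            raise PermissionError("region not registered for remote write")
        region.buffer[offset:offset + len(data)] = data

    def rdma_read(self, stag: int, offset: int, length: int) -> bytes:
        """Return data from a registered buffer on behalf of a remote peer."""
        region = self._lookup(stag, offset, length)
        if not region.remote_read:
            raise PermissionError("region not registered for remote read")
        return bytes(region.buffer[offset:offset + length])
```

In this model the owning application never runs any code during `rdma_write` or `rdma_read`, which mirrors how the remote CPU stays out of the transfer.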

3 RDMA Operation Types

An RDMA-enabled NIC (RNIC) handles all packet generation and reception, removing the host CPU from the data-transfer path.

RDMA defines four operations: Send, Write, Read, and Terminate. Each generates exactly one RDMA message, except Read, which generates two: a Read Request and a Read Response.
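The relationship between operations and messages can be written down as a small lookup table. This sketch follows the RDMAP specification (RFC 5040), in which Read is the exception because it needs a request from the initiator and a response from the peer. The enum and function names are hypothetical, and the variants of Send are collapsed into one entry for brevity.

```python
from enum import Enum


class RdmaOperation(Enum):
    SEND = "Send"
    WRITE = "Write"
    READ = "Read"
    TERMINATE = "Terminate"


# Messages each operation puts on the wire, in order.
_MESSAGES = {
    RdmaOperation.SEND: ("Send",),
    RdmaOperation.WRITE: ("RDMA Write",),
    RdmaOperation.READ: ("RDMA Read Request", "RDMA Read Response"),
    RdmaOperation.TERMINATE: ("Terminate",),
}


def messages_for(operation: RdmaOperation) -> tuple:
    """Return the RDMA messages generated by one operation."""
    return _MESSAGES[operation]
```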

4 RDMA over TCP

Ethernet dominates data‑center interconnects; adding RDMA to Ethernet reduces CPU utilization and eases the transition to 10‑Gb speeds without sacrificing performance.

RDMA over TCP moves data directly between application memories of two systems with minimal OS impact and no intermediate copies.

Figure 4.1 RDMA over TCP (Ethernet) data‑flow diagram

RDMA over TCP runs over standard TCP/IP networks, so a single physical connection can carry several kinds of traffic, such as I/O, file-system access, block storage, and inter-processor messaging.

Figure 4.2 RDMA over TCP (Ethernet) protocol stack

The stack's top three layers (RDMA, DDP, and MPA) form the iWARP protocol suite, which ensures interoperability at high speed.

The RDMA layer converts read and write requests into RDMA messages and passes them to the Direct Data Placement (DDP) layer, which splits any message that is too large for a single segment into smaller segments for transmission.

Figure 4.3 DDP layer splitting RDMA messages

DDP adds a protocol header to each segment that describes where the payload belongs, and it supports two buffer models: tagged buffers for large data transfers and untagged buffers for small control messages.
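To make the segmentation step concrete, here is a minimal sketch of the tagged-buffer case. It assumes a simplified header with only a steering tag, a tagged offset, and a last-segment flag; the class and function names are hypothetical, and the untagged model is not covered. Because every segment carries its own destination offset, the receiver can place segments directly into the target buffer in whatever order they arrive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedSegment:
    stag: int            # identifies the registered destination buffer
    tagged_offset: int   # where this payload starts inside that buffer
    last: bool           # True on the final segment of the message
    payload: bytes


def segment_message(stag: int, base_offset: int, message: bytes,
                    max_payload: int) -> list:
    """Split one message into segments of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    segments = []
    # An empty message yields no segments in this simplified model.
    for start in range(0, len(message), max_payload):
        chunk = message[start:start + max_payload]
        is_last = start + max_payload >= len(message)
        segments.append(TaggedSegment(stag, base_offset + start, is_last, chunk))
    return segments


def place(segment: TaggedSegment, buffers: dict) -> None:
    """Write one segment straight into its destination buffer."""
    target = buffers[segment.stag]
    end = segment.tagged_offset + len(segment.payload)
    if segment.tagged_offset < 0 or end > len(target):
        raise ValueError("segment falls outside the destination buffer")
    target[segment.tagged_offset:end] = segment.payload
```

Placing the segments in reverse order produces the same buffer contents as placing them in order, which is the property that removes the need for a reassembly buffer.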

Figure 4.4 MPA layer splitting DDP messages

The MPA layer adds markers, a length field, and a CRC to form framed PDUs (FPDUs), which the TCP layer then schedules for delivery.
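The framing can be sketched in a few lines of Python. This is a simplified illustration based on the MPA specification (RFC 5044): a 16-bit length, the DDP segment, padding to a 4-byte boundary, and a CRC32c. The periodic markers that MPA can insert into the TCP stream are left out, the function names are hypothetical, and writing the CRC field in network byte order is an assumption of this sketch.

```python
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), the CRC family used by MPA."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def frame_fpdu(ulpdu: bytes) -> bytes:
    """Wrap one DDP segment as length + payload + pad + CRC."""
    if len(ulpdu) > 0xFFFF:
        raise ValueError("payload does not fit a 16-bit length field")
    header = struct.pack("!H", len(ulpdu))
    pad_len = -(len(header) + len(ulpdu)) % 4   # pad to a multiple of 4
    body = header + ulpdu + b"\x00" * pad_len
    return body + struct.pack("!I", crc32c(body))


def parse_fpdu(fpdu: bytes) -> bytes:
    """Verify the CRC and return the DDP segment carried in an FPDU."""
    if len(fpdu) < 8 or len(fpdu) % 4:
        raise ValueError("malformed FPDU length")
    body = fpdu[:-4]
    (crc,) = struct.unpack("!I", fpdu[-4:])
    if crc32c(body) != crc:
        raise ValueError("CRC mismatch")
    (length,) = struct.unpack("!H", body[:2])
    if length > len(body) - 2:
        raise ValueError("length field exceeds FPDU size")
    return body[2:2 + length]
```

The explicit length and CRC let a receiver find and validate segment boundaries inside TCP's byte stream, which has no message boundaries of its own.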

5 RDMA Standards Organizations

In October 2001, major vendors formed the RDMA Consortium to define specifications for TCP/IP‑based RDMA, DDP, and related protocols, collaborating with the IETF to integrate RDMA into the Internet protocol suite.

The first version of the TCP/IP RDMA architecture specification was released in October 2002, with ongoing contributions from industry partners and standards bodies.
