How RDMA‑Powered SMC‑R Transforms TCP Performance in Data Centers
This article explains why traditional Linux kernel TCP stacks struggle with high‑performance demands, introduces shared‑memory IPC and RDMA concepts, describes the SMC‑R hybrid protocol that transparently replaces TCP sockets, and outlines practical acceleration methods and community contributions.
Editor’s note: TCP is the most widely used network protocol, spanning mobile communications and data centers. For data‑center scenarios, elastic RDMA enables the high‑performance SMC‑R protocol to transparently replace TCP and accelerate applications.
Why is a new kernel network protocol stack needed?
The Linux kernel network stack balances performance, latency, and generality, but cannot provide a silver bullet. Real‑world workloads may require higher performance at the cost of universality, or vice‑versa. Traditional Ethernet‑based solutions offer limited gains, while high‑speed hardware (100G/400G) opens opportunities for TCP‑compatible, higher‑performance alternatives.
Shared‑memory based network communication
Before discussing cross‑host communication, consider intra‑host IPC. Of the common IPC mechanisms, shared memory is the fastest, but it lacks a unified, standardized OS‑level interface.
Typical shared‑memory IPC flow on a single machine:
Sender writes to a pre‑allocated memory region.
Sender notifies the receiver and updates the write offset.
Receiver reads data based on the new offset.
Receiver updates its read offset.
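The four steps above can be sketched as a small ring buffer with separate write and read offsets. This is a minimal single‑process illustration, not code from the article: the class name SharedRing and its methods are hypothetical, and a real implementation would place the buffer in memory shared between processes and add proper synchronization.

```python
class SharedRing:
    """Single-producer, single-consumer ring over a pre-allocated buffer.

    Hypothetical illustration only: offsets grow monotonically and are
    reduced modulo the buffer size when indexing.
    """

    def __init__(self, size):
        self.buf = bytearray(size)  # pre-allocated "shared" region
        self.size = size
        self.write_off = 0          # total bytes written (owned by sender)
        self.read_off = 0           # total bytes read (owned by receiver)

    def send(self, data):
        # Step 1: write into the free part of the region.
        free = self.size - (self.write_off - self.read_off)
        n = min(len(data), free)
        for i in range(n):
            self.buf[(self.write_off + i) % self.size] = data[i]
        # Step 2: publishing the new write offset is the "notification".
        self.write_off += n
        return n

    def recv(self, max_bytes):
        # Step 3: read whatever lies between the read and write offsets.
        n = min(max_bytes, self.write_off - self.read_off)
        out = bytes(self.buf[(self.read_off + i) % self.size] for i in range(n))
        # Step 4: advance the read offset, freeing space for the sender.
        self.read_off += n
        return out
```

Because each offset is modified by only one side, the sender and receiver never write to the same variable, which is what makes this pattern cheap compared with message‑passing IPC.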
If a technology could “move” memory between two machines, this high‑performance IPC could extend beyond a single host. Remote Direct Memory Access (RDMA) provides exactly that capability.
Compared with the single‑host shared‑memory flow, the RDMA‑based flow is:
Sender writes to a locally allocated memory region.
RDMA copies that memory to the same location in the remote host’s memory.
RDMA notifies the receiver and updates the write offset.
Receiver reads data using the updated offset.
Receiver updates its read offset via RDMA.
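The five steps above can be simulated in plain Python to show where the remote copies happen. This is a sketch under loud assumptions: no real RDMA is involved, the names Peer, fake_rdma_write, send, and recv are hypothetical, and the byte copy inside fake_rdma_write merely stands in for what an RNIC would do with a one‑sided write.

```python
class Peer:
    """One host's view: a registered memory region plus both cursors."""

    def __init__(self, size):
        self.region = bytearray(size)
        self.size = size
        self.write_off = 0  # sender cursor as seen by this host
        self.read_off = 0   # receiver cursor as seen by this host


def fake_rdma_write(src, dst, start, length):
    """Stand-in for a one-sided RDMA write (assumption, not a real verbs call).

    Copies bytes from src.region to the same offsets in dst.region.
    """
    for i in range(length):
        pos = (start + i) % src.size
        dst.region[pos] = src.region[pos]


def send(sender, receiver, data):
    free = sender.size - (sender.write_off - sender.read_off)
    n = min(len(data), free)
    start = sender.write_off
    for i in range(n):                            # step 1: local write
        sender.region[(start + i) % sender.size] = data[i]
    fake_rdma_write(sender, receiver, start, n)   # step 2: mirror to remote
    sender.write_off += n
    receiver.write_off = sender.write_off         # step 3: publish write offset
    return n


def recv(receiver, sender, max_bytes):
    n = min(max_bytes, receiver.write_off - receiver.read_off)
    out = bytes(                                  # step 4: read local copy
        receiver.region[(receiver.read_off + i) % receiver.size]
        for i in range(n)
    )
    receiver.read_off += n
    sender.read_off = receiver.read_off           # step 5: return read offset
    return out
```

For example, with a = Peer(8) and b = Peer(8), calling send(a, b, b"abcdef") followed by recv(b, a, 4) returns b"abcd" and leaves a.read_off at 4, so the sender knows four bytes of space have been released.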
SMC‑R (Shared Memory Communication over RDMA) emerges from this model, offering a TCP‑compatible behavior and socket interface while leveraging RDMA for data‑path performance.
SMC‑R is a hybrid protocol: it uses TCP for connection establishment and control, and RDMA for high‑throughput data transfer. If the RDMA link cannot be established, the connection falls back to plain TCP. When multiple RNICs are available, connections can be migrated between them at runtime after a fault, which improves reliability.
RDMA natively exposes a verbs interface, which differs from the socket API that applications already use. SMC‑R instead builds a fully TCP‑socket‑compatible kernel interface that can be applied transparently via LD_PRELOAD, eBPF rules, or other mechanisms, effectively replacing TCP sockets with SMC sockets.
Performance tests show up to 57% improvement for Redis without any application changes, demonstrating the practical impact of SMC‑R.
Using SMC‑R to accelerate applications
Three transparent replacement methods are available:
Use LD_PRELOAD to intercept socket creation so that TCP (SOCK_STREAM) sockets are created with the AF_SMC address family instead.
Apply sysctl at the network‑namespace level to replace all TCP connections within a container or namespace.
Deploy eBPF rules (e.g., based on five‑tuple or process ID) to dynamically match and replace specific connections.
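To make concrete what these methods substitute, the sketch below opens an SMC socket explicitly and falls back to TCP when the kernel does not support it. It is an illustration only: the helper open_stream_socket is hypothetical, the constants are written out by hand because Python's socket module is assumed not to export them (43 is the Linux AF_SMC family number and 0 is SMCPROTO_SMC for IPv4), and the application‑level fallback shown here is not the same as the in‑kernel fallback SMC‑R performs during connection setup.

```python
import socket

# Assumption: values taken from the Linux headers, not exported by Python.
AF_SMC = 43        # Linux address family number for SMC
SMCPROTO_SMC = 0   # SMC over IPv4 addressing


def open_stream_socket():
    """Return (sock, kind) where kind is "smc" or "tcp".

    Tries AF_SMC first; if the kernel lacks SMC support the socket()
    call raises OSError and we create an ordinary TCP socket instead.
    """
    try:
        sock = socket.socket(AF_SMC, socket.SOCK_STREAM, SMCPROTO_SMC)
        return sock, "smc"
    except OSError:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        return sock, "tcp"


if __name__ == "__main__":
    s, kind = open_stream_socket()
    print("created", kind, "socket")
    s.close()
```

The returned socket is used with the usual connect, send, and recv calls in either case, which is why the transparent methods listed above can swap the address family without touching application logic.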
SMC‑R in the OpenAnolis community
Within the OpenAnolis (Longxi) community, ongoing work improves SMC‑R's performance, range of use cases, stability, and transparent replacement. Over six months, contributors have submitted more than 60 patches upstream to the Linux kernel.
For further details, see the code repository hpn-cloud-kernel and the High‑Performance Network SIG at https://openanolis.cn/sig/high-perf-network.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
