How to Build a Low‑Cost RDMA Development Environment with Soft‑RoCE on Linux
This guide walks you through the hardware requirements, cost considerations, and step‑by‑step configuration of a simulated RDMA network using Soft‑RoCE on both Red Hat/CentOS and Ubuntu, including verification with ibv tools and performance testing.
Prerequisites and Hardware Options
Before starting RDMA network programming, you need a suitable environment. The ideal setup is a rack-mounted cluster (for example, H100 servers occupying at least one rack unit, each supporting two NICs). If space or budget is limited, the bare minimum is two RDMA NICs, so that two endpoints can talk to each other.
The major RDMA NIC vendors are Marvell (which acquired QLogic), Intel, and Mellanox (now part of NVIDIA). High-end cards such as NVIDIA's ConnectX-7 series offer 100/200/400 Gb/s bandwidth and cost around $2,000 each.
If you cannot afford physical RDMA cards, you can simulate an RDMA NIC on existing Ethernet adapters using Soft‑RoCE (also known as RXE).
What Is Soft‑RoCE?
Soft-RoCE is a software implementation of RoCE that allows RDMA over any Ethernet NIC, with no hardware acceleration required. The kernel driver, also referred to as RXE, has shipped with Red Hat Enterprise Linux since version 7.4, and the user-space driver is part of the rdma-core package.
Older Red Hat 7.x manuals mention the rxe_cfg command for configuring RXE, but from RHEL‑8.2 onward the rdma command replaces it. The same applies to newer Ubuntu releases.
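Since the tooling differs across releases, a setup script can probe for whichever command is available. A minimal sketch (the helper name rxe_add_cmd and the device name rxe0 are illustrative, not part of either tool):

```shell
# Print the command appropriate for this system to create an RXE device
# on the given Ethernet interface. `command -v` probes which tool exists.
rxe_add_cmd() {
    ifname=$1
    if command -v rdma >/dev/null 2>&1; then
        # RHEL 8.2 and later, recent Ubuntu: the rdma tool from iproute
        echo "rdma link add rxe0 type rxe netdev $ifname"
    elif command -v rxe_cfg >/dev/null 2>&1; then
        # Older RHEL 7.x releases
        echo "rxe_cfg add $ifname"
    else
        echo "error: neither rdma nor rxe_cfg found; install rdma-core" >&2
        return 1
    fi
}
```

Run `rxe_add_cmd eth0` to see which invocation applies before wiring it into a setup script.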
Configuration on Red Hat/CentOS
Red Hat 8 documentation describes Soft‑RoCE configuration (see reference [1]). First install the required libraries:
# yum install iproute libibverbs libibverbs-utils infiniband-diags

Check the current RDMA links (none are shown before RXE is configured):

# rdma link show

Load the RXE module and create a virtual device (named mlx5_0 here) on the physical NIC xgbe0:
# modprobe rdma_rxe
# rdma link add mlx5_0 type rxe netdev xgbe0

If the rdma link add command is unavailable, fall back to rxe_cfg:

# rxe_cfg add xgbe0

Verify the link again; the virtual device should now appear:
# rdma link show
0/1: mlx5_0/1: state ACTIVE physical_state LINK_UP netdev xgbe0
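For scripting, the link state can be extracted from the rdma link show output rather than eyeballed. A small sketch (the link_state helper is hypothetical; the sample line is the output captured above):

```shell
# Print the state field ("ACTIVE", "DOWN", ...) for a named device,
# reading `rdma link show` output on stdin.
link_state() {
    awk -v dev="$1" '$0 ~ dev { for (i = 1; i < NF; i++) if ($i == "state") print $(i + 1) }'
}

# Example against the output captured above:
echo "0/1: mlx5_0/1: state ACTIVE physical_state LINK_UP netdev xgbe0" \
    | link_state mlx5_0
```

In a real script you would pipe `rdma link show | link_state mlx5_0` and compare the result against ACTIVE before continuing.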
# rxe_cfg status
rdma_rxe module not loaded
Name   Link  Driver     Speed  NMTU  IPv4_addr    RDEV  RMTU
xgbe0  yes   mlx5_core         1500  192.168.1.1

Start the RDMA service:
# systemctl start rdma.service
# rxe_cfg start

List loaded RDMA-related kernel modules to confirm the drivers are present:
# lsmod | grep -E 'rdma|ib'
rdma_rxe 16384 0
svcrdma 16384 0
... (additional modules omitted for brevity) ...

Inspect the virtual device with ibv_devices and ibstat:
# ibv_devices
device node GUID
------ ----------------
mlx5_0 b8cef60300e2048
# ibstat
CA 'mlx5_0'
CA type: MT4117
Number of ports: 1
... (details omitted) ...
Port 1:
State: Active
Physical state: LinkUp
Rate: 25
...

Validate the setup with ibv_rc_pingpong as a server:
# ibv_rc_pingpong -d mlx5_0 -g 0
local address: LID 0x0000, QPN 0x00015b, PSN 0x33a37e, GID fe80::bace:f6ff:fece:2048

And as a client (replace the IP with the server's address):
# ibv_rc_pingpong -d mlx5_0 -g 0 192.168.1.1
local address: LID 0x0000, QPN 0x00015c, PSN 0x514e35, GID fe80::bace:f6ff:fece:2048
remote address: LID 0x0000, QPN 0x00015b, PSN 0x33a37e, GID fe80::bace:f6ff:fece:2048
8192000 bytes in 0.00 seconds = 15142.33 Mbit/sec
1000 iters in 0.00 seconds = 4.33 usec/iter

Successful output indicates that the client and server have established an RDMA connection and transferred data.
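Note that neither the module load nor the virtual link survives a reboot. One way to make the setup permanent on a systemd-based system (a sketch only; the file paths and the rxe.service unit name are assumptions, not something the rdma tooling creates for you):

```shell
# Load rdma_rxe at boot (assumption: systemd-modules-load reads this directory)
echo rdma_rxe > /etc/modules-load.d/rdma_rxe.conf

# Re-create the virtual link at boot with a oneshot unit (hypothetical unit name;
# device and NIC names are the ones used in the example above)
cat > /etc/systemd/system/rxe.service <<'EOF'
[Unit]
Description=Create Soft-RoCE virtual link
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/rdma link add mlx5_0 type rxe netdev xgbe0

[Install]
WantedBy=multi-user.target
EOF

systemctl enable rxe.service
```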
Configuration on Ubuntu
Install the necessary packages:
# apt-get install libibverbs1 ibverbs-utils librdmacm1 libibumad3 ibverbs-providers rdma-core

After installation, rdma link show returns no results, so a virtual NIC must be created:
# modprobe rdma_rxe
# rdma link add rxe_0 type rxe netdev enp2s0
# rdma link show
link rxe_0/1 state ACTIVE physical_state LINK_UP netdev enp2s0

List the device with ibv_devices and inspect it using ibstat:
# ibv_devices
device node GUID
------ ----------------
rxe_0 02e04ffffe290bd5
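The node GUID that ibv_devices reports for an RXE device is not random: the rdma_rxe driver derives it from the underlying NIC's MAC address using the EUI-64 rule, flipping the universal/local bit of the first byte and inserting ff:fe in the middle. The same 64 bits reappear as the interface part of the link-local GID printed by ibv_rc_pingpong below. A sketch of the derivation (mac_to_eui64 is a hypothetical helper; the MAC is reverse-derived from the GUID shown above):

```shell
# Derive an EUI-64 identifier from a 48-bit MAC address:
# flip bit 1 (universal/local) of the first byte, insert ff:fe in the middle.
mac_to_eui64() {
    oldIFS=$IFS; IFS=:
    set -- $1           # split the MAC into six positional parameters
    IFS=$oldIFS
    printf '%02x%s\n' "$(( 0x$1 ^ 0x02 ))" "$2${3}fffe$4$5$6"
}

mac_to_eui64 00:e0:4f:29:0b:d5
```

The result matches both the node GUID listed above and the fe80:: GID in the pingpong output below.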
# ibstat
CA 'rxe_0'
CA type:
Number of ports: 1
... (details omitted) ...
Port 1:
State: Active
Physical state: LinkUp
Rate: 2.5
...

Test the virtual device with ibv_rc_pingpong. In one terminal start the server:
# ibv_rc_pingpong -d rxe_0 -g 0
local address: LID 0x0000, QPN 0x000011, PSN 0x966045, GID fe80::2e0:4fff:fe29:bd5
remote address: LID 0x0000, QPN 0x000012, PSN 0xa069a9, GID fe80::2e0:4fff:fe29:bd5
8192000 bytes in 0.04 seconds = 1770.19 Mbit/sec
1000 iters in 0.04 seconds = 37.02 usec/iter

In another terminal start the client (replace the IP with the server's address):
# ibv_rc_pingpong -d rxe_0 -g 0 192.168.1.5
local address: LID 0x0000, QPN 0x000012, PSN 0xa069a9, GID fe80::2e0:4fff:fe29:bd5
remote address: LID 0x0000, QPN 0x000011, PSN 0x966045, GID fe80::2e0:4fff:fe29:bd5
8192000 bytes in 0.04 seconds = 1772.73 Mbit/sec
1000 iters in 0.04 seconds = 36.97 usec/iter

Successful data transfer confirms that the simulated RDMA environment works on inexpensive hardware.
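The summary arithmetic is worth decoding: ibv_rc_pingpong defaults to 1,000 iterations of 4,096-byte messages, counted in both directions, hence 2 x 4096 x 1000 = 8,192,000 bytes, and the throughput is 8 x bytes / seconds / 10^6 in Mbit/s (the displayed 0.04 s is rounded; an elapsed time near 0.037 s reproduces the printed figure). A throwaway helper to redo the math on your own runs (mbps is a hypothetical name):

```shell
# Convert a byte count and elapsed seconds into Mbit/s,
# mirroring ibv_rc_pingpong's summary formula.
mbps() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", 8 * bytes / secs / 1e6 }'
}

mbps 8192000 0.037   # close to the 1770.19 Mbit/sec printed above
```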
Next Steps
Now that you have a functional RDMA development environment, future articles will cover common RDMA command-line tools, performance-testing utilities, and programming with the libibverbs API.
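As a preview of the performance-testing side, the perftest package provides dedicated bandwidth and latency benchmarks over the same verbs devices. A typical bandwidth run on the Ubuntu setup above (assumption: perftest is installed on both machines; the device name and IP come from the earlier example):

```shell
# On the server:
ib_send_bw -d rxe_0

# On the client, pointing at the server's IP:
ib_send_bw -d rxe_0 192.168.1.5
```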
References
[1] Configuring Soft‑RoCE: https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/configuring_infiniband_and_rdma_networks/configuring-roce_configuring-infiniband-and-rdma-networks#configuring-soft-roce_configuring-roce
BirdNest Tech Talk
Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
