Understanding NVMe over Fabrics: Protocols, RDMA, and Fabric Options
This article explains the NVMe over Fabrics architecture and compares the available fabric transports: FC, InfiniBand, RoCE v2, iWARP, and TCP. It details how the RDMA properties of zero‑copy, kernel bypass, and CPU offload give NVMe‑oF its performance advantages, and it also covers the protocol differences from NVMe over PCIe, FC‑NVMe, and the emergence of NVMe/TCP.
NVMe over Fabrics (NVMe‑oF) extends the NVMe protocol beyond the PCIe bus by mapping NVMe commands onto network fabrics, aiming to replace SCSI in storage area networks with a high‑performance, low‑latency alternative.
The main fabric options supported by NVMe‑oF are Fibre Channel (FC), InfiniBand, RoCE v2 (RDMA over Converged Ethernet), iWARP, and TCP. InfiniBand, RoCE v2, and iWARP are RDMA transports, which gives them inherent performance benefits; FC and TCP carry NVMe‑oF without RDMA.
InfiniBand is a purpose‑built, high‑performance interconnect that requires its own adapters and switches; RoCE v2 carries the InfiniBand transport protocol over Ethernet by encapsulating its packets in UDP/IP, so it runs on Ethernet networks with RDMA‑capable NICs; iWARP implements RDMA over TCP, allowing RDMA‑style transfers on conventional Ethernet at the cost of reduced performance.
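The layering described above can be sketched as a small lookup table. This is a simplified illustration, not a packet parser: the header names are abbreviated, optional headers are left out, and the helper function is hypothetical. UDP destination port 4791 is the port registered for RoCE v2.

```python
# Simplified header stacks for the three RDMA transports, outermost first.
# Optional headers (for example the InfiniBand GRH) are omitted for brevity.
ROCE_V2_UDP_PORT = 4791  # UDP destination port registered for RoCE v2

HEADER_STACKS = {
    "InfiniBand": ["IB LRH", "IB BTH", "payload", "ICRC", "VCRC"],
    "RoCE v2": ["Ethernet", "IP", "UDP", "IB BTH", "payload", "ICRC"],
    "iWARP": ["Ethernet", "IP", "TCP", "MPA", "DDP", "RDMAP", "payload"],
}


def l4_protocol(transport):
    """Return the IP transport protocol a fabric rides on, or None."""
    stack = HEADER_STACKS[transport]
    for name in ("UDP", "TCP"):
        if name in stack:
            return name
    return None


for name, stack in HEADER_STACKS.items():
    print(f"{name:10s} {' / '.join(stack)}")
```

Both RoCE v2 and iWARP start with an Ethernet header, which is why they can share an Ethernet network, while native InfiniBand has its own link layer.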
RDMA’s key advantages are zero‑copy (data moves directly between application buffers and the NIC, with no intermediate copies through kernel buffers), kernel bypass (applications interact directly with the NIC, without system calls on the data path), and CPU offload (the NIC handles data movement without CPU involvement), all of which reduce latency and CPU load.
NVMe and NVMe‑oF differ primarily in their transport mechanisms: NVMe uses PCIe’s shared memory model, while NVMe‑oF uses a message‑based model to send requests and responses across the network.
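A minimal sketch of the message-based model follows. It assumes the standard NVMe entry sizes (a 64-byte submission queue entry and a 16-byte completion queue entry) and the usual position of the command identifier in each. The class and function names are hypothetical, and the toy target does nothing except echo the command identifier back.

```python
import struct
from dataclasses import dataclass

SQE_SIZE = 64  # bytes in an NVMe submission queue entry
CQE_SIZE = 16  # bytes in an NVMe completion queue entry


@dataclass
class CommandCapsule:
    """Request message: one SQE plus optional in-capsule data."""
    sqe: bytes
    data: bytes = b""

    def __post_init__(self):
        if len(self.sqe) != SQE_SIZE:
            raise ValueError("SQE must be exactly 64 bytes")

    def to_bytes(self):
        return self.sqe + self.data


@dataclass
class ResponseCapsule:
    """Response message: one CQE."""
    cqe: bytes


def make_sqe(opcode, cid):
    # Command Dword 0: opcode in byte 0, command identifier in bytes 2-3.
    return struct.pack("<BBH", opcode, 0, cid) + bytes(SQE_SIZE - 4)


def toy_target(capsule):
    # Hypothetical target: copy the command identifier into the CQE
    # (bytes 12-13) and leave the status field at zero (success).
    cid = struct.unpack_from("<H", capsule.sqe, 2)[0]
    cqe = bytearray(CQE_SIZE)
    struct.pack_into("<H", cqe, 12, cid)
    return ResponseCapsule(bytes(cqe))


request = CommandCapsule(make_sqe(opcode=0x01, cid=7), data=b"abc")
response = toy_target(request)
```

Over PCIe the host would place the same 64-byte entry in a queue in shared memory and ring a doorbell; here the entry travels inside a self-contained message instead.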
NVMe‑oF introduces several protocol extensions compared with NVMe over PCIe, including new naming (e.g., SUBNQN), capsule‑based messaging, discovery and connection mechanisms, queue creation through the Fabrics Connect command instead of the PCIe queue‑creation admin commands, the removal of PCIe‑specific interrupt handling, and a shift from PRP entries to Scatter‑Gather Lists (SGLs) as the only data descriptors.
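As an illustration of the naming extension, the sketch below checks whether a string has the general shape of an NVMe Qualified Name (NQN), the format used for SUBNQN values. It is a rough shape check under stated assumptions, not a full validator: the regular expression is a simplification, the function name is hypothetical, and the 223-byte limit and the well-known discovery NQN are taken from the NVMe specification as the editor understands it.

```python
import re

MAX_NQN_BYTES = 223  # assumed maximum NQN length in bytes
DISCOVERY_NQN = "nqn.2014-08.org.nvmexpress.discovery"

# "nqn." + year-month + "." + reverse domain name + optional ":suffix"
_NQN_RE = re.compile(r"nqn\.\d{4}-(0[1-9]|1[0-2])\.[A-Za-z0-9.-]+(:.+)?")


def looks_like_nqn(name):
    """Rough shape check for an NQN string (simplified)."""
    if len(name.encode("utf-8")) > MAX_NQN_BYTES:
        return False
    return _NQN_RE.fullmatch(name) is not None


examples = [
    DISCOVERY_NQN,
    "nqn.2014-08.org.nvmexpress:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
    "iqn.2014-08.com.example:disk1",  # an iSCSI-style name, not an NQN
]
for name in examples:
    print(looks_like_nqn(name), name)
```

A host passes names of this form when it discovers subsystems and connects to them, which takes the place of addressing a controller by its PCIe bus location.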
Fibre Channel‑NVMe (FC‑NVMe) maps the NVMe command set onto the Fibre Channel protocol, leveraging FC’s built‑in reliability, credit‑based flow control, and storage‑oriented features; major vendors such as Broadcom and Cavium provide FC‑NVMe‑compatible HBAs, and newer FC switches already support NVMe‑oF.
RDMA was first introduced in InfiniBand for high‑performance computing; its low‑latency, high‑bandwidth characteristics make it attractive for NVMe‑oF deployments that require extreme performance, though it demands specialized hardware.
NVMe/TCP emerged to address scenarios where RDMA hardware is unavailable or where compatibility with existing Ethernet infrastructure is needed; it carries NVMe‑oF capsules over ordinary TCP connections, so it runs on standard NICs and switches, and it can use techniques such as TCP offloading to narrow the performance gap with RDMA while simplifying deployment.
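To make the TCP mapping more concrete, the sketch below frames a command as an NVMe/TCP protocol data unit (PDU). It assumes the 8-byte common header layout from the NVMe/TCP transport specification as the editor understands it (PDU type, flags, header length, data offset, total length, little-endian), no header or data digests, and no padding before in-capsule data. The function names are hypothetical and the field values should be checked against the specification before any real use.

```python
import struct

NVME_TCP_PORT = 4420         # port commonly used for NVMe-oF I/O controllers
PDU_TYPE_CAPSULE_CMD = 0x04  # assumed PDU type value for a command capsule

# Common header: PDU type, flags, HLEN, PDO (1 byte each), PLEN (4 bytes).
COMMON_HDR = struct.Struct("<BBBBI")


def pack_common_header(pdu_type, flags, hlen, pdo, plen):
    return COMMON_HDR.pack(pdu_type, flags, hlen, pdo, plen)


def capsule_cmd_pdu(sqe, data=b""):
    """Frame a 64-byte SQE and optional in-capsule data as one PDU."""
    if len(sqe) != 64:
        raise ValueError("SQE must be exactly 64 bytes")
    hlen = COMMON_HDR.size + len(sqe)  # 8 + 64 = 72 bytes of header
    pdo = hlen if data else 0          # data starts right after the header
    plen = hlen + len(data)            # total PDU length
    header = pack_common_header(PDU_TYPE_CAPSULE_CMD, 0, hlen, pdo, plen)
    return header + sqe + data


pdu = capsule_cmd_pdu(bytes(64), b"abcd")
print(len(pdu), COMMON_HDR.unpack(pdu[:COMMON_HDR.size]))
```

Because TCP is a byte stream with no message boundaries, the length fields in the header are what let the receiver find where each PDU ends.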
The article concludes by promoting an ebook titled “NVMe Fundamentals and Concepts” for readers who want deeper coverage of the topics discussed.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
