Tagged articles
4 articles
SuanNi
May 8, 2026 · Artificial Intelligence

How OpenAI’s MRC Protocol Redesigns Communication for 100,000‑GPU Clusters

OpenAI, together with AMD, Broadcom, Intel, Microsoft and Nvidia, introduced the Multipath Reliable Connection (MRC) protocol. MRC splits a single 800 Gb/s link into eight 100 Gb/s planes, enabling full-mesh connectivity for more than 100,000 GPUs with fewer switches, lower cost and higher resilience. Its dynamic load balancing is designed to route around congestion and hardware failures during large-scale AI training.

AI networking · GPU clusters · MRC
12 min read
ITPUB
Sep 28, 2016 · Backend Development

Why Enabling Multipath Routing Shrinks the FIB Table: Uncovering a Hidden Linux Kernel Bug

A long‑standing Linux kernel bug causes the FIB routing hash table to shrink from 256 to 2 entries when multipath routing is enabled, leading to performance degradation; the article explains the faulty macros, traces the communication with the original authors, and advises applying the upstream fix or patching locally.

FIB · Linux kernel · multipath
3 min read