Tagged articles
2 articles
Page 1 of 1
BirdNest Tech Talk
BirdNest Tech Talk
Nov 20, 2024 · Industry Insights

Inside xAI’s 100k‑GPU Colossus: Supermicro Liquid‑Cooled Racks Explained

The article provides a detailed, step‑by‑step tour of xAI’s Colossus supercomputer— a $‑billion AI cluster built in 122 days with 100,000 NVIDIA H100 GPUs—covering Supermicro liquid‑cooled 4U racks, cooling distribution units, power and water infrastructure, storage nodes, CPU servers, 400 GbE networking, and the operational challenges of scaling such a massive system.

AI supercomputingColossusData center architecture
0 likes · 16 min read
Inside xAI’s 100k‑GPU Colossus: Supermicro Liquid‑Cooled Racks Explained
Architects' Tech Alliance
Architects' Tech Alliance
May 3, 2024 · Fundamentals

From OSI Model to RDMA: High‑Performance Networking, Leaf‑Spine Architecture, and Switch Selection

This article examines the evolution of network protocols from the OSI seven‑layer model and TCP/IP to RDMA technologies such as InfiniBand and RoCE, compares traditional three‑tier and leaf‑spine data‑center designs, and evaluates Ethernet, InfiniBand, and RoCE switches for high‑throughput, low‑latency HPC environments.

Data center architectureInfiniBandLeaf-Spine
0 likes · 13 min read
From OSI Model to RDMA: High‑Performance Networking, Leaf‑Spine Architecture, and Switch Selection