Inside the GPU Server: Architecture of A100/A800 and H100/H800 Nodes
This article provides a detailed technical breakdown of modern multi‑GPU server nodes, covering component composition, storage network cards, NVSwitch interconnects, bandwidth calculations, and the architectural differences between NVIDIA A100/A800 and H100/H800 configurations for AI training workloads.
GPU Server Node Overview
Large‑scale model training typically uses clusters where each server hosts multiple GPUs. This summary describes the hardware composition and interconnect topology of common 8‑GPU nodes based on NVIDIA A100/A800 and H100/H800 GPUs.
8‑GPU A100/A800 Node Architecture
Two CPU sockets (NUMA) with attached memory : General‑purpose compute.
Two storage network adapter cards : Access to distributed storage.
Four PCIe Gen4 switch chips : Provide high‑speed PCIe routing.
Six NVSwitch chips : Enable full‑mesh GPU‑to‑GPU communication.
Eight GPUs (A100 or A800) : Parallel AI processing units.
Eight GPU‑dedicated NICs : One per GPU, carrying GPU‑to‑GPU traffic between nodes.
Typical topology diagram:
Storage Network Card Role
Efficient read/write to distributed storage, essential for feeding training data and checkpointing.
Supports node management functions such as remote SSH access, performance monitoring, and data collection.
NVIDIA recommends the BlueField‑3 (BF3) DPU, but any adapter with sufficient bandwidth works: RoCE is the cost‑effective option and InfiniBand the high‑performance one.
NVSwitch Network Structure
The NVSwitch chips give the node full‑mesh connectivity: every GPU can reach every other GPU at full NVLink speed through the switches. An 8‑GPU A100 node uses six NVSwitch chips.
Bandwidth (NVLink 3, 50 GB/s bidirectional per link):
A100: 12 NVLink links per GPU → 12 × 50 GB/s = 600 GB/s bidirectional (300 GB/s unidirectional).
A800: 8 NVLink links per GPU → 8 × 50 GB/s = 400 GB/s bidirectional (200 GB/s unidirectional).
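The arithmetic above can be reproduced with a short sketch. The function and constant names are illustrative, and the per-link figure is the 50 GB/s bidirectional NVLink 3 rate quoted in this section.

```python
# NVLink 3 bandwidth per link in GB/s, counting both directions (figure from the text).
NVLINK3_GB_S_PER_LINK_BIDIR = 50

# NVLink links per GPU, as listed in this section.
LINKS_PER_GPU = {"A100": 12, "A800": 8}


def nvlink_bandwidth(links, per_link_bidir=NVLINK3_GB_S_PER_LINK_BIDIR):
    """Return (bidirectional, unidirectional) aggregate bandwidth in GB/s."""
    bidir = links * per_link_bidir
    return bidir, bidir / 2


for gpu, links in LINKS_PER_GPU.items():
    bidir, unidir = nvlink_bandwidth(links)
    print(f"{gpu}: {bidir:.0f} GB/s bidirectional, {unidir:.0f} GB/s unidirectional")
```

The only difference between the two parts is the link count: the A800 has four fewer links, which removes 200 GB/s of aggregate bidirectional bandwidth.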
Connection Types in the Topology
GPU‑to‑GPU (NV8) : Each GPU pair communicates over a bonded set of eight NVLinks. This is the A800 case; an A100 node reports NV12.
NIC connections :
NODE : Within the same CPU socket, no NUMA crossing.
SYS : Across CPU sockets, crossing NUMA.
GPU‑to‑NIC :
PXB : Same CPU socket and same PCIe switch; traffic does not pass through the PCIe host bridge.
NODE : Same CPU socket but different PCIe switch.
SYS : Different CPU sockets, crossing NUMA and PCIe switches.
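A minimal sketch of how a GPU‑to‑NIC path could be classified from device placement. It uses the labels from the `nvidia-smi topo -m` legend (PXB, NODE, SYS); the `PcieLocation` class, the function name, and the socket and switch numbering are all hypothetical, chosen only to match the two‑socket, four‑switch node described earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcieLocation:
    """Placement of a device in the host (hypothetical numbering)."""
    socket: int       # CPU socket / NUMA node the device hangs off
    pcie_switch: int  # PCIe switch chip the device is attached to


def gpu_nic_link_type(gpu: PcieLocation, nic: PcieLocation) -> str:
    """Classify a GPU-to-NIC path using nvidia-smi topology labels."""
    if gpu.socket != nic.socket:
        return "SYS"   # crosses the inter-socket (NUMA) interconnect
    if gpu.pcie_switch != nic.pcie_switch:
        return "NODE"  # same socket, but a different PCIe switch
    return "PXB"       # same socket and same PCIe switch


# Assumed layout: switches 0-1 on socket 0, switches 2-3 on socket 1.
gpu0 = PcieLocation(socket=0, pcie_switch=0)
print(gpu_nic_link_type(gpu0, PcieLocation(socket=0, pcie_switch=0)))
print(gpu_nic_link_type(gpu0, PcieLocation(socket=0, pcie_switch=1)))
print(gpu_nic_link_type(gpu0, PcieLocation(socket=1, pcie_switch=2)))
```

On a real host the authoritative answer comes from the matrix that `nvidia-smi topo -m` prints. The sketch only mirrors the decision rule: the closer the NIC sits to the GPU, the fewer shared components the traffic has to cross.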
GPU Node Interconnect Architecture
Compute and Storage Networks
The compute network connects GPU nodes for parallel computation, data exchange, and coordinated execution. The storage network links GPU nodes to distributed storage systems for massive data ingest and result output.
RDMA Importance
Remote Direct Memory Access (RDMA) is critical for high‑performance AI workloads. Choosing between RoCEv2 (cost‑effective) and InfiniBand (peak performance) depends on budget and performance requirements.
Bandwidth Bottlenecks
Intra‑host GPU‑GPU via NVLink: 600 GB/s bidirectional (300 GB/s unidirectional) on A100; 400 GB/s bidirectional on A800.
GPU‑to‑NIC within the same host (PCIe Gen4 x16 through the PCIe switch): 64 GB/s bidirectional (32 GB/s unidirectional).
Inter‑host GPU‑GPU via NIC: typical NIC provides 100 Gbps (12.5 GB/s) unidirectional, far lower than intra‑host bandwidth.
A 400 Gbps NIC (50 GB/s) exceeds the 32 GB/s that a PCIe Gen4 x16 link carries in one direction, so it yields little benefit unless the rest of the system supports PCIe Gen5 speeds.
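The bottleneck can be checked with a few lines of arithmetic. This is a rough sketch that ignores protocol overhead; the function names are illustrative, and the 32 GB/s figure is the unidirectional PCIe Gen4 number from the list above.

```python
# Unidirectional GPU-to-NIC bandwidth over PCIe Gen4, in GB/s (figure from the text).
PCIE_GEN4_GB_S = 32.0


def gbps_to_gb_s(gbps):
    """Convert a line rate in gigabits per second to gigabytes per second."""
    return gbps / 8


def effective_nic_gb_s(nic_gbps, pcie_gb_s=PCIE_GEN4_GB_S):
    """One-way inter-host bandwidth: the slower of the NIC and the PCIe path."""
    return min(gbps_to_gb_s(nic_gbps), pcie_gb_s)


for nic_gbps in (100, 200, 400):
    print(f"{nic_gbps} Gbps NIC -> {effective_nic_gb_s(nic_gbps):.1f} GB/s usable")
```

A 100 Gbps or 200 Gbps NIC is the limiting factor by itself (12.5 and 25 GB/s), while a 400 Gbps NIC is capped at 32 GB/s by the PCIe Gen4 path. Passing a larger `pcie_gb_s` value models a host with a faster PCIe generation.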
8‑GPU H100/H800 Node Architecture
H100 Node Hardware Topology
Each H100 host contains four NVSwitch chips (two fewer than the A100 configuration).
H100 chips are fabricated on a 4 nm process and feature 18 fourth‑generation NVLink links per chip, delivering 900 GB/s bidirectional bandwidth.
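Dividing the aggregate figure by the link count shows where the H100 gain comes from. This is a small sketch with illustrative names; the inputs are the totals quoted in this article for H100 (900 GB/s over 18 links) and A100 (600 GB/s over 12 links).

```python
def per_link_bandwidth(total_bidir_gb_s, links):
    """Return (bidirectional, unidirectional) bandwidth per NVLink link in GB/s."""
    bidir = total_bidir_gb_s / links
    return bidir, bidir / 2


h100 = per_link_bandwidth(900, 18)
a100 = per_link_bandwidth(600, 12)
print(f"H100 per link: {h100[0]:.0f} GB/s bidirectional, {h100[1]:.0f} GB/s per direction")
print(f"A100 per link: {a100[0]:.0f} GB/s bidirectional, {a100[1]:.0f} GB/s per direction")
print(f"Aggregate H100/A100 ratio: {900 / 600:.1f}x")
```

Both generations work out to 50 GB/s bidirectional per link, so the 1.5x aggregate increase follows from the higher link count (18 versus 12) rather than from faster individual links.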
H100 GPU Chip Details
Manufactured with 4 nm technology.
Bottom row hosts 18 fourth‑generation NVLink links: 18 × 50 GB/s = 900 GB/s bidirectional (25 GB/s per direction per link, 450 GB/s unidirectional).
Central region (blue in the die diagram) is the L2 cache for fast temporary storage.
Side regions integrate HBM (high‑bandwidth memory) stacks that serve as GPU memory.
Source: https://community.fs.com/cn/article/unveiling-the-foundations-of-gpu-computing1.html
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
