Inside AI Servers: PCIe, NVLink, and NVSwitch Driving the Next‑Gen Compute

Based on TrendForce data, AI server shipments are projected to grow at a 12.2% CAGR through 2027, while advances in PCIe switching, retiming chips, and high‑speed GPU interconnects such as NVLink and NVSwitch are reshaping the architecture and performance of next‑generation AI compute platforms.

Architects' Tech Alliance

Market Overview

TrendForce reports that AI server shipments reached roughly 130,000 units, accounting for about 1% of total global server shipments. Demand has surged as major companies such as Microsoft, Meta, Baidu, and ByteDance launch generative‑AI products. TrendForce forecasts a compound annual growth rate (CAGR) of 12.2% from 2023 to 2027, driven by sustained interest in applications such as ChatGPT.
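To get a feel for what a 12.2% CAGR implies, the compounding can be sketched in a few lines of Python. The function name and the choice of four annual compounding periods (2023 to 2027) are assumptions made for illustration, not figures published by TrendForce.

```python
def growth_multiple(cagr: float, periods: int) -> float:
    """Cumulative growth factor after compounding `cagr` for `periods` years."""
    return (1.0 + cagr) ** periods


# Assumption: 2023 -> 2027 is treated as four annual compounding periods.
multiple = growth_multiple(0.122, 4)
print(f"Shipments in 2027 vs. 2023: about {multiple:.2f}x")
```

Under that assumption, shipments in 2027 would be a little under 1.6 times the 2023 level.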

DGX H100: A Benchmark for AI Servers

The NVIDIA DGX H100, released in 2022, is the latest iteration of the DGX family and the building block of the DGX SuperPOD. It integrates eight H100 GPUs with a combined 6.4 × 10¹¹ (640 billion) transistors and delivers six times the AI performance of its predecessor, especially when using the new FP8 precision. On the network side, the system employs an IP network card that also functions as a PCIe expansion switch and complies with PCIe 5.0. Each server includes two CX7 (ConnectX‑7) cards (four CX7 chips total) and two 800 Gbps OSFP optical modules. Inside the chassis, four built‑in NVSwitch chips connect the GPUs; each GPU has 18 NVLink links, for a total bidirectional bandwidth of 900 GB/s per GPU.
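The headline numbers above can be sanity-checked with simple arithmetic. The sketch below assumes NVIDIA's published figure of 80 billion transistors per H100 (the article only gives the system total) and converts the optical module rate from gigabits to gigabytes per second; all variable names are illustrative.

```python
# Assumption: 80 billion transistors per H100, NVIDIA's published per-GPU
# figure. The article itself only states the system-wide total.
H100_TRANSISTORS = 80_000_000_000
GPUS_PER_DGX_H100 = 8  # from the article

total_transistors = GPUS_PER_DGX_H100 * H100_TRANSISTORS


def gbps_to_gbytes_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbps / 8.0


# One 800 Gbps OSFP optical module, expressed in GB/s.
osfp_module_gbytes = gbps_to_gbytes_per_s(800)

print(f"Total transistors: {total_transistors:.1e}")
print(f"One OSFP module: {osfp_module_gbytes:.0f} GB/s")
```

The product matches the 6.4 × 10¹¹ figure quoted for the system, and the conversion puts a single 800 Gbps module at 100 GB/s, which gives a rough sense of scale next to the 900 GB/s NVLink figure.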

PCIe Switching and Retiming Evolution

PCIe switches (or hubs) aggregate multiple PCIe devices onto a single upstream port, working around the limited number of PCIe lanes a host provides. Since Intel proposed the technology in 2001, the standard has progressed to PCIe 6.0 (64 GT/s per lane) in 2022. In AI servers, maintaining signal integrity between CPU and GPU at these speeds often requires retiming chips; some designs, such as those from Astera Labs, incorporate up to four retimers.
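To relate transfer rates to usable bandwidth, the following sketch estimates per-direction link bandwidth for each PCIe generation. Only the 16, 32, and 64 GT/s figures appear in this article; the earlier rates and the line-encoding efficiencies come from the published PCIe specifications. Treating PCIe 6.0 as 100% efficient is a simplifying assumption that ignores FLIT and FEC overhead, and the function and table names are hypothetical.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per generation.
PCIE_GENERATIONS = {
    "1.0": (2.5, 8 / 10),      # 8b/10b encoding
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
    "4.0": (16.0, 128 / 130),
    "5.0": (32.0, 128 / 130),
    "6.0": (64.0, 1.0),        # assumption: FLIT/FEC overhead ignored
}


def link_gbytes_per_s(generation: str, lanes: int = 16) -> float:
    """Approximate one-direction bandwidth of a PCIe link in GB/s."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    return gt_per_s * efficiency / 8.0 * lanes


for gen in PCIE_GENERATIONS:
    print(f"PCIe {gen} x16: ~{link_gbytes_per_s(gen):.1f} GB/s per direction")
```

Each generation doubles the per-lane rate of the one before it, except for the step from 5 GT/s to 8 GT/s, where the move to 128b/130b encoding made up most of the difference.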

The retiming market is dominated by Parade Technologies, Astera Labs, and Montage Technology (澜起科技). Montage is the only mainland Chinese supplier capable of mass‑producing PCIe 4.0 retimers and is advancing its PCIe 5.0 retimer development. Other players include Renesas (PCIe 3.0 retimers 89HT0816AP and 89HT0832P), Texas Instruments (DS160PT801, a 16 Gbps, 8‑lane PCIe 4.0 retimer), and Microchip (XpressConnect series, targeting PCIe 5.0 at 32 GT/s).

GPU Interconnects: NVLink and NVSwitch

NVIDIA's NVLink, AMD's Infinity Fabric, and the Intel‑initiated CXL are the primary high‑speed interconnect technologies in this space. NVLink, introduced in 2016, has evolved through four generations. The first generation offered 40 GB/s per link (160 GB/s total per GPU). The second generation (Volta, 2017) increased each link to 50 GB/s and supported six links per GPU (300 GB/s total). The third generation (Ampere, 2020) kept 50 GB/s per link but doubled the link count to twelve, doubling total bandwidth to 600 GB/s. The fourth generation (Hopper, 2022) uses PAM4 modulation, maintaining 50 GB/s per link while expanding to 18 links per GPU for a total of 900 GB/s.
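The per-GPU totals follow directly from links per GPU multiplied by bandwidth per link. The sketch below tabulates the four generations; the link counts for the first and third generations (4 and 12) are inferred by dividing the quoted totals by the per-link figures, and the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NVLinkGen:
    generation: int
    year: int
    links_per_gpu: int
    gbytes_per_link: int  # bidirectional GB/s per link

    @property
    def total_gbytes(self) -> int:
        """Aggregate bidirectional bandwidth per GPU in GB/s."""
        return self.links_per_gpu * self.gbytes_per_link


NVLINK_GENERATIONS = [
    NVLinkGen(1, 2016, 4, 40),   # link count inferred: 160 / 40
    NVLinkGen(2, 2017, 6, 50),
    NVLinkGen(3, 2020, 12, 50),  # link count inferred: 600 / 50
    NVLinkGen(4, 2022, 18, 50),
]

totals = {g.generation: g.total_gbytes for g in NVLINK_GENERATIONS}

for g in NVLINK_GENERATIONS:
    print(f"NVLink {g.generation} ({g.year}): "
          f"{g.links_per_gpu} x {g.gbytes_per_link} GB/s = {g.total_gbytes} GB/s")
```

Laid out this way, it is clear that since the second generation the gains have come from adding links per GPU rather than from faster individual links.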

NVSwitch, first released in 2018 on TSMC 12 nm, provides 18 NVLink 2.0 interfaces per chip. Deploying twelve NVSwitch chips enables a server to interconnect sixteen V100 GPUs efficiently. The latest NVSwitch (third generation, TSMC 4 nm) features 64 NVLink 4.0 ports per chip, delivering 900 GB/s GPU‑to‑GPU bandwidth and enabling GPUs to operate as a unified deep‑learning accelerator.
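A quick port-budget check shows why these switch counts are sufficient. The sketch below uses only figures quoted in this article (GPU counts, links per GPU, switch counts, ports per switch); the helper name is hypothetical, and how the spare ports are actually wired is not described in the source.

```python
def port_budget(gpus: int, links_per_gpu: int,
                switches: int, ports_per_switch: int) -> tuple[int, int, int]:
    """Return (GPU-side links, total switch ports, spare switch ports)."""
    gpu_links = gpus * links_per_gpu
    switch_ports = switches * ports_per_switch
    return gpu_links, switch_ports, switch_ports - gpu_links


# Sixteen V100 GPUs (six NVLink 2.0 links each) on twelve first-generation
# NVSwitch chips with 18 ports each.
v100_system = port_budget(gpus=16, links_per_gpu=6, switches=12, ports_per_switch=18)

# Eight H100 GPUs (18 NVLink 4.0 links each) on four third-generation
# NVSwitch chips with 64 ports each.
dgx_h100 = port_budget(gpus=8, links_per_gpu=18, switches=4, ports_per_switch=64)

for name, (links, ports, spare) in [("16x V100", v100_system), ("DGX H100", dgx_h100)]:
    print(f"{name}: {links} GPU links, {ports} switch ports, {spare} spare")
```

In both configurations the switches expose more ports than the GPUs consume, leaving headroom that could serve switch-to-switch or external links.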

Conclusion

The rapid development of PCIe switching, retiming, NVLink, and NVSwitch technologies significantly enhances CPU‑GPU and GPU‑GPU communication, shaping a dynamic landscape for AI servers and high‑performance computing.
