Tagged articles

52 articles

Architects' Tech Alliance

Nov 3, 2025 · Artificial Intelligence

What Nvidia’s New Blackwell & Rubin GPUs Reveal About the Future of AI Compute

Nvidia’s latest GTC briefing details the Blackwell and Rubin GPU roadmaps, highlighting massive GPU shipments, new NVLink 6.0 interconnects, 448 Gbps SerDes, and architectural innovations aimed at boosting AI compute performance, efficiency, and scalability across data‑center workloads.

AI compute · Blackwell · GPU architecture

0 likes · 6 min read

BirdNest Tech Talk

Oct 12, 2025 · Artificial Intelligence

What Happens When a Token Travels Through GPU Villages via RDMA and NVLink?

The article uses a whimsical journey to illustrate how token data is dispatched across GPU clusters—detailing functions like get_dispatch_layout, notify_dispatch, and combine_token, showing RDMA and NVLink pathways, performance experiments, and the final verification of token integrity.

AI · Distributed Systems · GPU

0 likes · 5 min read

Architects' Tech Alliance

Oct 11, 2025 · Artificial Intelligence

What Is a SuperNode? Inside AI‑Optimized High‑Performance Compute Pods

The article explains the concept of SuperNode (SuperPod) as a new AI‑focused compute infrastructure, outlines its high‑density integration, ultra‑fast interconnects, and unified resource management, and compares three leading implementations from NVIDIA, Huawei, and the ETH‑X project.

AI Infrastructure · AI supernode · DGX SuperPOD

0 likes · 11 min read

Architects' Tech Alliance

Oct 11, 2025 · Artificial Intelligence

Why NVLink Beats PCIe for AI: Deep Dive into GPU Interconnect Technologies

This article examines the architectural differences between Scale‑Out and Scale‑Up networking, compares PCIe, NVLink, UALink, InfiniBand and RoCE, and explains why high‑bandwidth, low‑latency GPU interconnects like NVLink are essential for modern AI and HPC workloads.

AI acceleration · GPU interconnect · High‑performance computing

0 likes · 27 min read

Architects' Tech Alliance

Sep 29, 2025 · Artificial Intelligence

How NVLink and NVSwitch Power AI’s Next‑Gen High‑Performance Networks

This article, part of the 2025 AI Network Technology Whitepaper, classifies AI high‑performance networking into Scale‑Up, Scale‑Out, and frontier breakthroughs, then dives deep into NVLink’s evolution, technical features, NVSwitch’s full‑mesh architecture, and the newly opened NVLink Fusion ecosystem.

AI networking · GPU interconnect · High‑performance computing

0 likes · 8 min read

Architects' Tech Alliance

Sep 15, 2025 · Artificial Intelligence

Why NVLink Beats PCIe for AI Training: A Deep Dive into GPU Interconnects

This article examines the differences between Scale‑Out and Scale‑Up networking in AI compute clusters, comparing PCIe, Ethernet, InfiniBand, NVLink, UALink, and emerging standards like UB‑Mesh, and explains how each technology impacts bandwidth, latency, scalability, and cost for large‑scale model training.

AI training · GPU interconnect · NVLink

0 likes · 28 min read

Architects' Tech Alliance

Sep 14, 2025 · Artificial Intelligence

Why Nvidia’s Blackwell GPUs Are Redefining AI Performance

The article analyzes Nvidia's 2024 Blackwell GPU series and GB200 NVL72 architecture, detailing their advanced 3‑4nm manufacturing, redesigned CUDA cores, next‑gen ray‑tracing and DLSS upgrades, massive compute and memory bandwidth gains, NVLink Gen5 improvements, and the diverse GB200 product configurations for high‑performance AI workloads.

AI acceleration · Blackwell GPU · GPU architecture

0 likes · 7 min read

Architects' Tech Alliance

Aug 10, 2025 · Artificial Intelligence

From Volta to Blackwell: How NVIDIA GPUs Evolved for Deep Learning

This article traces the evolution of NVIDIA's GPU architectures—from Volta's pioneering Tensor Cores through Turing, Ampere, Hopper, and the latest Blackwell—highlighting key innovations such as mixed‑precision support, NVLink, and specialized Tensor Core designs that have dramatically boosted AI training and inference performance.

AI hardware · Deep Learning · GPU architecture

0 likes · 10 min read

Architects' Tech Alliance

Jul 19, 2025 · Artificial Intelligence

Best GPU Cluster Network for Large‑Scale AI: NVLink, InfiniBand, RoCE & DDC

This article compares the main networking technologies used in large‑scale AI GPU clusters—NVLink, InfiniBand, RoCE Ethernet, and the emerging DDC full‑schedule fabric—examining latency, lossless transmission, congestion control, cost, power and scalability to help engineers choose the optimal solution for training massive language models.

AI training · DDC · Data center

0 likes · 15 min read

Instant Consumer Technology Team

Jul 11, 2025 · Artificial Intelligence

Why NVLink Boosts Multi‑GPU Inference: Tensor Parallelism Explained

A recent migration of a multimodal image inference system from an internal network to a cloud environment revealed that NVLink bridges dramatically improve multi‑GPU inference speed by reducing inter‑GPU communication overhead, while tensor‑parallel and data‑parallel strategies each have distinct trade‑offs for model deployment.

AI Performance · Data Parallel · GPU inference

0 likes · 11 min read

Architects' Tech Alliance

May 26, 2025 · Artificial Intelligence

NVLink Fusion: NVIDIA’s High‑Bandwidth Interconnect for Heterogeneous AI Computing

NVLink Fusion, unveiled at Computex 2025, extends NVIDIA’s NVLink technology to enable high‑bandwidth, low‑latency connections between CPUs and GPUs or third‑party accelerators, offering up to 900 GB/s bandwidth, flexible heterogeneous configurations, ecosystem expansion, performance gains for AI training and inference, and potential cost reductions.

AI · CPU · Data center

0 likes · 12 min read

Architects' Tech Alliance

May 13, 2025 · Industry Insights

How NVIDIA Builds AI Supercomputers: From H100 to GH200 and GB200 SuperPods

This article analyzes NVIDIA's evolving AI supercomputer architectures—detailing the H100‑based 256‑GPU SuperPod, the GH200‑based 256‑GPU SuperPod with integrated Grace CPU, and the GB200‑based 576‑GPU SuperPod—examining their NVLink and InfiniBand topologies, bandwidth limits, and scalability challenges.

AI · GPU · HPC

0 likes · 11 min read

Architects' Tech Alliance

Apr 28, 2025 · Artificial Intelligence

NVLink High‑Speed Interconnect: Architecture, Evolution, and Performance

NVLink, NVIDIA's high‑bandwidth interconnect introduced with the P100 GPU, supplements PCIe with significantly higher data rates and lower latency for GPU‑GPU and GPU‑CPU communication, and has evolved through multiple generations to support modern AI and high‑performance computing workloads.

AI acceleration · GPU interconnect · NVLink

0 likes · 9 min read

Architects' Tech Alliance

Apr 8, 2025 · Artificial Intelligence

How NVSwitch Revolutionizes Multi‑GPU Interconnect for AI Workloads

This article examines NVIDIA's NVSwitch technology, explaining why it was needed, how it builds on NVLink to overcome PCIe bottlenecks, tracing its evolution from Pascal to the third‑generation design, and detailing its architectural features, scalability, full‑duplex bandwidth, non‑blocking communication, and optimized network topologies for high‑performance AI and HPC systems.

AI hardware · GPU interconnect · High‑performance computing

0 likes · 9 min read

Architects' Tech Alliance

Apr 6, 2025 · Fundamentals

PCIe vs NVLink: How Modern GPU Interconnects Power AI Training

As AI models grow to trillion‑parameter scales, training them demands massive GPU clusters whose performance is increasingly limited by network bandwidth; this article examines why traditional PCIe interconnects become bottlenecks and how NVIDIA's NVLink and NVSwitch technologies dramatically improve multi‑GPU communication and overall system efficiency.

AI training · GPU · High‑performance computing

0 likes · 12 min read

Architects' Tech Alliance

Apr 3, 2025 · Artificial Intelligence

Why NVLink and NVSwitch Are Essential for Training Massive AI Models

Training today's massive AI foundation models demands extensive GPU resources and sophisticated multi‑GPU communication, making technologies like NVLink and NVSwitch crucial for efficient distributed training, while data‑parallel and model‑parallel strategies together optimize performance across large‑scale hardware clusters.

AI · Distributed Training · GPU

0 likes · 8 min read

Baidu Intelligent Cloud Tech Hub

Mar 3, 2025 · Cloud Computing

How Baidu Cloud Optimizes GPU Servers for AI Workloads

This article explains the design and implementation of GPU cloud servers, covering data processing pipelines, hardware selection, topology, interconnect technologies, virtualization, multi‑GPU communication methods, and Baidu's practical solutions for both virtualized and bare‑metal instances to boost AI inference and training performance.

AI · GPU · NVLink

0 likes · 29 min read

Feb 8, 2025 · Artificial Intelligence

Why 8‑GPU Servers Are Essential for LLM Training and Which Interconnect Wins

With modern large‑language‑model workloads demanding massive parallelism, 8‑GPU servers have become the norm; this article explains the roles of CPUs, compares GPU‑to‑GPU interconnect options—including PCIe direct, PCIe Switch, NVLink, and NVSwitch—detailing their architectures, bandwidths, topologies, and trade‑offs for AI training.

8-GPU server · AI training · GPU interconnect

0 likes · 14 min read

Baobao Algorithm Notes

Jan 14, 2025 · Industry Insights

Why NVLink Supercharges Llama 3 70B Inference: A Deep Performance Breakdown

An in‑depth analysis shows that NVLink 3.0 reduces all‑reduce communication latency for Llama 3 70B inference from over 1.8 seconds to under 100 ms, delivering a dramatic speedup compared with PCIe 4.0 and highlighting the critical role of high‑bandwidth interconnects in large‑model deployments.

All-reduce · GPU inference · Llama 3

0 likes · 5 min read

Linux Kernel Journey

Dec 22, 2024 · Artificial Intelligence

Understanding GPU Monitoring: Utilization Metrics and Failure Scenarios

This article systematically reviews GPU monitoring for large‑scale AI training, covering MFU/HFU definitions, key DCGM metrics, NVLink bandwidth, common failure codes such as Xid and SXid, experimental insights on T4 and H100 GPUs, and practical case studies for diagnosing and mitigating performance drops.

DCGM · GPU failures · GPU monitoring

0 likes · 26 min read

Architects' Tech Alliance

Dec 11, 2024 · Fundamentals

Unlocking GPU Computing: PCIe, NVLink, NVSwitch, and HBM Explained

This article breaks down the core components of high‑performance GPU servers—including PCIe switch chips, the evolution of NVLink from version 1.0 to 4.0, NVSwitch architecture, HBM memory tiers, and the nuances of bandwidth units—providing a comprehensive technical foundation for large‑scale model training.

GPU computing · HBM · High‑performance computing

0 likes · 10 min read

Architects' Tech Alliance

Sep 23, 2024 · Artificial Intelligence

Venado Supercomputer: Architecture, Performance, and Design Insights

The Venado supercomputer, built for Los Alamos National Laboratory, combines Nvidia Grace CPUs with Hopper GPUs, leverages high‑bandwidth memory and Slingshot interconnects, and targets a balanced 80/20 CPU‑GPU workload split to support demanding AI and HPC applications.

Grace CPU · HPC · Los Alamos

0 likes · 13 min read

Architects' Tech Alliance

Sep 3, 2024 · Industry Insights

How NVIDIA Grace Hopper Superchip Redefines HPC and AI Performance

The article provides an in‑depth technical overview of NVIDIA's Grace Hopper superchip, detailing its heterogeneous CPU‑GPU architecture, high‑bandwidth NVLink‑C2C interconnect, unified memory model, programming support, and system‑level scaling features that together deliver unprecedented performance for high‑performance computing and large‑scale AI workloads.

AI · Grace Hopper · HPC

0 likes · 20 min read

Architects' Tech Alliance

Aug 29, 2024 · Industry Insights

How NVIDIA Builds 256‑GPU and 576‑GPU SuperPods with H100, GH200, and GB200 Interconnects

The article analyzes NVIDIA's DGX SuperPOD architectures across three GPU generations—H100, GH200, and GB200—detailing their NVLink/NVSwitch topologies, bandwidth calculations, scalability limits, and the practical challenges of constructing 256‑GPU and 576‑GPU supercomputing clusters.

Data center · GPU · High‑performance computing

0 likes · 11 min read

Architects' Tech Alliance

Jul 7, 2024 · Operations

Overview of Popular GPU/TPU Cluster Networking Technologies: NVLink, InfiniBand, RoCE, and DDC

This article reviews the main GPU/TPU cluster networking solutions—including NVLink, InfiniBand, RoCE Ethernet, and DDC full‑schedule fabrics—examining their latency, loss‑free transmission, congestion control, cost, scalability, and suitability for large‑scale LLM training workloads.

AI training · DDC · GPU networking

0 likes · 16 min read

Architects' Tech Alliance

Jun 13, 2024 · Industry Insights

How Nvidia’s New Blackwell GPUs and NVLink Redefine AI Acceleration in 2024

The article analyzes Nvidia's latest AI‑focused hardware and software breakthroughs showcased at Computex 2024, detailing how GPU‑CPU hybrid architectures, new libraries, and high‑speed interconnects like NVLink dramatically boost performance while keeping power and cost growth modest.

AI acceleration · Blackwell · DGX

0 likes · 12 min read

Architects' Tech Alliance

Jun 10, 2024 · Artificial Intelligence

NVLink vs PCIe GPUs: Which Nvidia AI Server Fits Your Workload?

This article compares Nvidia's NVLink (SXM) and PCIe GPU versions for AI servers, detailing their architectures, bandwidth, power consumption, and ideal use cases, helping readers choose the optimal configuration based on performance needs and budget constraints.

AI servers · GPU · NVLink

0 likes · 8 min read

IT Services Circle

Jun 6, 2024 · Artificial Intelligence

Nvidia Unveils Blackwell GPU and AI Supercomputing Roadmap

Nvidia’s latest Blackwell GPU, presented by Jensen Huang, promises unprecedented performance and energy efficiency for large‑scale AI models, while the company also showcases accelerated computing, NVLink interconnects, AI‑optimized DGX servers, the NIM platform for rapid LLM deployment, and ambitious projects such as Earth‑2 digital twins and next‑generation embodied AI robots.

AI · Blackwell · GPU

0 likes · 18 min read

Architects' Tech Alliance

May 15, 2024 · Artificial Intelligence

Detailed Overview of GPU Server Architectures: A100/A800 and H100/H800 Nodes

This article provides a comprehensive technical overview of large‑scale GPU server architectures, detailing the component topology of 8‑GPU A100/A800 and H100/H800 nodes, explaining storage network cards, NVSwitch interconnects, bandwidth calculations, and the trade‑offs between RoCEv2 and InfiniBand for AI workloads.

GPU · High‑performance computing · NVLink

0 likes · 13 min read

Architects' Tech Alliance

May 14, 2024 · Fundamentals

Fundamentals of GPU Computing: PCIe, NVLink, NVSwitch, and HBM

This article provides a comprehensive overview of the core components and terminology of large‑scale GPU computing, covering GPU server architecture, PCIe interconnects, NVLink generations, NVSwitch, high‑bandwidth memory (HBM), and bandwidth unit considerations for AI and HPC workloads.

AI hardware · GPU computing · HBM

0 likes · 11 min read

Architects' Tech Alliance

May 11, 2024 · Industry Insights

Why Network Interconnects Are the New Bottleneck for Large‑Model AI Training

The rapid growth of AI large‑model training and inference is driving unprecedented demand for compute and high‑speed networking, prompting a shift from traditional GPU clusters to super‑pooled intelligent computing centers that must balance multiple intra‑ and inter‑node interconnect solutions such as NVLink, OAM/UBB, InfiniBand and RoCEv2.

AI · Data center · InfiniBand

0 likes · 6 min read

Architects' Tech Alliance

May 1, 2024 · Industry Insights

How NVIDIA’s Blackwell Platform Redefines AI Supercomputing Networks

The article examines NVIDIA’s Blackwell platform network architecture, detailing the fifth‑generation NVLink, sixth‑generation PCIe, 800 Gb/s InfiniBand and Ethernet adapters, the DGX B200 and GB200 configurations, new IB and Ethernet switches, and the implications of increased optical module demands for large‑scale AI clusters.

AI supercomputing · Blackwell · DGX

0 likes · 10 min read

Architects' Tech Alliance

Apr 23, 2024 · Industry Insights

Which GPU Cluster Network Wins for LLM Training? NVLink, InfiniBand, RoCE & DDC Compared

This article analyzes the main GPU/TPU cluster networking options—NVLink, InfiniBand, RoCE Ethernet, and DDC full‑schedule fabrics—examining latency, lossless transmission, congestion control, cost, power, and scalability to determine their suitability for large‑scale LLM training.

DDC · Data center fabrics · GPU networking

0 likes · 18 min read

Architects' Tech Alliance

Apr 16, 2024 · Industry Insights

Inside AI Servers: PCIe, NVLink, and NVSwitch Driving the Next‑Gen Compute

Based on TrendForce data, AI server shipments are projected to grow at a 12.2% CAGR through 2027, while advances in PCIe switching, retiming chips, and high‑speed GPU interconnects such as NVLink and NVSwitch are reshaping the architecture and performance of next‑generation AI compute platforms.

AI servers · GPU interconnect · High‑performance computing

0 likes · 11 min read

Architects' Tech Alliance

Apr 15, 2024 · Industry Insights

How NVIDIA NVLink is Transforming HPC and AI: Architecture, Switches, and Network Comparisons

This article provides an in‑depth technical analysis of NVIDIA NVLink, covering its evolution, the NVSwitch chip, NVLink‑enabled servers and switches, and a performance comparison with InfiniBand networks, highlighting its impact on high‑performance computing and artificial intelligence workloads.

GPU interconnect · HPC · NVLink

0 likes · 9 min read

Architects' Tech Alliance

Apr 15, 2024 · Artificial Intelligence

Decoding GPU Server Topologies: From PCIe to NVLink for Large‑Model Training

This article provides a detailed technical overview of modern multi‑GPU server architectures—including PCIe switches, NVLink, NVSwitch, and HBM—explaining their hardware topologies, bandwidth characteristics, monitoring methods, and network choices to help engineers design efficient AI training clusters.

AI training · GPU · HBM

0 likes · 18 min read

Architects' Tech Alliance

Apr 8, 2024 · Fundamentals

Unlocking GPU Server Architecture: PCIe, NVLink, NVSwitch & HBM Explained

This article provides a comprehensive breakdown of high‑performance GPU server infrastructure, covering PCIe generations, NVLink evolution, NVSwitch and NVLink switches, HBM memory technologies, and bandwidth measurement units, helping readers understand the hardware connections and performance considerations essential for large‑scale model training.

GPU architecture · HBM · High‑performance computing

0 likes · 10 min read

Architects' Tech Alliance

Apr 6, 2024 · Industry Insights

How NVIDIA’s Blackwell GB200 NVL72 Redefines AI Compute with 10 TB/s Interconnect

The article analyses NVIDIA’s new Blackwell platform, focusing on the GB200 NVL72 GPU and its 10 TB/s NVLink‑C2C interconnect, detailing massive training and inference speedups, rack‑level DGX SuperPOD architecture, copper‑cable trends, and the broader impact on AI‑driven data‑center workloads.

AI · Blackwell · GPU

0 likes · 13 min read

Architects' Tech Alliance

Apr 2, 2024 · Artificial Intelligence

Evolution and Forecast of Nvidia NVLink, NVLink C2C, and B100/X100 GPU Architectures

The article analyses the historical evolution of Nvidia's NVLink and NVLink C2C interconnect technologies, compares them with PCIe, Ethernet and InfiniBand, and uses these trends to predict future AI‑chip architectures such as the B100 and X100 GPUs, highlighting design trade‑offs and packaging challenges.

AI Chip · B100 · GPU architecture

0 likes · 15 min read

Architects' Tech Alliance

Mar 31, 2024 · Industry Insights

How Many Optical Modules Do A100, H100, and GH200 AI Clusters Really Need?

This article analyzes the evolving data‑center network architectures for large AI clusters, detailing leaf‑spine and Fat‑Tree designs, NVLink interconnects, and calculating the precise optical‑module requirements for NVIDIA A100, H100, and GH200 deployments, while also comparing industry examples from Meta, AWS, and Google.

AI clusters · Fat-Tree · NVLink

0 likes · 12 min read

Architects' Tech Alliance

Mar 24, 2024 · Artificial Intelligence

NVLink vs PCIe GPUs: Which NVIDIA Server GPU Wins for Your AI Workload?

This article compares NVIDIA's NVLink (SXM) and PCIe GPU versions for AI servers, detailing their architectures, bandwidth, power consumption, and ideal use cases, and provides guidance on selecting the right GPU based on workload size, flexibility, and cost considerations.

AI servers · GPU · NVLink

0 likes · 9 min read

Architects' Tech Alliance

Mar 18, 2024 · Industry Insights

Why Nvidia’s NVLink C2C Is Redefining GPU‑CPU Interconnects

The article provides an in‑depth technical analysis of Nvidia’s NVLink C2C interconnect, comparing its latency, bandwidth, power efficiency, density and cost against traditional SerDes solutions and examining its role in building SuperChip architectures with Grace CPUs and Hopper GPUs.

GPU · NVLink · cost analysis

0 likes · 12 min read

Architects' Tech Alliance

Mar 12, 2024 · Industry Insights

What’s Nvidia’s 2024‑2025 AI Chip Roadmap? A Deep Dive into GPUs, CPUs, and Interconnects

The article analyzes Nvidia’s 2023 investor‑meeting roadmap, revealing an annual GPU release cadence with H200, B100 and X100 chips, a unified "One Architecture" strategy spanning x86 and ARM, accelerated interconnects like NVLink‑C2C, and competitive pressures shaping its AI ecosystem.

AI hardware · GPU roadmap · Industry analysis

0 likes · 20 min read

Architects' Tech Alliance

Feb 29, 2024 · Industry Insights

Choosing the Right GPU Cluster Network: NVLink, InfiniBand, RoCE & DDC Explained

This article examines the key GPU/TPU cluster networking options—NVLink, InfiniBand, RoCE Ethernet, and emerging DDC full‑scheduling fabrics—detailing their latency, loss‑less transmission, congestion control, cost, power, and scalability considerations for large‑scale AI training deployments.

AI training · DDC fabric · GPU networking

0 likes · 18 min read

Architects' Tech Alliance

Jan 26, 2024 · Industry Insights

How AI Servers Connect: Inside PCIe, NVLink, and Memory Interfaces

The article provides an in‑depth industry analysis of AI server hardware, covering shipment forecasts, NVIDIA DGX H100 specifications, the role of PCIe switches and retimers, the evolution of NVLink/NVSwitch, and the market dynamics of DDR4/DDR5 memory interface chips.

AI servers · Hardware · Industry analysis

0 likes · 12 min read

Architects' Tech Alliance

Dec 24, 2023 · Artificial Intelligence

Overview of Popular GPU/TPU Cluster Networking Technologies for LLM Training

This article examines the main GPU/TPU cluster networking options—including NVLink, InfiniBand, RoCE Ethernet Fabric, and DDC full‑schedule networks—explaining their latency, loss‑less transmission, congestion control, cost, scalability, and suitability for large‑scale LLM training workloads.

GPU networking · InfiniBand · LLM training

0 likes · 18 min read

Architects' Tech Alliance

Aug 21, 2023 · Artificial Intelligence

AI Compute Landscape: GPU Architectures, Tensor Cores, NVLink, and Scaling Challenges

The article surveys the AI compute ecosystem, explaining why CPUs are unsuitable for AI workloads, how heterogeneous CPU‑plus‑accelerator designs dominate, and detailing the evolution of NVIDIA GPUs, Tensor Cores, memory technologies, and inter‑GPU networking that enable large‑scale model training.

AI compute · GPU clustering · NVLink

0 likes · 11 min read

Architects' Tech Alliance

Jan 30, 2023 · Fundamentals

NVIDIA Grace CPU Superchip: Architecture, Performance, and Key Features

The article provides a detailed overview of NVIDIA's Grace CPU Superchip, describing its Arm‑based architecture, NVLink‑C2C interconnect, scalable coherency fabric, high‑bandwidth LPDDR5X memory, extensive I/O options, and software ecosystem, highlighting its suitability for HPC and AI workloads.

AI · Arm Neoverse · CPU

0 likes · 10 min read

Architects' Tech Alliance

Dec 30, 2020 · Artificial Intelligence

Understanding GPUs, AI Accelerators, and Market Trends

The article explains GPU evolution, its integration with CPUs, interconnect technologies like PCIe and NVLink, market shares of NVIDIA, AMD and Intel, AI accelerator types (GPU, FPGA, ASIC), and the roles of training and inference in cloud AI, while also promoting a paid 182‑page PPT resource.

AI accelerator · GPU · HPC

0 likes · 7 min read

Architects' Tech Alliance

Oct 28, 2020 · Artificial Intelligence

Understanding NVIDIA NVLink: Architecture, Features, and Applications

The article introduces NVIDIA’s third‑generation NVLink technology, detailing its high‑bandwidth GPU‑GPU and GPU‑CPU interconnect, key architectural breakthroughs such as the Ampere‑based A100 GPU, multi‑instance GPU, and NVSwitch, and discusses its impact on AI, HPC, and graphics workloads.

GPU interconnect · High-performance computing · NVLink

0 likes · 7 min read

Architects' Tech Alliance

Feb 2, 2019 · Artificial Intelligence

An Overview of NVIDIA NVLink: Architecture, Topology, and Performance

This article explains NVIDIA's NVLink interconnect technology, covering its history, protocol layers, bandwidth advantages over PCIe, topologies such as the HGX-1/DGX-1 mesh, the NVSwitch extension, and performance gains for deep‑learning and high‑performance computing workloads.

AI acceleration · GPU interconnect · NVLink

0 likes · 7 min read

Architects' Tech Alliance

Feb 1, 2019 · Industry Insights

How GPUDirect P2P Boosts Multi‑GPU Performance and What Limits It in Virtualized Environments

This article explains the background of GPU communication, details NVIDIA's GPUDirect and its Peer‑to‑Peer features, discusses virtualization challenges, and presents performance measurements on an Alibaba Cloud GN5 instance showing latency reduction and near‑linear scaling for deep‑learning workloads.

Deep Learning · GPU communication · GPUDirect

0 likes · 6 min read
