Tagged articles
36 articles
Architects' Tech Alliance
May 14, 2026 · Artificial Intelligence

Jensen Huang’s China Visit: Could It Revive GPU Prospects? Inside Nvidia’s DGX H200 Cluster Design

The article reviews the US‑approved export of Nvidia's DGX H200, the lack of deliveries, Jensen Huang’s surprise China trip that may speed approvals, and then provides a detailed technical breakdown of the DGX H200 cluster’s compute and storage networking, topology, optical link choices, and cable count estimates.

AI Infrastructure · DGX H200 · Data Center Networking
0 likes · 8 min read
Architects' Tech Alliance
Jul 8, 2025 · Fundamentals

Why Modern Data Center Switches Are the Backbone of AI Scaling

This article explains how data‑center switches are classified, the key components and performance metrics of Ethernet switch chips, market growth trends, the shift from OEO to full‑optical OCS designs, and how RDMA technologies like InfiniBand and RoCEv2 enable the low‑latency networking essential for large‑scale AI training.

AI acceleration · Data Center Networking · RDMA
0 likes · 12 min read
Architects' Tech Alliance
Jun 10, 2025 · Fundamentals

Why RDMA Is Revolutionizing High‑Performance Computing and AI

This article explores how Remote Direct Memory Access (RDMA) technology transforms high‑performance computing, artificial intelligence, and cloud storage by eliminating data copies, bypassing the kernel, and offloading protocols to hardware, while reviewing key metrics, product ecosystems, real‑world use cases, challenges, and future trends.

DPU · Data Center Networking · High‑performance computing
0 likes · 11 min read
Architects' Tech Alliance
May 5, 2025 · Industry Insights

Why Optical Interconnects Are Overtaking Copper in Data Centers

As AI, cloud, and 5G drive exponential data growth, traditional copper interconnects hit physical limits, prompting a shift to optical solutions, especially linear‑drive optics and UCIe standards, that promise higher bandwidth, longer reach, and better energy efficiency despite latency and power trade‑offs.

Co-Packaged Optics · Data Center Networking · High-speed interconnect
0 likes · 14 min read
ByteDance SYS Tech
Mar 7, 2025 · Fundamentals

How NUMA‑Aware MPTCP Flow Selection Boosts Throughput and Cuts Latency

At Netdev 0x19, ByteDance's STE team presented two talks—one on a NUMA‑locality‑aware MPTCP flow‑selection strategy that can raise throughput by up to 30% and lower tail latency by 6%, and another on a DPDK‑based user‑space MPTCP stack that reduces latency by nearly 10% and more than doubles throughput—showcasing practical performance gains for data‑center networking.

DPDK · Data Center Networking · MPTCP
0 likes · 8 min read
Alibaba Cloud Infrastructure
Jan 20, 2025 · Cloud Computing

2024 Alibaba Cloud Infrastructure Network Team: AI‑Scale Network Innovations, Academic Achievements, Open‑Source Contributions and Industry Outreach

The 2024 report of Alibaba Cloud's Infrastructure Network team details AI‑driven network breakthroughs, high‑performance protocol stacks, large‑scale monitoring systems, numerous top‑conference paper acceptances, open‑source ecosystem initiatives, and extensive industry outreach, highlighting the evolving AI infra landscape.

AI Infrastructure · Conference Papers · Data Center Networking
0 likes · 19 min read
Architects' Tech Alliance
Jan 18, 2025 · Industry Insights

Why Co‑Packaged Optics Are Redefining Data Center Networks

The article analyzes how Co‑Packaged Optics (CPO) and silicon photonics address exploding data‑center bandwidth demands, reduce power consumption, and enable AI‑driven workloads, while outlining industry roadmaps, major vendor contributions, and future technical challenges.

AI workloads · Co-Packaged Optics · Data Center Networking
0 likes · 14 min read
Architects' Tech Alliance
Dec 12, 2024 · Industry Insights

Why White‑Box Switches Are Redefining Data Center Networks

The article analyzes white‑box switches and optical circuit switching, detailing their hardware‑software decoupling, market share growth, benefits such as lower cost and power consumption, challenges like traffic prediction, and how AI and open‑source solutions like SONiC are driving their adoption in modern data centers.

AI traffic prediction · Data Center Networking · SONiC
0 likes · 8 min read
Architects' Tech Alliance
Nov 29, 2024 · Industry Insights

How AI Workloads Are Driving the Rise of All‑Optical Switches

The article examines the shift from optical‑to‑electrical‑to‑optical (OEO) to fully optical (OOO) switching, highlighting Lightmatter's Passage technology and Google's large‑scale OCS deployment as key responses to growing AI compute demands in data‑center networks.

AI hardware · Data Center Networking · Google
0 likes · 7 min read
Architects' Tech Alliance
Nov 7, 2024 · Industry Insights

Why RDMA, InfiniBand, and RoCE Are Redefining High‑Performance Data Center Networks

This article examines the evolution from the OSI and TCP/IP models to RDMA‑based technologies, compares traditional three‑tier and leaf‑spine architectures, analyzes NVIDIA SuperPOD designs, and evaluates Ethernet, InfiniBand, and RoCE switches to guide high‑throughput, low‑latency data‑center networking decisions.

Data Center Networking · High‑performance computing · InfiniBand
0 likes · 13 min read
Architects' Tech Alliance
Oct 11, 2024 · Industry Insights

Why Common Network Misconceptions Hurt AI Performance and How to Fix Them

The article explains how prevalent misunderstandings in data‑center network design—such as altering end‑to‑end link speeds, overlooking switch radix, and choosing inappropriate buffering architectures—can increase latency and reduce AI workload efficiency, and it outlines the benefits of InfiniBand, cut‑through switching, scalable radix, and resilient AI‑cloud management solutions.

AI · Buffer Architecture · Cut-through Switching
0 likes · 9 min read
Open Source Linux
Jun 5, 2024 · Operations

Unraveling Data Center Congestion: Incast, ECN, and PFC Explained

This article examines why data‑center networks experience congestion, detailing many‑to‑one and all‑to‑all traffic patterns, the role of incast, and how mechanisms such as ECN and PFC can be tuned to achieve loss‑free, low‑latency communication.

CLOS · Data Center Networking · ECN
0 likes · 10 min read
Architects' Tech Alliance
Jun 1, 2024 · Industry Insights

Why Do Data Center Networks Congest? Unpacking Many‑to‑One and All‑to‑All Incast Scenarios

The article analyzes how CLOS spine‑leaf data‑center networks encounter congestion under many‑to‑one and all‑to‑all traffic patterns, explains the limitations of simply enlarging buffers, and details how ECN and PFC mechanisms can be tuned to achieve loss‑less, low‑latency operation.

CLOS · Data Center Networking · ECN
0 likes · 12 min read
Architects' Tech Alliance
May 7, 2024 · Operations

Why ECMP Struggles in AI‑Driven Data Centers and Better Load‑Balancing Alternatives

As AI workloads drive intelligent‑computing capacity to grow at more than 50% CAGR, data‑center networks depend on massive numbers of parallel paths, where traditional ECMP load balancing falls short and causes severe congestion, while finer‑grained schemes such as packet spraying, flowlet switching, and cell‑based balancing offer higher bandwidth utilization and fairness.

AI workloads · Data Center Networking · ECMP
0 likes · 17 min read
Linux Code Review Hub
Apr 7, 2024 · Industry Insights

A Decade of RDMA: Lessons Learned from Protocol Evolution

The article reviews ten years of RDMA development, tracing its origins, the rise and pitfalls of RoCEv1/v2, alternative approaches like iWARP and Cisco usNIC, and recent modernizations such as AWS SRD, Google Falcon and UltraEthernet, highlighting why protocol design choices have repeatedly stalled industry progress.

AI Accelerators · Data Center Networking · RDMA
0 likes · 27 min read
Linux Code Review Hub
Feb 26, 2024 · Fundamentals

Understanding Ethernet Flow Control and Congestion Management (Part 1)

This article explains Ethernet flow‑control mechanisms (LLFC and PFC), how pause frames and their quanta are calculated, the role of pause and resume thresholds (XOFF/XON), headroom and footroom concepts, buffer‑queue management, and provides Cisco Nexus configuration examples for lossless storage networks.

Cisco Nexus · Congestion Management · Data Center Networking
0 likes · 19 min read
Architects' Tech Alliance
Dec 6, 2023 · Artificial Intelligence

The Relationship Between Switches, Network Protocols, and AI in Modern Data Centers

This article explains how network protocols and switch architectures, including OSI layers, TCP/IP, RDMA, InfiniBand, RoCE, and leaf‑spine designs, support high‑throughput, low‑latency AI and HPC workloads, compares Ethernet and InfiniBand markets, and examines NVIDIA’s Spectrum‑X and SuperPOD solutions.

AI · Data Center Networking · InfiniBand
0 likes · 11 min read
Open Source Linux
Nov 14, 2023 · Operations

Understanding Network Virtualization: VXLAN, NVGRE, STT, and SPBM Explained

This article explains how network virtualization decouples logical and physical networks, introduces Underlay and Overlay architectures, and compares four major overlay protocols—VXLAN, NVGRE, STT, and SPBM—highlighting their mechanisms and benefits for modern data‑center design.

Data Center Networking · NVGRE · Network Virtualization
0 likes · 10 min read
Open Source Linux
Dec 12, 2022 · Cloud Computing

Why VXLAN Is the Key to Scalable Data Center Networks

This article explains how VXLAN overcomes traditional data‑center network limits on VM scale, isolation, and migration by using MAC‑in‑UDP encapsulation, a 24‑bit VNI, and BGP EVPN control plane, and shows its practical deployment in cloud‑campus environments with gateways, VTEP, NVE, and VBDIF.

BGP EVPN · Cloud Campus · Data Center Networking
0 likes · 15 min read
Alibaba Cloud Infrastructure
Nov 2, 2022 · Operations

Network Performance Anomaly Detection with In‑band Telemetry and High‑Performance Congestion Control (HPCC++) at the 2022 OCP Global Summit

At the 2022 OCP Global Summit in San Jose, Alibaba and Broadcom presented two technical talks covering in‑band telemetry‑based network performance anomaly detection and the HPCC++ congestion‑control algorithm, highlighting deployment challenges, resource trade‑offs, and real‑world data‑center use cases.

Data Center Networking · HPCC · OCP Summit
0 likes · 6 min read
Open Source Linux
Oct 10, 2022 · Fundamentals

Why Copper Cables Still Matter in Modern Data Centers

Despite the rapid rise of fiber optics in data centers, copper cables remain essential for voice transmission, power delivery, and short‑range networking, offering easier installation, lower cost, and compatibility with legacy systems, making them unlikely to be fully replaced.

Data Center Networking · cable types · copper cable
0 likes · 9 min read
Alibaba Cloud Infrastructure
Jun 7, 2022 · Industry Insights

Why Alibaba Cloud Ranks Among the Top 10 Global Network Research Institutions

The 2022 AI‑2000 ranking highlights Alibaba Cloud as one of the ten most influential network research institutions worldwide, detailing its extensive publication record, breakthrough low‑latency RDMA technologies, NFC distance expansion, and the XLINK QUIC protocol that collectively reshape data‑center and wide‑area networking.

AI 2000 · Alibaba Cloud · Data Center Networking
0 likes · 4 min read
Architects' Tech Alliance
Apr 23, 2022 · Cloud Computing

Understanding VXLAN: Architecture, Benefits, and Comparison with Other Overlay Technologies

This article provides a comprehensive overview of VXLAN, explaining its purpose, architecture, frame format, VTEP implementations, and advantages over VLAN, Q‑in‑Q, MPLS, and other overlay protocols, while also discussing control‑plane options such as BGP EVPN and SDN automation for modern data‑center networks.

BGP EVPN · DPU · Data Center Networking
0 likes · 14 min read
Architects' Tech Alliance
Oct 24, 2021 · Cloud Computing

Why VXLAN Is the Key to Scaling Modern Data Centers

This article explains how VXLAN technology overcomes traditional data‑center network constraints—such as limited MAC tables, VLAN isolation, and VM migration scope—by using MAC‑in‑UDP encapsulation, VNI identifiers, and BGP EVPN control, and demonstrates its practical deployment in a CloudCampus solution.

BGP EVPN · CloudCampus · Data Center Networking
0 likes · 16 min read
Open Source Linux
Jul 12, 2021 · Fundamentals

Why Do Modern Data Centers Need a Large Second‑Layer Network?

Data centers adopt large second‑layer (Layer 2) networks to overcome the limitations of traditional Layer 2 and Layer 3 architectures, enabling seamless virtual machine migration across hosts without IP changes, improving utilization, reducing downtime, and supporting the scalability demands of server virtualization and cloud environments.

Data Center Networking · second layer network · server virtualization
0 likes · 8 min read
Architects' Tech Alliance
Jan 10, 2021 · Industry Insights

Why RoCE Is Revolutionizing Data Center Networking: A Deep Dive into RDMA over Ethernet

This article explains the fundamentals of RDMA and RoCE, compares RoCE v1 and v2, outlines deployment steps, highlights performance benefits such as low CPU usage and zero‑copy, and answers common questions about its differences from iWARP and InfiniBand, helping data‑center engineers evaluate the technology.

Data Center Networking · High Bandwidth · Low latency
0 likes · 8 min read
UCloud Tech
Jan 16, 2020 · Operations

How to Build a Low‑Latency, Lossless RoCE Network for High‑Performance Data Centers

This article explains how to design a low‑overhead, high‑performance lossless RoCE network for data centers, covering RDMA basics, mainstream network options, QoS, lossless and congestion‑control designs, buffer management, deadlock analysis, and practical tuning to achieve sub‑100 µs latency and near‑full bandwidth utilization.

Data Center Networking · Lossless Ethernet · QoS
0 likes · 21 min read
Efficient Ops
Jun 19, 2019 · Operations

How to Build a Pure Three‑Tier Server Access Network Without Overlay

This article examines the evolution of data‑center server access networks, explains why traditional large Layer‑2 designs are problematic at scale, and presents a pure three‑tier underlay solution that uses host routing, ECMP, and ARP proxy to achieve seamless KVM communication without overlay overhead.

ARP proxy · BGP · Data Center Networking
0 likes · 21 min read
Architects' Tech Alliance
Jun 8, 2017 · Cloud Computing

Mellanox InfiniBand Technology Overview: Architecture, Protocol Stack, and Product Portfolio

This article provides a comprehensive overview of Mellanox's InfiniBand solutions, covering the company's background, network architecture, routing algorithms, Fat‑Tree topology, the OFED software stack, management tools, MPI support, adapters, switches, routers, cables, and related products for high‑performance computing and cloud data centers.

Data Center Networking · Fat-Tree · High-Performance Computing
0 likes · 21 min read
Efficient Ops
Sep 1, 2016 · Operations

Why Network Ops Remains the Unsung Hero: Pain Points and the Future of SDN

The article examines long‑standing pain points in network operations—from industry bias and costly manual tasks to data‑center networking and interconnect challenges—while exploring how SDN and modern automation can reshape the role of network engineers for more resilient, business‑driven infrastructures.

DCI · DCN · Data Center Networking
0 likes · 17 min read