
Why Rubin288’s Orthogonal CLOS Architecture Beats Traditional Designs

This article analyzes NVIDIA's Rubin288 high‑density GPU cabinet, compares its orthogonal CLOS architecture with older non‑orthogonal designs, and explains how the new layout improves reliability, bandwidth, scalability, and cooling in modern data‑center HPC deployments.

Architects' Tech Alliance

Introduction

Rubin288 is NVIDIA's next‑generation high‑density GPU rack that can accommodate up to 288 GPUs in a single cabinet, delivering four times the density of its predecessor NVL72.

Problems with Legacy Cable‑Tray Architecture

Earlier designs used a cable‑tray (non‑orthogonal) architecture in which compute and switch boards sat parallel to each other and connected through a backplane. Routing every link through the backplane caused signal interference, limited bandwidth upgrades, and required complex PCB routing, which lowered manufacturing yield and hurt reliability.

The non‑orthogonal CLOS architecture is typical in campus‑core networking equipment such as Huawei S12700E‑08, H3C S10500X series, and Ruijie RG‑N18010‑E series.

Orthogonal CLOS Architecture

In the orthogonal design, compute and switch boards are mounted at 90° to each other and mate directly through high‑speed orthogonal connectors, which eliminates the backplane. The shorter signal path reduces attenuation, increases bandwidth, and allows capacity to scale to hundreds of Tbps.

Hardware using this zero‑backplane approach includes the Huawei CloudEngine 16800 series, the H3C S12500X series, and the Ruijie RG‑N18010‑XH.
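
The direct‑connect idea can be pictured as a bipartite graph. The following is a minimal sketch, assuming every compute board mates with every switch board through one orthogonal connector; the board counts and the helper names build_links and path are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from itertools import product


def build_links(num_compute: int, num_switch: int) -> set[tuple[str, str]]:
    """Return one link per (compute board, switch board) pair.

    Assumption: a full mesh between the two board types with no backplane
    in between. A real chassis may wire its connectors differently.
    """
    compute = [f"c{i}" for i in range(num_compute)]
    switch = [f"s{j}" for j in range(num_switch)]
    return set(product(compute, switch))


def path(links: set[tuple[str, str]], src: str, dst: str) -> list[str]:
    """Find a compute -> switch -> compute path, or return [] if none exists."""
    for c, s in sorted(links):
        if c == src and (dst, s) in links:
            return [src, s, dst]
    return []


# Hypothetical chassis: 4 compute boards and 2 switch boards.
links = build_links(4, 2)
print(len(links))               # 8 direct connectors (4 * 2)
print(path(links, "c0", "c3"))  # ['c0', 's0', 'c3']
```

Under this assumption, any two compute boards are exactly one switch board apart, and the number of connectors grows as the product of the two board counts.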

Comparison with NVL72

NVL72 employed a non‑orthogonal layout with 18 compute nodes and 9 switch nodes per rack. Each compute node held four GPUs, and each switch node contained two 28.8 Tbps NVLink switch chips. All GPUs were interconnected through a copper‑cable backplane, which proved unreliable and difficult to maintain.

These issues motivated the shift to an orthogonal architecture for Rubin288.
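
The NVL72 counts quoted above can be checked with simple arithmetic. This sketch uses only figures stated in this article; the variable names are illustrative.

```python
# Figures taken from the article text.
compute_nodes = 18
gpus_per_compute_node = 4
switch_nodes = 9
chips_per_switch_node = 2

nvl72_gpus = compute_nodes * gpus_per_compute_node
nvl72_switch_chips = switch_nodes * chips_per_switch_node

# Rubin288 is described as holding up to 288 GPUs per cabinet.
rubin288_gpus = 288
density_ratio = rubin288_gpus / nvl72_gpus

print(nvl72_gpus)          # 72
print(nvl72_switch_chips)  # 18
print(density_ratio)       # 4.0
```

The ratio of 4.0 matches the fourfold density increase cited in the introduction.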

Rubin288 Orthogonal Design

Rubin288 replaces the backplane with direct copper connectors between compute and switch nodes, secured by lock mechanisms. Each compute node spans two standard rack widths, fitting eight Rubin GPUs and two CPUs.

On the networking side, ScaleOut NICs are not PCIe‑attached to the compute boards, although a small number of FrontEnd NICs may be present. The compute trays together occupy 36U and hold all 288 GPUs; the switch trays use next‑generation CX9/10 NICs.
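
The node count implied by these figures is not stated explicitly, but it can be derived. A rough sketch follows, assuming the 36U of compute space is divided evenly among the compute nodes; that even split is an assumption, not something the source confirms.

```python
# Figures taken from the article text.
total_gpus = 288
gpus_per_node = 8
cpus_per_node = 2
compute_rack_units = 36

# The GPU count must divide evenly into whole nodes.
nodes, remainder = divmod(total_gpus, gpus_per_node)
if remainder != 0:
    raise ValueError("GPU count does not divide evenly into nodes")

total_cpus = nodes * cpus_per_node

# Assumption: the 36U is shared equally by all compute nodes.
units_per_node = compute_rack_units / nodes

print(nodes)           # 36
print(total_cpus)      # 72
print(units_per_node)  # 1.0
```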

Advantages of the Orthogonal Architecture

Reliability and Maintainability: Both compute and switch boards support N+1 redundancy and hot‑swap replacement, so a failed board takes only a fraction of the system offline instead of the whole cabinet.

Network Optimization: A single‑layer switch topology avoids the complexity of multi‑layer designs, and placing ScaleOut NICs directly in the ScaleUp domain simplifies traffic flow.

Higher Yield and Simpler Cabling: Eliminating the backplane removes a major source of manufacturing defects, increasing yield by more than an order of magnitude.

Thermal and Power Considerations: The design demands megawatt‑scale power and liquid‑cooling solutions to handle the dense GPU deployment.
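
To put the maintainability point in numbers, here is a minimal sketch, assuming that hot‑swapping a compute node takes only that node's GPUs offline. The helper capacity_lost is hypothetical; the GPU counts come from this article.

```python
from fractions import Fraction


def capacity_lost(total_gpus: int, gpus_per_node: int, failed_nodes: int = 1) -> Fraction:
    """Fraction of GPUs offline while `failed_nodes` compute nodes are pulled.

    Assumption: each node fails or is serviced independently of the others.
    """
    offline = failed_nodes * gpus_per_node
    if failed_nodes < 0 or offline > total_gpus:
        raise ValueError("failed_nodes is out of range for this cabinet")
    return Fraction(offline, total_gpus)


lost = capacity_lost(total_gpus=288, gpus_per_node=8)
print(lost)                  # 1/36
print(f"{float(lost):.2%}")  # 2.78%
```

Under that assumption, servicing one compute node leaves roughly 97% of the cabinet's GPUs available.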

Conclusion

The orthogonal CLOS architecture of Rubin288 addresses the reliability, scalability, and thermal challenges of previous backplane‑based systems, representing a clear trend toward zero‑backplane, high‑density HPC solutions for future data‑center deployments.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
