360 Zhihui Cloud Developer
Dec 30, 2025 · Cloud Native

How HBox Boosts GPU Utilization with Multi‑Pool and NUMA‑Aware Scheduling

The HBox scheduling platform tackles resource-management challenges in large-scale AI clusters by introducing a three-pool resource model, priority-based preemptive scheduling, network-topology- and NUMA-aware dispatch, and GPU virtualization techniques such as MIG and vGPU, substantially improving GPU utilization, SLA guarantees, and overall cluster efficiency.

AI clusters · GPU Scheduling · GPU Virtualization
24 min read
Architects' Tech Alliance
Nov 9, 2025 · Artificial Intelligence

Why Optical Interconnects Are the Next Bottleneck‑Breaker for Massive AI Clusters

This article systematically examines the demand, technology stack, and industry landscape of large‑scale AI compute clusters, highlighting the limitations of traditional copper interconnects and presenting device‑level and chip‑level optical interconnect solutions—including OCS, pluggable modules, silicon photonics, VCSEL, and micro‑LED—while outlining current challenges and future directions.

AI clusters · Data Center · Silicon Photonics
15 min read
Architects' Tech Alliance
Mar 31, 2024 · Industry Insights

How Many Optical Modules Do A100, H100, and GH200 AI Clusters Really Need?

This article analyzes the evolving data-center network architectures for large AI clusters, detailing leaf-spine and Fat-Tree designs and NVLink interconnects, calculating the optical-module requirements for NVIDIA A100, H100, and GH200 deployments, and comparing industry examples from Meta, AWS, and Google.

AI clusters · NVLink · Network Architecture
12 min read