How Hyper‑Converged Data Center Networks Boost Compute Power with Ethernet
The article analyzes how hyper‑converged data center networks use lossless Ethernet to unify general compute, high‑performance computing and storage, enabling lifecycle automation, intelligent operation, and significant compute capacity gains without expanding server count.
Abstract
Data centers are evolving into compute centers that provide the digital foundation for countless industries and extract commercial value from massive data. Hyper‑converged data center networks employ lossless Ethernet to integrate general compute, high‑performance computing (HPC), and storage on a single Ethernet fabric, achieving full‑lifecycle automation, intelligent operation, and a notable increase in overall compute capability without adding more servers.
1. The Smart Era Drives Data Centers Toward Compute Centers
In the era of pervasive sensing, connectivity, and intelligence, technologies such as IoT, big data, 5G, and AI drive continuous innovation. China’s 14th Five‑Year Plan emphasizes accelerating digital development and building a digital China. As the information cornerstone of a digital society, a data center handles data storage, analysis, and computation, supplying the compute power behind applications like facial recognition, autonomous vehicles, and smart factories. The Open Data Center Committee (ODCC) defines data‑center compute power by four core elements: general compute, HPC, storage, and network capability. Because the network is one of these four elements, enhancing network capability can markedly improve compute efficiency while keeping server count constant.
2. What Is a Hyper‑Converged Data Center Network?
A modern data center comprises three major resource zones:
General Compute Zone: Interfaces with external users, delivers application services, and heavily utilizes virtualization and containers to form a flexible resource pool. Its network is typically an Ethernet‑based application or front‑end network.
High‑Performance Computing Zone: Hosts dedicated high‑performance units (CPU, GPU) for AI training or scientific workloads. It usually runs on InfiniBand (IB) and avoids virtualization.
Storage Zone: Consists of dedicated storage servers for data persistence, backup, and retrieval, commonly connected via Fibre Channel (FC).
These zones rely on three heterogeneous networks—Ethernet for general compute, IB for HPC, and FC for storage—resulting in protocol fragmentation, operational difficulty, high cost, and limited lifecycle management. Converging these networks onto a unified Ethernet fabric is essential for scaling compute power.
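The consolidation described above can be pictured with a small data model. This is an illustrative sketch only: the zone and fabric names come from the text, while the dictionary layout and the `fabrics_in_use` helper are assumptions made for this example.

```python
# Illustrative model of the three resource zones and the fabric each one uses.
# Zone and fabric names follow the text; the structure itself is hypothetical.

LEGACY = {
    "general_compute": "Ethernet",
    "hpc": "InfiniBand",
    "storage": "Fibre Channel",
}

# After convergence, every zone is carried on one lossless Ethernet fabric.
CONVERGED = {zone: "Ethernet" for zone in LEGACY}


def fabrics_in_use(zone_to_fabric):
    """Return the sorted list of distinct fabrics that operators must run."""
    return sorted(set(zone_to_fabric.values()))


if __name__ == "__main__":
    print("legacy:   ", fabrics_in_use(LEGACY))
    print("converged:", fabrics_in_use(CONVERGED))
```

Each distinct fabric in the legacy list implies its own protocol stack, tooling, and operations team, which is the fragmentation the converged design is meant to remove.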
3. Ethernet Advantages for the Compute Layer
Ethernet offers high openness, seamless cloud integration, scalability, multi‑tenant security, and the bandwidth needed for emerging large‑scale workloads. Scaling up CPUs and GPUs inside a single server faces diminishing returns: moving from 128 to 256 cores yields only about a 1.2× performance increase, while power consumption rises sharply. PCIe bandwidth limits also hinder high‑throughput HPC scenarios. Further gains therefore depend on spreading work across many servers, which makes network latency critical, and the industry is shifting toward RDMA carried over Ethernet as RoCE (RDMA over Converged Ethernet). RDMA bypasses the kernel network stack and offloads transport processing from the CPU to the NIC, reducing server‑side processing latency to around 1 µs. By comparison, the TCP stack adds a fixed delay of tens of microseconds per message, which becomes a bottleneck in AI and other latency‑sensitive applications.
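The two figures in this paragraph can be turned into quick arithmetic. The sketch below is a simplified model, not a benchmark: the 4 KiB message size, the 100 Gb/s link speed, and the 30 µs value standing in for "tens of microseconds" are assumptions chosen for illustration, and propagation and switching delay are ignored.

```python
def scaling_efficiency(core_ratio, speedup):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / core_ratio


def message_time_us(size_bytes, link_gbps, stack_delay_us):
    """Fixed host stack delay plus serialization time, in microseconds."""
    bits_per_us = link_gbps * 1e3  # 1 Gb/s equals 1000 bits per microsecond
    return stack_delay_us + size_bytes * 8 / bits_per_us


if __name__ == "__main__":
    # 128 -> 256 cores (2x) with a 1.2x speedup, as quoted in the text.
    print(f"scaling efficiency: {scaling_efficiency(256 / 128, 1.2):.0%}")

    # Assumed values: 4 KiB message, 100 Gb/s link, 30 us TCP stack delay.
    tcp = message_time_us(4096, 100, stack_delay_us=30.0)
    rdma = message_time_us(4096, 100, stack_delay_us=1.0)
    print(f"TCP:  {tcp:.2f} us per message")
    print(f"RDMA: {rdma:.2f} us per message")
```

Under these assumptions the serialization time is well under a microsecond, so the host stack delay dominates the per-message cost for small transfers. That is why removing it matters more than adding link bandwidth in latency‑sensitive workloads.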
In HPC, two mainstream solutions carry RDMA traffic: dedicated InfiniBand networks and Ethernet‑based RoCE. InfiniBand’s proprietary protocol and closed ecosystem make it difficult to integrate with existing large‑scale IP networks and keep OPEX high. Ethernet‑based RoCE, especially when combined with NVMe‑over‑RoCE, is becoming the preferred approach for high‑performance data transfer.
4. Storage Layer: Moving to All‑Flash and NVMe over Fabrics
New workloads demand massive storage I/O, prompting a shift from HDD to SSD, which can improve storage performance by up to 100×. NVMe is a high‑speed, low‑latency protocol designed for flash media; it dramatically boosts throughput and reduces transfer delay inside the storage system. With media and protocol this fast, traditional FC storage networks have become the bandwidth and latency bottleneck. NVMe over Fabrics (NVMe‑oF) extends the NVMe protocol across the network, allowing direct server‑to‑storage communication and replacing SCSI over FC in SAN environments. NVMe‑oF can run over FC, TCP, or RoCE, with RoCE offering the best combination of bandwidth, low latency, and IP‑based openness.
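On a Linux host, the choice of NVMe‑oF transport typically shows up as a single argument to `nvme connect` from the nvme-cli package, where RoCE is selected with the `rdma` transport type. The sketch below only builds the command line and does not run it. The `nvme_connect_argv` helper is hypothetical, the address and NQN are placeholders, and FC is left out because its addressing differs from the IP‑based transports.

```python
import shlex

# IP-based NVMe-oF transports handled by this sketch ("rdma" covers RoCE).
IP_TRANSPORTS = {"rdma", "tcp"}


def nvme_connect_argv(transport, traddr, nqn, trsvcid="4420"):
    """Build an `nvme connect` argument list for an IP-based transport.

    4420 is the port registered for NVMe over Fabrics; traddr and nqn
    must be supplied by the caller.
    """
    if transport not in IP_TRANSPORTS:
        raise ValueError(f"unsupported transport in this sketch: {transport}")
    return [
        "nvme", "connect",
        "-t", transport,   # transport type
        "-a", traddr,      # target IP address
        "-s", trsvcid,     # transport service ID (port)
        "-n", nqn,         # NVMe Qualified Name of the subsystem
    ]


if __name__ == "__main__":
    # Placeholder target: documentation-range IP and an example NQN.
    argv = nvme_connect_argv("rdma", "192.0.2.10",
                             "nqn.2014-08.org.example:subsys1")
    print(shlex.join(argv))
```

Switching the first argument from `"rdma"` to `"tcp"` is the only change needed on the host side in this sketch, which illustrates the IP‑based openness mentioned above. Lossless behaviour for RoCE still has to be configured on the NICs and switches.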
5. Network Operations: Deployment and Management Upgrades
Current data‑center networks face three major operational challenges:
Management difficulty: multiple vendors and heterogeneous interfaces hinder unified control.
High error rate: complex workflows for new or changed services involve many teams, leading to low efficiency and frequent mistakes.
Slow fault localization: mean time to repair (MTTR) is about 76 minutes, severely impacting service continuity.
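To make the MTTR figure concrete, the following sketch shows how the metric is computed and how it translates into yearly downtime. The incident durations and the incident count are invented sample values, chosen only so that the mean matches the 76 minutes quoted above; they are not measurements from the source.

```python
def mttr_minutes(repair_durations_min):
    """Mean time to repair: total repair time divided by incident count."""
    if not repair_durations_min:
        raise ValueError("at least one incident is required")
    return sum(repair_durations_min) / len(repair_durations_min)


def downtime_minutes_per_year(incidents_per_year, mttr_min):
    """Expected yearly downtime if every incident takes the MTTR to fix."""
    return incidents_per_year * mttr_min


if __name__ == "__main__":
    # Hypothetical repair times in minutes for three incidents.
    sample = [40, 76, 112]
    mttr = mttr_minutes(sample)
    print(f"MTTR: {mttr:.0f} min")

    # Hypothetical rate of one incident per month.
    print(f"yearly downtime: {downtime_minutes_per_year(12, mttr):.0f} min")
```

Because yearly downtime scales linearly with MTTR in this model, faster fault localization reduces downtime in direct proportion.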
Huawei’s hyper‑converged data‑center network addresses these issues by merging the three traditional fabrics into a single, open Ethernet fabric, providing automated lifecycle management, intelligent O&M, and seamless integration with cloud‑native workloads.
Conclusion
By leveraging lossless Ethernet, RoCE, and NVMe over Fabrics, hyper‑converged data‑center networks deliver higher bandwidth, lower latency, and full IP‑based openness compared with legacy FC and IB solutions. This architecture enables data centers to meet the escalating compute demands of AI, HPC, and big‑data applications while simplifying operations and reducing total cost of ownership.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
