
How Alibaba Cloud’s Liquid‑Cooled Servers Are Redefining High‑Speed PCIe Design

Alibaba Cloud’s infrastructure team presented three award‑winning papers at DesignCon and ECTC, revealing how converged air‑and‑immersion cooling strategies, PCIe 6.0 signal‑integrity analysis, and long‑term reliability testing can dramatically lower costs, improve performance, and accelerate adoption of liquid‑cooled cloud servers.

Alibaba Cloud Infrastructure

Best Practices for a Converged High‑Speed Channel Design for Cloud Servers in Both Air and Immersion Cooling (DesignCon 2023)

Research background: Data‑center operators are shifting from air‑cooled to immersion‑cooled architectures to reduce power usage and increase density. However, the dielectric constant of the immersion liquid is higher than that of air, which lowers the impedance of any structure whose fields extend into the coolant. Components optimized for air (PCBs, connectors, cables) can therefore suffer signal‑integrity (SI) degradation when immersed. PCIe 5.0 and DDR5 links need a unified design that works well in both cooling regimes.
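To make the impedance effect concrete, here is a minimal sketch in Python. It assumes an idealized structure whose fields sit entirely in the surrounding medium (an exposed connector pin pair is the closest real case), so that impedance scales as one over the square root of the relative permittivity. The 85 Ω nominal impedance and the fluid permittivity of 1.9 are illustrative assumptions, not values taken from the paper.

```python
import math

def impedance_in_medium(z_air_ohm: float, eps_r: float) -> float:
    """Impedance of a structure fully surrounded by a medium of relative
    permittivity eps_r, given its impedance in air (eps_r ~ 1).
    Idealized: assumes all of the field is in the surrounding medium."""
    return z_air_ohm / math.sqrt(eps_r)

def reflection_coefficient(z_after: float, z_before: float) -> float:
    """Voltage reflection coefficient at a step from z_before to z_after."""
    return (z_after - z_before) / (z_after + z_before)

# Assumed, illustrative values (not from the paper).
Z_NOMINAL_OHM = 85.0   # hypothetical differential impedance target in air
EPS_R_FLUID = 1.9      # hypothetical relative permittivity of the coolant

z_liquid = impedance_in_medium(Z_NOMINAL_OHM, EPS_R_FLUID)
gamma = reflection_coefficient(z_liquid, Z_NOMINAL_OHM)

print(f"Impedance in air:    {Z_NOMINAL_OHM:.1f} ohm")
print(f"Impedance in liquid: {z_liquid:.1f} ohm")
print(f"Reflection coefficient at the air-tuned/immersed boundary: {gamma:.3f}")
```

Inner‑layer PCB traces, whose fields stay inside the laminate, would see far less change than this idealized case; the sketch only illustrates why exposed parts such as connectors are the sensitive components.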

Research results: The paper presents a complete strategy and best‑practice guide for designing cloud‑server high‑speed channels that perform identically in air‑cooled and immersion‑cooled environments. It evaluates every critical component, proposes cost‑effective trade‑offs, and validates the approach with end‑to‑end PCIe 5.0 and DDR5 tests on actual Alibaba Cloud servers. The team has already delivered three generations of converged designs (PCIe 3.0/4.0/5.0 with DDR4/DDR5) that meet performance, cost, and reliability targets.

PCIe 6.0 (PAM4) Signal‑Integrity Challenges in Immersion‑Cooling Data Centers (DesignCon 2023)

Research background: Immersion cooling changes the impedance environment of high‑speed interconnects, causing excess reflections and SI degradation. The effect is most severe for PAM4‑based PCIe 6.0, which is more sensitive to noise than NRZ‑based PCIe 5.0. Quantifying the SI gap between air‑cooled and liquid‑cooled deployments is essential for multi‑model server design.
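The noise sensitivity of PAM4 follows from simple arithmetic, sketched below in Python. Splitting the same peak‑to‑peak swing into four levels leaves three stacked eyes, each one third the height of a single NRZ eye. The helper names are illustrative, and the calculation ignores coding, equalization, and FEC, so treat it as a first‑order estimate only.

```python
import math

def eye_height_penalty_db(levels: int) -> float:
    """Amplitude penalty of an N-level PAM signal relative to NRZ at the
    same peak-to-peak swing: each eye is 1/(levels - 1) of the full swing."""
    return 20.0 * math.log10(levels - 1)

def symbol_rate_gbaud(rate_gt_per_s: float, levels: int) -> float:
    """Symbol rate when each symbol carries log2(levels) bits."""
    return rate_gt_per_s / math.log2(levels)

pcie5_baud = symbol_rate_gbaud(32.0, 2)   # PCIe 5.0: 32 GT/s, NRZ
pcie6_baud = symbol_rate_gbaud(64.0, 4)   # PCIe 6.0: 64 GT/s, PAM4
pam4_penalty = eye_height_penalty_db(4)

print(f"PCIe 5.0 symbol rate: {pcie5_baud:.0f} GBaud, Nyquist {pcie5_baud / 2:.0f} GHz")
print(f"PCIe 6.0 symbol rate: {pcie6_baud:.0f} GBaud, Nyquist {pcie6_baud / 2:.0f} GHz")
print(f"PAM4 eye-height penalty vs NRZ: {pam4_penalty:.2f} dB")
```

Both generations share the same Nyquist frequency, so channel loss at that frequency is similar, but PAM4 has roughly 9.5 dB less amplitude margin per eye. A reflection that NRZ would tolerate can therefore close a PAM4 eye.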

Research results: The study provides the first quantitative SI analysis of 64 GT/s PAM4 PCIe 6.0 in both cooling media. Simulations and measurements show that reducing insertion loss alone is insufficient; the two environments require distinct high‑speed channel designs. The paper proposes three cost‑effective mitigation techniques:

Optimize immersion‑compatible SMT connector designs while preserving air‑cooling footprints, eliminating the SI gap.

Adopt directly‑welded connectors to further reduce SI disparity.

Develop more robust receiver ASICs that can handle the increased reflection noise, for example with floating‑tap DFE or receiver FFE (RX FFE) equalization.
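The third technique can be illustrated with a toy model. The Python sketch below assumes "floating DFE" refers to a decision‑feedback equalizer whose tap can be placed at an arbitrary delay, here the delay of a single reflection. The echo gain, echo delay, and noiseless channel are all hypothetical simplifications, not the receiver design from the paper.

```python
import random

PAM4_LEVELS = (-3, -1, 1, 3)
THRESHOLDS = (-2, 0, 2)

def slicer(x: float) -> int:
    """Decide the nearest PAM4 level."""
    return min(PAM4_LEVELS, key=lambda level: abs(x - level))

def channel_with_echo(symbols, echo_gain, echo_delay):
    """Toy channel: each symbol plus one delayed, scaled reflection."""
    return [s + (echo_gain * symbols[n - echo_delay] if n >= echo_delay else 0.0)
            for n, s in enumerate(symbols)]

def floating_tap_dfe(received, tap_gain, tap_delay):
    """Subtract the echo estimated from the decision made tap_delay symbols ago."""
    corrected, decisions = [], []
    for n, y in enumerate(received):
        feedback = tap_gain * decisions[n - tap_delay] if n >= tap_delay else 0.0
        z = y - feedback
        corrected.append(z)
        decisions.append(slicer(z))
    return corrected, decisions

def worst_margin(samples):
    """Smallest distance from any sample to its nearest decision threshold."""
    return min(min(abs(x - t) for t in THRESHOLDS) for x in samples)

# Hypothetical reflection: 30% amplitude, arriving 12 symbols late.
ECHO_GAIN, ECHO_DELAY = 0.3, 12

random.seed(0)
symbols = [random.choice(PAM4_LEVELS) for _ in range(2000)]
received = channel_with_echo(symbols, ECHO_GAIN, ECHO_DELAY)
corrected, decisions = floating_tap_dfe(received, ECHO_GAIN, ECHO_DELAY)

print(f"Worst-case margin without DFE: {worst_margin(received):.2f}")
print(f"Worst-case margin with DFE:    {worst_margin(corrected):.2f}")
```

In this idealized setting the tap removes the echo entirely. A real receiver would have to estimate the tap position and gain adaptively and would contend with noise and error propagation.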

Long‑Term Reliability Evaluation of Single‑Phase Immersion‑Cooled Servers with Electronic Fluorinated Liquid (73rd ECTC)

Research background: Prolonged exposure of server components to fluorinated immersion liquids raises concerns about material compatibility and about mechanical, chemical, electrical, and thermal degradation that do not arise in air‑cooled systems. Understanding these effects is crucial for the safe deployment of large‑scale immersion‑cooled data centers.

Research results: Using thousands of matched air‑cooled and immersion‑cooled server samples, the team measured failure rates and performance over three years of operation. Key findings include:

The overall failure rate of immersion‑cooled servers is about 50% lower than that of air‑cooled equivalents.

All electrical and physical parameters remain within specification limits, with no significant drift.

Performance metrics are comparable, with several indicators favoring immersion cooling.

The paper also notes Alibaba Cloud’s ongoing contributions to industry standards, white‑papers, and patents related to immersion cooling, underscoring the maturity of the technology for commercial data‑center use.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
