Full Liquid‑Cooled Cold Plate Server Design and Performance Testing (2024)
This article presents a comprehensive reference design and performance evaluation of a 2U four‑node high‑density server employing full liquid‑cooled cold plates for CPUs, memory, storage, NICs, and power supplies, detailing system architecture, flow design, CFD validation, and future optimization directions.
The whitepaper "Full Liquid‑Cooled Cold Plate System Reference Design and Performance Test (2024)" aims to advance liquid‑cooling technology and ecosystem maturity for high‑density servers, extending cooling beyond CPUs and GPUs to memory, SSDs, OCP NICs, PSUs, PCIe, and optical modules.
1. Full Liquid‑Cooled Cold Plate Server Innovation
The design is based on a 2U four‑node server (i24) with two 5th‑generation Intel Xeon CPUs, 16 DDR5 DIMMs, one PCIe expansion card, one OCP 3.0 NIC, and up to eight SSDs. Approximately 95% of the heat is removed directly by cold plates, with the remaining 5% handled by a rear‑mounted water‑air heat exchanger, achieving near‑100% liquid heat capture.
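To make the roughly 95/5 heat split concrete, here is a minimal Python sketch that tallies a hypothetical per‑node power budget and reports the fraction captured directly by cold plates versus the rear water‑air heat exchanger; every wattage below is an illustrative assumption, not a figure from the whitepaper.

```python
# Hypothetical per-node power budget in watts; all values are
# illustrative assumptions, not whitepaper figures.
cold_plate_loads = {
    "cpu_0": 350.0,         # assumed 5th-gen Xeon TDP
    "cpu_1": 350.0,
    "dimms_16x": 96.0,      # 16 x DDR5 at an assumed 6 W each
    "ssds": 40.0,
    "ocp_nic": 20.0,
    "psu_losses": 45.0,     # conversion losses, also plate-cooled
}
residual_air_loads = {
    "vrms_and_misc": 48.0,  # low-power parts left to the rear
}                           # water-air heat exchanger

plate_w = sum(cold_plate_loads.values())
residual_w = sum(residual_air_loads.values())
total_w = plate_w + residual_w

print(f"cold-plate capture:  {plate_w / total_w:.1%}")    # ~95%
print(f"rear heat exchanger: {residual_w / total_w:.1%}")  # ~5%
```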
Key innovations include blind‑plug connectors that combine water, power, and signal for the node and disk zones; reduced tubing; series‑connected cold‑plate interfaces that lower leak risk; a new memory‑cooling solution that improves both thermal performance and signal reliability; and OCP NIC/SSD cooling rated for more than 30 hot‑swap cycles.
The solution reuses existing air‑cooling modules and mature cold‑plate manufacturing processes, avoiding custom parts and reducing cost, while extensive testing of low‑cost aluminum cold plates confirms their long‑term compatibility with the cooling fluid.
2. System Composition and Piping Layout
2.1 Full Liquid‑Cooled Server Overview
The 2U four‑node system comprises nodes, chassis, backplane, and SSD modules, with blind‑plug connections for water, power, and signals.
2.2 Single Node Details
Each node includes a chassis, motherboard, CPU, memory modules, memory cold plate, CPU cold plate, I/O cold plate, power supply, and rear heat exchanger.
3. Flow Path Selection and Flow Rate Calculation
3.1 Flow Path Choice
A serial flow path is used, routing coolant from low‑power to high‑power components so that temperature‑sensitive, low‑power parts receive the coolest water first; a single series loop also simplifies the tubing design, as the sketch below illustrates.
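To illustrate why the low‑power components sit upstream, the following sketch accumulates the coolant temperature rise plate by plate using ΔT = Q / (ṁ·cp). The 1.3 LPM per‑node flow and 51 °C inlet come from Section 3.2; the PG25 fluid properties and per‑plate loads are assumptions for illustration only.

```python
# Coolant temperature rise along a serial cold-plate path.
FLOW_LPM = 1.3    # per-node flow rate (Section 3.2)
RHO = 1030.0      # kg/m^3, assumed density of PG25
CP = 3900.0       # J/(kg*K), assumed specific heat of PG25
INLET_C = 51.0    # secondary-loop supply temperature (Section 3.2)

m_dot = FLOW_LPM / 60.0 / 1000.0 * RHO  # mass flow in kg/s

# Temperature-sensitive, low-power plates first; CPUs last.
# Per-plate loads in watts are illustrative assumptions.
serial_path = [("ocp_nic", 20.0), ("ssds", 40.0), ("dimms", 96.0),
               ("cpu_0", 350.0), ("cpu_1", 350.0)]

t = INLET_C
for name, watts in serial_path:
    t += watts / (m_dot * CP)  # sensible heat pickup at this plate
    print(f"after {name:>7}: {t:5.1f} degC")
# With these assumptions the outlet lands near 61 degC, inside
# the 65 degC secondary-loop ceiling.
```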
3.2 Flow Rate Design
The system targets a secondary‑loop temperature of ≤ 65 °C and uses copper cold plates with PG25 (25% propylene glycol/water) coolant, requiring 1.3 LPM per node. CFD simulations at a 51 °C inlet water temperature and 1.3 LPM confirm that all components stay within their temperature limits with safety margin.
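The 1.3 LPM figure can be sanity‑checked with the standard sensible‑heat balance. Using the same hypothetical 856 W node load and assumed PG25 properties as in the sketch above, the minimum flow that holds the outlet at the 65 °C ceiling from a 51 °C inlet is

$$
\dot{V}_{\min} \;=\; \frac{Q}{\rho\, c_p\, (T_{\max}-T_{\text{in}})}
\;=\; \frac{856\ \mathrm{W}}{1030\ \mathrm{kg/m^3}\cdot 3900\ \mathrm{J/(kg\,K)}\cdot 14\ \mathrm{K}}
\;\approx\; 1.5\times 10^{-5}\ \mathrm{m^3/s}\;\approx\; 0.91\ \mathrm{LPM},
$$

so the specified 1.3 LPM leaves roughly 40% margin over this illustrative minimum, consistent with the CFD finding that all components hold their limits with headroom.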
The analysis confirms that the 1.3 LPM flow meets the cooling goals economically, using coolant supplied by the CDU (coolant distribution unit).
Future optimization should focus on improving energy efficiency, reducing initial cost, minimizing leak risk, enhancing maintainability, expanding low‑cost aluminum plate usage, and ensuring long‑term reliability through material compatibility and aging tests.