Tag

GPU communication

0 views collected around this technical thread.

Tencent Tech
Tencent Tech
May 7, 2025 · Artificial Intelligence

How Tencent’s DeepEP Doubles GPU Communication Speed on RoCE Networks

Tencent engineers highlighted a massive speedup in DeepSeek’s open‑source DeepEP communication framework, revealing how their TRMT‑based optimizations—dynamic multi‑QP topology awareness, IBGDA‑driven CPU‑bypass, and atomic signaling—boost RoCE network throughput up to 300% and add another 30% gain when applied to InfiniBand, effectively doubling GPU communication performance for large AI models.

AI model trainingDeepEPGPU communication
0 likes · 8 min read
How Tencent’s DeepEP Doubles GPU Communication Speed on RoCE Networks
Architects' Tech Alliance
Architects' Tech Alliance
Feb 3, 2019 · Fundamentals

Understanding GPUDirect RDMA: Principles, Implementation, and Performance

This article explains the background of GPU communication, introduces DMA and RDMA fundamentals, describes how GPUDirect RDMA enables direct GPU-to-GPU memory access across machines, and presents performance results showing reduced latency and increased bandwidth for distributed deep‑learning training.

GPU communicationGPUDirectHigh Performance Computing
0 likes · 7 min read
Understanding GPUDirect RDMA: Principles, Implementation, and Performance