Network Intelligence Research Center (NIRC)
Nov 1, 2025 · Artificial Intelligence

AutoCCL: Automatic NCCL Tuning to Boost Distributed Deep Learning Performance

AutoCCL analyzes six key NCCL performance parameters and tunes them automatically during training. Coordinate descent and an online leader-worker architecture let it adjust the parameters while overcoming state-space explosion and compute-communication interference. It achieves 1.07-1.32× faster iteration times on models such as Phi-2, Llama-3.1-8B and VGG-19.
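To make the coordinate-descent idea concrete, here is a minimal sketch of how tuning one parameter at a time avoids enumerating the full configuration grid. It is not AutoCCL's implementation: the parameter names and candidate values in `SPACE`, the `toy_cost` function, and its numbers are all hypothetical stand-ins for measured iteration time, chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, object]

# Hypothetical search space (illustrative names and values, not taken from AutoCCL).
SPACE: Dict[str, List[object]] = {
    "algo": ["ring", "tree"],
    "proto": ["simple", "ll", "ll128"],
    "nchannels": [1, 2, 4, 8, 16],
    "chunk_kib": [64, 128, 256, 512],
}


def toy_cost(cfg: Config) -> float:
    """Synthetic stand-in for a measured iteration time (lower is better)."""
    cost = 10.0
    cost += 0.0 if cfg["algo"] == "ring" else 1.5
    cost += {"simple": 0.0, "ll": 2.0, "ll128": 0.5}[cfg["proto"]]
    cost += abs(cfg["nchannels"] - 8) * 0.25
    cost += abs(cfg["chunk_kib"] - 256) / 128
    return cost


def coordinate_descent(
    space: Dict[str, List[object]],
    cost: Callable[[Config], float],
    max_sweeps: int = 10,
) -> Tuple[Config, float, int]:
    """Vary one parameter at a time, keeping the others fixed at their best value."""
    # Start from the first candidate of every parameter.
    config: Config = {name: values[0] for name, values in space.items()}
    best = cost(config)
    evaluations = 1
    for _ in range(max_sweeps):
        improved = False
        for name, values in space.items():
            for value in values:
                if value == config[name]:
                    continue  # already measured as part of the current config
                trial = dict(config)
                trial[name] = value
                trial_cost = cost(trial)
                evaluations += 1
                if trial_cost < best:
                    config, best = trial, trial_cost
                    improved = True
        if not improved:
            break  # a full sweep found nothing better: local optimum reached
    return config, best, evaluations


if __name__ == "__main__":
    cfg, value, n = coordinate_descent(SPACE, toy_cost)
    grid = 1
    for values in SPACE.values():
        grid *= len(values)
    print(f"best config: {cfg}, cost: {value}, evaluations: {n} of {grid} grid points")
```

Each sweep costs roughly the sum of the candidate counts, while an exhaustive search costs their product. The trade-off is that coordinate descent only guarantees a local optimum when parameters interact; the toy cost here is separable, so the sketch happens to reach the global minimum.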

AutoCCL · Coordinate Descent · Distributed Deep Learning