Tagged articles
2 articles
Tencent Cloud Developer
Sep 1, 2021 · Artificial Intelligence

Why Distributed Machine Learning Accelerates AI Training at Scale

This article reviews how distributed machine learning handles massive data and compute demands: partitioning models and data across workers, optimizing communication with communication primitives, parameter servers, and Ring AllReduce, reducing I/O overhead, and applying optimizers such as LARS and LAMB to make training faster and more scalable.

LAMB optimizer · LARS optimizer · Parameter Server
31 min read
Tencent Architect
Jul 30, 2018 · Artificial Intelligence

Four‑Minute ImageNet Training: Tencent’s AI Platform Sets a New World Record

Tencent’s intelligent machine‑learning platform set a world record by training AlexNet in 4 minutes and ResNet‑50 in 6.6 minutes on ImageNet, using large batch sizes, mixed‑precision training, LARS optimization, hierarchical synchronization, gradient fusion, and pipelined I/O to overcome accuracy and scalability challenges.

AI acceleration · Deep Learning · ImageNet
24 min read