Tag

heterogeneous-compute

DataFunTalk
Feb 17, 2022 · Cloud Native

ByteDance's Cloud‑Native Transformation of Its Machine Learning Platform

This article explains how ByteDance redesigned its machine‑learning platform around cloud‑native principles. It covers the motivations, the migration from YARN to Kubernetes, the implementation of the PS‑Worker and AllReduce training frameworks, unified operators, heterogeneous resource scheduling, elastic training, and future directions for large‑scale AI workloads.

Cloud Native · Resource Scheduling · elastic-training
15 min read