Tag

gradient checkpoint

DataFunSummit
Apr 2, 2023 · Artificial Intelligence

Efficient Training of Large Models with the Open‑Source Distributed Framework Easy Parallel Library (EPL)

This article introduces the challenges of scaling deep‑learning model training, explains the design and components of the open‑source Easy Parallel Library (EPL) that unifies data, pipeline, and operator‑split parallelism, and demonstrates its best‑practice results on large‑scale classification, BERT‑large, and massive multimodal models.

EPL · Large-Scale Training · Parallelism