How DeepSpeed-Chat Accelerates ChatGPT‑Style Model Training by 15×
Microsoft has open‑sourced DeepSpeed‑Chat, a toolkit that streamlines end‑to‑end training and inference of ChatGPT‑like large language models with RLHF, delivering speedups of more than 15× and dramatically lower costs, even on a single GPU.
On April 12 (local time), Microsoft announced the open‑source release of DeepSpeed‑Chat, a solution that makes it easy to train ChatGPT‑style large language models.
DeepSpeed‑Chat is built on the DeepSpeed deep‑learning optimization library and supports training, reinforcement learning from human feedback (RLHF), and accelerated inference. It can increase training speed by more than 15× while significantly reducing cost.
For example, with the DeepSpeed‑HE engine on a multi‑node GPU cluster, a 13‑billion‑parameter model can be trained in just 1.25 hours and a 175‑billion‑parameter model in under a day.
DeepSpeed‑Chat integrates a three‑step RLHF pipeline:
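1. Supervised fine‑tuning (SFT) of a pretrained language model on curated human responses.
2. Reward model fine‑tuning on a dataset in which humans have rated multiple answers to the same query.
3. RLHF training, in which the SFT model is further fine‑tuned with Proximal Policy Optimization (PPO) using feedback from the reward model.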
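The article does not spell out the loss functions behind steps 2 and 3. As a minimal sketch, assuming the InstructGPT‑style formulation (a pairwise ranking loss for the reward model and PPO's clipped surrogate objective for the policy), the two core quantities look like this. The function names are illustrative and are not part of DeepSpeed‑Chat's API.

```python
import math


def pairwise_ranking_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Reward-model loss for one answer pair: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the human-preferred answer gets the higher score.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == softplus(-m); this form avoids overflow for large |m|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def ppo_clipped_objective(ratio: float, advantage: float, clip_eps: float = 0.2) -> float:
    """PPO clipped surrogate for one sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    `ratio` is new_policy_prob / old_policy_prob for the sampled token or action.
    """
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # The preferred answer scoring higher gives a lower loss than the reverse.
    print(pairwise_ranking_loss(2.0, 0.0), pairwise_ranking_loss(0.0, 2.0))
    # A large policy change with positive advantage is capped at (1 + eps) * A.
    print(ppo_clipped_objective(ratio=1.5, advantage=1.0))
```

In a real training run these would be averaged over batches of tensors; scalars are used here only to keep the arithmetic visible.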
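Two additional features in the RLHF step help improve model quality: exponential moving average (EMA) collection, which makes an EMA‑based checkpoint available for final evaluation and often yields better responses than the standard final checkpoint, and mixture training, which mixes the pretraining objective with the PPO objective.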
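The EMA idea can be sketched in a few lines: alongside the live weights, keep a second copy that moves only a small fraction of the way toward the current weights at each step, and evaluate that smoothed copy at the end. This is a generic illustration on plain Python lists, not DeepSpeed‑Chat's implementation, and the `decay` values are assumptions chosen for readability.

```python
def ema_update(ema_params, model_params, decay=0.99):
    """Return updated EMA weights: decay * ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, model_params)]


if __name__ == "__main__":
    ema = [0.0, 0.0]
    # Pretend the model's two weights stay at [1.0, 2.0] for three training steps.
    for current in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
        ema = ema_update(ema, current, decay=0.5)
    print(ema)  # the EMA copy approaches the live weights gradually
```

A decay close to 1 makes the averaged copy change slowly, which smooths out step‑to‑step noise in the weights.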
Core capabilities of DeepSpeed‑Chat
1. Simplified training and inference for ChatGPT‑type models: a single script handles all steps, supports Hugging Face pretrained models, and provides an easy‑to‑use inference API.
2. DeepSpeed‑RLHF module: reproduces the InstructGPT training workflow, ensuring SFT, reward‑model fine‑tuning, and RLHF are tightly coupled, and offers data abstraction and mixing for diverse data sources.
3. DeepSpeed‑HE engine: merges DeepSpeed’s training and inference engines, enabling seamless switching between training and inference during RLHF, leveraging tensor‑parallelism, high‑performance CUDA kernels, ZeRO‑ and LoRA‑based memory optimizations, and automatic memory management.
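To give a feel for why LoRA‑based optimization saves memory, here is the standard low‑rank counting argument: instead of training a full d_out × d_in weight update, LoRA trains two thin matrices of rank r, so the trainable parameter count drops from d_out · d_in to r · (d_out + d_in). The layer size and rank below are illustrative assumptions, not figures from DeepSpeed‑Chat, and the article does not describe how DeepSpeed‑HE applies LoRA internally.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_update_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A, with B: d_out x r and A: r x d_in."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d, r = 4096, 8  # hypothetical hidden size and LoRA rank
    full = full_update_params(d, d)
    lora = lora_update_params(d, d, r)
    print(f"full: {full:,}  lora: {lora:,}  reduction: {full // lora}x")
```

Fewer trainable parameters also means less gradient and optimizer state to hold, which is where most of the memory saving comes from.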
DeepSpeed‑HE delivers over 15× speedup compared to existing systems, making RLHF training fast, affordable, and scalable. On Azure, training an OPT‑13B model costs less than $300 and takes 9 hours; an OPT‑30B model costs under $600 and takes 18 hours.
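As a rough sanity check on the Azure figures above, dividing each quoted cost ceiling by its training time gives the implied upper bound on the hourly rate. This is simple arithmetic on the article's numbers; the helper name is made up for illustration.

```python
def implied_hourly_rate(total_cost_usd: float, hours: float) -> float:
    """Upper bound on the hourly cost implied by a total-cost ceiling and a duration."""
    return total_cost_usd / hours


if __name__ == "__main__":
    for name, cost, hours in (("OPT-13B", 300, 9), ("OPT-30B", 600, 18)):
        print(f"{name}: under ${implied_hourly_rate(cost, hours):.2f} per hour")
```

Both figures imply the same ceiling of roughly $33 per hour, so the larger model's higher cost comes from its longer training time.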
The system also scales to models with hundreds of billions of parameters: on multi‑node clusters it trains a 13‑billion‑parameter model in 1.25 hours and a 175‑billion‑parameter model in under a day.
Microsoft’s goal is to democratize RLHF training, allowing researchers with a single GPU to train models exceeding 13 billion parameters and bringing personalized ChatGPT‑like capabilities within reach.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
