Tencent Technical Engineering
Mar 31, 2025 · Artificial Intelligence

Step-by-Step Guide to Local Training of DeepSeek R1 on Multi‑GPU A100 Systems

This step‑by‑step tutorial shows how to set up CUDA 12.4, install the required packages, prepare a JSON dataset and a custom reward function, troubleshoot out‑of‑memory errors, and launch DeepSeek R1 training on an 8‑GPU A100 cluster using Accelerate, DeepSpeed ZeRO‑3, and vLLM configurations.

A100 · CUDA · DeepSeek