Tagged articles
2 articles
Data Party THU
Sep 20, 2025 · Artificial Intelligence

How DeepSeek Trained a $30M LLM for Just $29.4K – Inside the R1 Model

The article reports that DeepSeek's R1 large language model, detailed in a peer-reviewed Nature paper, cost roughly $300k to train (about $294k) on Nvidia H800 chips. It relies on a pure reinforcement-learning approach, achieves competitive performance, and remains open-source.

DeepSeek · Nvidia H800 · Peer Review
Tencent Cloud Developer
Apr 14, 2023 · Artificial Intelligence

Tencent Cloud's Next-Generation HCC High-Performance Computing Cluster for Large Model Training

Tencent Cloud's new HCC high-performance computing cluster delivers three times the performance of the previous generation. It pairs Xingsha servers and NVIDIA H800 GPUs, rated at up to 1979 TFLOPS, with 3.2 Tbps of per-server interconnect bandwidth over the Xingmai 3.2T Ethernet RDMA network. TB-level storage via COS + GooseFS and several access forms (bare metal, cloud servers, containers, functions) support efficient large-model training.

AI computing · GPU cluster · High-performance computing