Tagged articles
NewBeeNLP
Mar 14, 2025 · Artificial Intelligence

How Open‑Sora 2.0 Achieves SOTA Video Generation with Only $200K Training Cost

Open‑Sora 2.0 is an open‑source 11B‑parameter video generation model that matches commercial SOTA performance while being trained on 224 GPUs for just $200,000, thanks to a 3D auto‑encoder, MMDiT architecture, aggressive data filtering, low‑resolution pre‑training, and highly optimized parallel training techniques.

AI model · MMDiT · Open-Sora
9 min read
Data Thinking Notes
Feb 11, 2025 · Artificial Intelligence

Why DeepSeek V3 and R1 Are Redefining LLM Efficiency and Power

This article analyzes DeepSeek's V3 and R1 large language models, detailing their low‑cost Mixture‑of‑Experts architecture, Multi‑Head Latent Attention redesign, distributed training optimizations, and reasoning‑focused innovations, which together challenge conventional assumptions about GPU/NPU compute requirements.

AI inference · DeepSeek · MLA
15 min read
DataFunTalk
Feb 20, 2023 · Artificial Intelligence

Low‑Cost Open‑Source Replication of ChatGPT Using Colossal‑AI

This article explains how researchers reproduced the full ChatGPT training pipeline—including supervised fine‑tuning, reward‑model training, and RLHF—using the open‑source Colossal‑AI system, dramatically reducing GPU memory and hardware requirements while providing ready‑to‑run code and performance benchmarks.

AI Optimization · ChatGPT · Colossal-AI
10 min read