Tagged articles
Wuming AI
Apr 19, 2026 · Artificial Intelligence

Why Bigger LLMs Aren’t Smarter: Karpathy Blames Junk Training Data

Karpathy argues that large language models have grown so large mainly because they must absorb noisy, low‑quality training data, not because intelligence requires that scale. He urges separating a clean cognitive core from external memory to achieve smarter, more efficient AI.

AI model efficiency · Karpathy · LLM scaling
5 min read
Software Engineering 3.0 Era
Feb 21, 2025 · Artificial Intelligence

How NSA and MoE Are Shaping the Future of Large‑Model Development

The article examines Native Sparse Attention (NSA) and Mixture‑of‑Experts (MoE) as complementary innovations that improve data quality, model architecture, and inference efficiency for large models, while also discussing their challenges and potential research directions.

Large Models · Mixture of Experts · Native Sparse Attention
11 min read