Machine Heart
May 13, 2026 · Artificial Intelligence

Why Bigger Teachers Don’t Teach Better: Tsinghua’s On‑Policy Distillation Study

Recent research from Tsinghua and collaborators dissects On‑Policy Distillation for large language models, showing that higher‑scoring teachers often fail to improve students unless their thinking patterns align. The study details token‑level overlap dynamics, characterizes the failure cases, and proposes two practical remedies to rescue ineffective distillation.

Model Scaling · On-Policy Distillation · RL Post-Training
9 min read