Tagged articles

Block Scaling

1 article · Page 1 of 1
Machine Heart
Jul 5, 2026 · Artificial Intelligence

Why Larger Blocks Hurt Diffusion Language Model Inference and How T* Solves It

The article analyzes a trade-off in masked diffusion language models: larger generation blocks increase parallelism but degrade reasoning. It shows how T*, a progressive block-scaling method based on trajectory-aware reinforcement learning, stabilizes training and improves accuracy across block sizes, with gains of up to 15% on MATH500.

Block Scaling · Diffusion Language Model · MATH500
8 min read