Tagged articles

large reasoning models

3 articles

Jun 21, 2026 · Artificial Intelligence

Why Post‑Training Makes Large Reasoning Models Overconfident and How LED Restores Exploration

The paper shows that reinforcement‑learning post‑training sharply reduces the entropy of the final layer's output distribution in large reasoning models, so raising the sampling temperature does little to restore exploration. It introduces Latent Exploration Decoding (LED), which recovers exploration from intermediate layers and yields consistent pass@k gains without extra training.

LED method · RL post‑training · entropy collapse

0 likes · 13 min read

Tencent Tech

Oct 27, 2025 · Artificial Intelligence

How SpecExit Cuts Large Reasoning Model Inference Time by Up to 2.5×

SpecExit combines early‑exit and speculative decoding to let large reasoning models detect when they have almost finished thinking, trimming redundant chain‑of‑thought steps, reducing over‑thinking by 72% and achieving up to 2.5× faster end‑to‑end inference without noticeable accuracy loss.

AI · early exit · inference acceleration

0 likes · 6 min read

Architect

Jun 12, 2025 · Artificial Intelligence

Why Large Reasoning Models Collapse Under Complex Tasks: Insights from Apple’s Study

Apple’s research reveals that large reasoning models, despite sophisticated self‑reflection mechanisms, experience a complete performance collapse when problem complexity exceeds a threshold, highlighting fundamental limits in their ability to achieve generalized reasoning.

AI evaluation · large reasoning models · model limitations

0 likes · 7 min read