Why Skipping the Thinking Step Makes Large Language Models More Accurate
UC Berkeley researchers found that forcing large language models to skip explicit reasoning—using a “NoThinking” mode—can achieve comparable or better accuracy with significantly fewer tokens, especially under token budget constraints, across math, coding, and theorem‑proving benchmarks.
Study Overview
UC Berkeley researchers compare explicit reasoning ("Thinking") with a no‑thinking approach ("NoThinking") in large language models.
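Per the paper, NoThinking works by prefilling the assistant turn with an already-closed, essentially empty thinking block, so the model skips chain-of-thought and answers directly. A minimal sketch, assuming a `<think>…</think>` delimiter pair (the exact special tokens and chat template depend on the model; these names are illustrative):

```python
# Sketch of the NoThinking prompt construction: the assistant response is
# prefilled with a closed thinking block so decoding resumes *after* the
# reasoning phase. Delimiter strings are assumptions, not the real tokens.
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"

def build_nothinking_prompt(question: str) -> str:
    # The prefilled "finished thinking" stub comes from the paper's setup.
    prefill = f"{THINK_OPEN}Okay, I think I have finished thinking.{THINK_CLOSE}"
    return f"User: {question}\nAssistant: {prefill}\n"
```

Feeding this string to the model as a partial completion means the sampled tokens go straight into the final answer rather than an extended reasoning trace.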
Models
DeepSeek‑R1‑Distill‑Qwen‑32B (a Qwen‑32B model distilled from DeepSeek‑R1)
Baseline: Qwen‑32B‑Instruct
Evaluated on 7B and 14B scale variants
Datasets
Mathematics: AIME 2024, AIME 2025, AMC 2023, OlympiadBench
Programming: LiveCodeBench v2
Theorem proving: MiniF2F, ProofNet
Evaluation without token budget
Pass@k and token usage were measured. On theorem‑proving tasks, NoThinking matches Thinking while using only ~30% of the tokens; on the other tasks, the performance gap narrows as k increases.
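Pass@k is conventionally computed with the unbiased estimator from Chen et al. (2021): given n samples per problem of which c are correct, pass@k = 1 − C(n−c, k)/C(n, k). A small self-contained helper:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples, drawn without replacement from n total samples of which c are
    correct, solves the problem."""
    if n - c < k:
        # Fewer incorrect samples than k: every size-k draw contains a hit.
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)
```

For example, with 4 samples of which 2 are correct, pass@1 is 0.5, and pass@k rises toward 1.0 as k grows, which is the regime where NoThinking closes the gap in the paper's plots.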
Token‑budget constrained experiments
Two budgets were tested: low (under 3,000 tokens) and high (around 3,500 tokens). Under the low budget, NoThinking consistently outperforms Thinking. Under the high budget, Thinking holds a slight edge at k = 1, but NoThinking overtakes it from k = 2 onward, while also reducing latency.
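One common way to impose such a budget (the paper's exact truncation scheme may differ) is to cap generation at the budget and, if no final answer has appeared, append a short cue that forces one. A hedged sketch, where `generate_fn` is a placeholder for whatever inference API is in use:

```python
def budget_forced_generate(generate_fn, prompt: str, budget: int = 3000) -> str:
    """Budget-forcing sketch: generate at most `budget` tokens; if the draft
    lacks a final answer, force one with a short follow-up generation.
    `generate_fn(prompt, max_tokens)` is an assumed interface, not a real API.
    """
    draft = generate_fn(prompt, max_tokens=budget)
    if "Final Answer" not in draft:
        cue = "\nFinal Answer:"
        # Re-prompt with the truncated draft plus the cue, allowing only a
        # handful of extra tokens for the answer itself.
        draft += cue + generate_fn(prompt + draft + cue, max_tokens=32)
    return draft
```

Under this kind of hard cap, a long Thinking trace risks being cut off mid-reasoning, which is one intuition for why NoThinking wins at low budgets.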
Parallel‑extension tests
For tasks with perfect validators (formal theorem proving), NoThinking reduces latency to 1/7 and token consumption to 1/4 without sacrificing accuracy. For tasks without validators (e.g., AMC 2023, OlympiadBench), NoThinking even exceeds full Thinking performance and cuts latency to 1/9.
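The parallel-scaling recipe above can be sketched as best-of-n sampling: draw several short NoThinking completions concurrently, return the first one a verifier accepts, and fall back to majority vote when no verifier exists. The function names below (`sample_fn`, `verify_fn`) are illustrative stand-ins, not the paper's implementation:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def best_of_n(question: str, sample_fn, verify_fn, n: int = 8) -> str:
    """Parallel-scaling sketch: sample n short answers concurrently.
    With a perfect validator (e.g. a proof checker), return the first
    verified answer; otherwise fall back to a simple majority vote."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        answers = list(pool.map(lambda _: sample_fn(question), range(n)))
    for answer in answers:
        if verify_fn(answer):
            return answer
    # No verified answer (or no real verifier): majority vote.
    return Counter(answers).most_common(1)[0][0]
```

Because the n samples run in parallel and each NoThinking sample is short, wall-clock latency is close to one short generation, which is where the reported 1/7 to 1/9 latency reductions come from.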
Data‑contamination check
The experiments were reproduced on the newly released AIME 2025 dataset, confirming that the observed patterns are not caused by data leakage.
Key implications
The results suggest that explicit chain‑of‑thought generation may be unnecessary for strong reasoning performance: skipping the thinking step can yield substantial token and latency savings while maintaining, or even improving, accuracy, particularly when combined with parallel sampling.
References
Paper: https://arxiv.org/abs/2504.09858
Related: https://www.anthropic.com/research/reasoning-models-dont-say-think
Discussion on Hacker News: https://news.ycombinator.com/item?id=43572374