Machine Heart
Apr 26, 2026 · Artificial Intelligence

Balanced Thinking: Boost LLM Accuracy by Up to 10 Points While Cutting Inference Length 35%

The paper introduces ReBalance, a training‑free, two‑stage inference‑control framework that uses the model's confidence signals to dynamically balance reasoning depth. It reports an accuracy gain of up to 10 points and a 35.4% reduction in token length across LLMs of several sizes and multiple benchmarks.

Balanced Thinking · Confidence Steering · Efficient Inference
9 min read