Oct 8, 2025 · Artificial Intelligence

Why Reinforcement Learning Unlocks Hierarchical Reasoning in LLMs: The HICRA Breakthrough

The article explains how reinforcement learning induces a hierarchical learning dynamic in large language models, introduces the HICRA training paradigm that concentrates gradient updates on planning tokens, and shows through extensive text and multimodal benchmarks that this approach consistently yields earlier Aha moments and superior reasoning performance.

HICRAHierarchical ReasoningMultimodal Reasoning

0 likes · 10 min read

Why Reinforcement Learning Unlocks Hierarchical Reasoning in LLMs: The HICRA Breakthrough