Why Hierarchical Reasoning Model (HRM) Beats Large Models on ARC with Few Samples

The Hierarchical Reasoning Model (HRM) draws on brain-inspired hierarchical processing to achieve state‑of‑the‑art performance on ARC, Sudoku‑Extreme, and Maze‑Hard tasks using under a thousand training examples, while exposing generalization limits on private test sets.


HRM (Hierarchical Reasoning Model) is motivated by neuroscience findings that the brain processes information in a layered, multi‑timescale fashion, with fast low‑level sensory areas and slower high‑level prefrontal planning regions interacting through feed‑forward and feedback loops.

Biological Foundations

Since Mountcastle’s 1978 proposal of a unified cortical organization, research has shown that cortical hierarchies recursively integrate fine‑grained features and abstract goals. This mirrors the "System 1" fast intuition and "System 2" slow deliberation in psychology, as well as predictive‑coding models (Rao & Ballard) and multi‑timescale neural dynamics (Wang et al.).

HRM Architecture and Key Techniques

HRM implements two recursive modules:

H‑module (high‑level): updates a global state once per outer iteration, handling abstract strategy and planning.

L‑module (low‑level): performs many fast inner iterations under the fixed high‑level state until a local fixed point is reached, then feeds the result back to the H‑module.

The model also incorporates an adaptive computation time (ACT) controller that uses a learned halt/continue signal to decide when to stop iterating, emulating human fast‑slow thinking.
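As a rough illustration of this structure, here is a minimal PyTorch sketch of the two‑timescale loop. The GRU cells, dimensions, and iteration counts are placeholders for readability, not the paper's actual Transformer‑based blocks:

```python
import torch
import torch.nn as nn

class TwoTimescaleCore(nn.Module):
    """Toy H/L recursion: a slow high-level state updated once per
    outer step, a fast low-level state iterated many times between."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.l_cell = nn.GRUCell(2 * dim, dim)  # fast low-level module
        self.h_cell = nn.GRUCell(dim, dim)      # slow high-level module

    def forward(self, x: torch.Tensor, n_outer: int = 4, n_inner: int = 8):
        h = x.new_zeros(x.shape)  # high-level (planning) state
        z = x.new_zeros(x.shape)  # low-level (working) state
        for _ in range(n_outer):
            for _ in range(n_inner):            # fast loop under a fixed h
                z = self.l_cell(torch.cat([x, h], dim=-1), z)
            h = self.h_cell(z, h)               # L's result feeds back to H
        return h, z

core = TwoTimescaleCore()
h, z = core(torch.randn(2, 128))  # batch of two 128-dim inputs
```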

Technical Details

Fixed‑point iteration in the L‑module drives the latent state to convergence by repeatedly applying the update z ← f(z, H, x) until z stops changing.
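A toy version of that iteration, assuming f is a contraction map (the tanh update, small random weights, and tolerance below are illustrative, not from the paper):

```python
import torch

def solve_fixed_point(f, z0, h, x, tol=1e-4, max_iter=100):
    """Iterate z <- f(z, h, x) until the latent state stops moving."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, h, x)
        if (z_next - z).norm() < tol:  # converged to a fixed point
            return z_next
        z = z_next
    return z

# Example contraction: small random weights keep the map contractive.
dim = 16
W, U, V = (0.1 * torch.randn(dim, dim) for _ in range(3))
f = lambda z, h, x: torch.tanh(z @ W + h @ U + x @ V)
z_star = solve_fixed_point(f, torch.zeros(1, dim),
                           torch.randn(1, dim), torch.randn(1, dim))
```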

One‑step gradient approximation back‑propagates through the fixed point at constant memory cost: instead of unrolling every inner iteration, only the final update is differentiated, a shortcut the Implicit Function Theorem justifies at the fixed point.
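In code, the trick amounts to running the inner iterations without building an autograd graph and differentiating only the last application of f. A sketch in the same toy setting:

```python
import torch

def fixed_point_one_step_grad(f, z0, h, x, n_iter=50):
    with torch.no_grad():          # inner loop stores no autograd graph
        z = z0
        for _ in range(n_iter):
            z = f(z, h, x)
    # One differentiable step at (approximately) the fixed point:
    # gradients reach f's parameters without unrolling all n_iter steps,
    # so memory stays constant in the number of iterations.
    return f(z.detach(), h, x)
```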

ACT introduces a halt controller that receives the current high‑level state and outputs a binary decision, trained with reinforcement‑learning‑style rewards balancing accuracy and compute.
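A minimal sketch of such a controller, assuming a two‑logit head over {halt, continue} read from the high‑level state; the thresholding and segment cap are illustrative, and the RL‑style training of the logits is not shown:

```python
import torch
import torch.nn as nn

class HaltController(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.head = nn.Linear(dim, 2)  # logits for [halt, continue]

    def should_halt(self, h: torch.Tensor) -> bool:
        q_halt, q_continue = self.head(h).mean(dim=0)  # batch-averaged
        return bool(q_halt > q_continue)

def reason_with_act(step, controller, h0, max_segments=8):
    """Run reasoning segments until the controller decides to stop."""
    h = h0
    for _ in range(max_segments):
        h = step(h)                    # e.g. one outer H/L segment
        if controller.should_halt(h):  # learned halt/continue decision
            break
    return h
```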

Experimental Evaluation

HRM was tested on symbolic reasoning and constraint‑based benchmarks:

ARC‑AGI public set: 40.3% accuracy on ARC‑AGI‑1 (vs. commercial models ~34.5%) and ~5% on the harder ARC‑AGI‑2.

ARC‑AGI private holdout: performance dropped to 32% and 2% respectively, indicating limited generalization.

Sudoku‑Extreme & Maze‑Hard (30×30): near‑perfect scores, while Chain‑of‑Thought baselines performed near zero.

These results sparked community debate about whether HRM’s gains stem from its hierarchical architecture or from training tricks.

Implications and Limitations

HRM demonstrates that brain‑inspired hierarchical recursion and dynamic depth can yield strong performance on tasks requiring fast‑slow reasoning, but the sharp drop on private ARC data warns that such gains may not fully translate to unseen distributions. The model remains a promising step toward more general, explainable AI, yet further research is needed to improve robustness.

Source: 黄大年茶思屋 technology website

Tags: Symbolic Reasoning, Adaptive Computation Time, ARC Benchmark, Brain‑Inspired AI, Hierarchical Reasoning
Written by

Huawei Cloud Developer Alliance

The Huawei Cloud Developer Alliance creates a tech sharing platform for developers and partners, gathering Huawei Cloud product knowledge, event updates, expert talks, and more. Together we continuously innovate to build the cloud foundation of an intelligent world.
