Is There Really a Unique Mechanism in LLMs? Rethinking Functional Anisotropy
A recent ICML 2026 paper challenges the long‑held assumption that each task in a large language model is supported by a single, unique circuit. Using overlap‑aware sheaf repulsion, it shows that many structurally dissimilar, sparse sheafs can achieve identical performance across multiple benchmarks, and it proposes a distributive dense circuit hypothesis to explain this non‑uniqueness.
Mechanistic interpretability has long assumed that a given task in a large language model (LLM) is realized by a single, almost unique internal circuit, motivating the field of circuit and sheaf discovery (CSD). The new ICML 2026 paper titled All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs challenges this premise.
Functional Anisotropy Hypothesis
The authors name the implicit assumption the Functional Anisotropy Hypothesis: each ability corresponds to one specialized, indispensable internal mechanism. This hypothesis underlies benchmarks such as Tracr‑based synthetic tests and the Mechanistic Interpretability Benchmark (MIB), which reward circuits that achieve high performance with few components.
Overlap‑Aware Sheaf Repulsion (OASR)
To expose alternative solutions, the paper introduces Overlap‑Aware Sheaf Repulsion (OASR). It builds on the DiscoGP framework, which models sheaf discovery as a differentiable edge‑selection problem using Gumbel‑Sigmoid logits and a straight‑through estimator, and adds an overlap penalty. After each sheaf is discovered, its set of retained edges R is recorded, and subsequent discoveries are penalized for re‑using edges in R, forcing the optimizer to find equally performant but structurally distinct sheafs.
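The repulsion idea can be sketched in a few lines of plain Python. This is a minimal illustration and not the paper's implementation: the penalty form (sum of keep‑probabilities over edges in R), the weight `lam`, and all function names are assumptions made for the example, and the straight‑through backward pass is only described in a comment because no autodiff framework is used here.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gumbel_sigmoid(logit, temperature=1.0, rng=random):
    """Relaxed Bernoulli sample in (0, 1): sigmoid of logit plus logistic noise."""
    u = min(max(rng.random(), 1e-9), 1.0 - 1e-9)
    noise = math.log(u) - math.log(1.0 - u)  # difference of two Gumbel draws
    return sigmoid((logit + noise) / temperature)

def sample_hard_mask(logits, temperature=1.0, rng=random):
    """Forward pass of a straight-through estimator: threshold the soft sample.
    In an autodiff framework, gradients would flow through the soft value."""
    return {edge: gumbel_sigmoid(l, temperature, rng) > 0.5
            for edge, l in logits.items()}

def overlap_penalty(logits, retained):
    """Assumed penalty: total keep-probability placed on edges already in R."""
    return sum(sigmoid(logits[e]) for e in retained if e in logits)

def total_loss(task_loss, logits, retained, lam=1.0):
    """Task loss plus the (hypothetical) weighted repulsion term."""
    return task_loss + lam * overlap_penalty(logits, retained)
```

With an empty R the penalty is zero, so the first discovery is unconstrained. Each later run pays a cost for every previously retained edge it keeps, which pushes the logits of those edges down.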
Empirical Findings on IOI and Other Tasks
Applying OASR to the indirect object identification (IOI) task yields two sheafs, A and B, both reaching 100% accuracy. However, their overlap is minimal: only 96 edges intersect while the union contains 2,351 edges, giving an IoU of 4.1%—close to random edge selection under DAG constraints. Layer‑wise analysis shows significant differences in edge distribution, ruling out mere re‑parameterisation.
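The IoU figure follows directly from the two counts quoted above. The helper below is a generic set‑based sketch; the function name `edge_iou` and the representation of edges as hashable items are assumptions for illustration.

```python
def edge_iou(sheaf_a, sheaf_b):
    """Intersection over union of two edge sets (0.0 if both are empty)."""
    a, b = set(sheaf_a), set(sheaf_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Arithmetic check with the reported IOI counts: 96 shared edges, 2,351 in the union.
reported_iou = 96 / 2351
print(f"{reported_iou:.1%}")  # 4.1%
```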
The same pattern repeats on BLiMP sub‑tasks (AGA, ANA), various DNA‑variant benchmarks, and Docstring tasks: each task consistently produces two sheafs with comparable performance but IoU values between 4% and 11%.
Scaling Sheaf Discovery
When the authors repeat OASR 20 times per task, the cumulative union of edges grows steadily while the cumulative intersection shrinks dramatically, often falling below 1% IoU (IOI: 0.15%). Adding the overlap penalty further reduces the shared core without sacrificing sparsity or performance, indicating that discovering more sheafs does not converge to a common core.
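The cumulative statistics described here can be tracked with a running union and a running intersection. This is a sketch under the assumption that each discovered sheaf is available as a collection of hashable edges; the function name is hypothetical.

```python
def cumulative_overlap(sheafs):
    """For each prefix of runs, return (|union|, |intersection|, IoU)."""
    union, shared = set(), None
    history = []
    for sheaf in sheafs:
        edges = set(sheaf)
        union |= edges
        # The first run initialises the intersection; later runs can only shrink it.
        shared = set(edges) if shared is None else shared & edges
        iou = len(shared) / len(union) if union else 0.0
        history.append((len(union), len(shared), iou))
    return history
```

Because the union can only grow and the intersection can only shrink, the cumulative IoU never increases as more runs are added. The empirical question is whether it levels off at a non‑trivial core, and the reported figures suggest it does not.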
Robustness Across Methods
To test whether the phenomenon is specific to DiscoGP + OASR, the authors repeat the analysis with three other mainstream circuit‑discovery methods: ACDC (heuristic edge‑deletion), EAP (first‑order gradient attribution), and Edge Pruning (EP, gradient‑based pruning). Despite their differing philosophies, all three methods exhibit the same non‑uniqueness: ACDC’s results vary with attention‑head traversal order, EAP’s sheafs are sensitive to task‑irrelevant name changes, and EP mirrors DiscoGP’s behaviour when its loss is swapped for a task‑specific objective.
Investigating a Potential Core
The authors then probe whether a minimal shared core exists. By intersecting many independently discovered IOI sheafs, they find that even an 11‑edge intersection retains >90% accuracy. An exhaustive search starting from this intersection isolates a three‑edge sheaf (e₁: embedding → layer‑0 MLP; e₂: layer‑0 MLP → layer‑10 head‑7 V node; e₃: head‑7 → final hidden representation) that achieves 86.7% accuracy under zero ablation. Removing these three edges from any discovered sheaf drops accuracy to 52.3%.
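An exhaustive search over an 11‑edge set is cheap: there are only C(11, 3) = 165 three‑edge subsets. The sketch below shows one way such a search could be organised. The article does not describe the paper's exact procedure, so the stopping rule, the function names, and the toy `accuracy_fn` (which stands in for evaluating the model with only the chosen edges kept) are all assumptions.

```python
from itertools import combinations
from math import comb

def smallest_sufficient_subset(edges, accuracy_fn, threshold):
    """Return the best subset of the smallest size whose accuracy meets threshold."""
    edges = sorted(edges)
    for k in range(1, len(edges) + 1):
        best = max(combinations(edges, k),
                   key=lambda subset: accuracy_fn(frozenset(subset)))
        if accuracy_fn(frozenset(best)) >= threshold:
            return frozenset(best)
    return None  # no subset reaches the threshold

# Hypothetical stand-in for model evaluation: edges 2, 5 and 7 each add accuracy.
def toy_accuracy(subset):
    return 0.5 + 0.12 * len(subset & {2, 5, 7})

core = smallest_sufficient_subset(range(11), toy_accuracy, threshold=0.85)
print(sorted(core), comb(11, 3))  # [2, 5, 7] 165
```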
However, when the IOI task is split into its ABBA and BABA sub‑templates and the three‑edge constraint is enforced, the model still discovers sparse, high‑performing sheafs, demonstrating that the apparent indispensability of the three edges stems from treating IOI as an aggregated task.
Distributive Dense Circuit Hypothesis
To explain the pervasive non‑uniqueness, the paper proposes the Distributive Dense Circuit Hypothesis. Assuming mild local linearity, each edge contributes a signature vector sₑ. The read‑out of a circuit approximates the sum of its edge signatures plus a bounded residual. Because the number of possible size‑s edge subsets, C(|E|, s), vastly exceeds the finite number of quantised read‑out “buckets”, the pigeon‑hole principle guarantees that many distinct edge sets map to the same bucket, i.e., produce nearly identical predictions. A packing argument then shows that among colliding sets one can find pairs with extremely low overlap, while preserving the prediction margin.
Consequently, for any task there exist multiple low‑overlap circuits that achieve the same performance, making non‑uniqueness a natural consequence of high‑dimensional, superposed representations.
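The counting step of this argument can be illustrated with a toy example. Everything below is made up for illustration: 12 edges, subsets of size 4, and hand‑picked two‑dimensional integer signatures that stand in for already‑quantised read‑outs. It demonstrates only the pigeon‑hole collision and the existence of a low‑overlap colliding pair, not the paper's packing proof or its margin guarantee.

```python
from collections import defaultdict
from itertools import combinations
from math import comb

NUM_EDGES, SUBSET_SIZE = 12, 4

def signature(edge):
    """Hypothetical 2-D integer signature with entries in {-1, 0, 1}."""
    return (edge % 3 - 1, (edge // 3) % 3 - 1)

def bucket_subsets(num_edges=NUM_EDGES, size=SUBSET_SIZE):
    """Group every size-s edge subset by its summed signature (its read-out)."""
    buckets = defaultdict(list)
    for subset in combinations(range(num_edges), size):
        readout = tuple(sum(signature(e)[d] for e in subset) for d in range(2))
        buckets[readout].append(frozenset(subset))
    return buckets

def min_overlap_collision(buckets):
    """Find a pair of distinct subsets in one bucket sharing the fewest edges."""
    best = None
    for members in buckets.values():
        for a, b in combinations(members, 2):
            shared = len(a & b)
            if best is None or shared < best[0]:
                best = (shared, a, b)
    return best

buckets = bucket_subsets()
shared, a, b = min_overlap_collision(buckets)
print(comb(NUM_EDGES, SUBSET_SIZE), len(buckets), shared)
```

Each coordinate of the summed read‑out lies between -4 and 4, so there are at most 81 buckets for 495 subsets and collisions are unavoidable. In this toy setup at least one colliding pair is fully disjoint, for example {0, 1, 3, 5} and {2, 6, 9, 10}, which both sum to (-1, -2).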
Implications for Interpretability
The authors stress that their findings do not invalidate CSD; discovered mechanisms remain causal and meaningful. What changes is the interpretive stance: a single circuit should no longer be taken as the definitive explanation for a task. Instead, interpretability must accommodate a space of functionally equivalent, partially redundant dense mechanisms.
Further experimental details, statistical analyses, node‑level overlap metrics, and full proofs are available in the original paper and its appendix (https://openreview.net/forum?id=3uC9teMlUt, code: https://github.com/TonyXiChen/OASR).
