Why Large Language Models Miss Simple Addition: Iso‑Raw‑Sum Trajectories Reveal the Geometry of Errors
Despite excelling at complex reasoning, LLMs often err on multi‑digit addition; probing shows correct answers reside in hidden states, and the authors reveal a structured geometric manifold—digit basins, carry fibers, and Iso‑Raw‑Sum trajectories—explaining how errors arise via noisy quantization at decision boundaries.
Background
Large language models (LLMs) excel at complex reasoning but frequently make errors on basic multi‑digit addition. Probing studies reveal that hidden states often still contain the correct answer, suggesting the mistake occurs during conversion from internal representation to output.
Probe Versatility
Lightweight probes were trained on the residual stream of Qwen3‑4B while it performed 10,000 three‑operand, 10‑digit addition problems. For each generation step the probes decoded six arithmetic variables: ground‑truth digit, model output digit, correctness flag, raw sum of the current column, input carry, and carry potential. All six signals could be extracted from the same hidden state, demonstrating that a single representation simultaneously encodes multiple arithmetic facts.
Iso‑Raw‑Sum Trajectory (IRST)
UMAP was applied to the final‑layer hidden states and digit unembedding vectors were used as anchors for digits 0–9. The visualization revealed a hierarchical geometric manifold:
Digit basins: hidden states cluster around ten basins corresponding to digits 0–9; proximity to a basin increases the likelihood of outputting that digit.
Carry fibers: within each basin, states further split according to the input carry (e.g., “no carry → 1”, “carry 1 → 2”, “carry 2 → 3”).
Some samples lie on continuous lines that cross adjacent digit basins. Each such line is an Iso‑Raw‑Sum Trajectory (IRST): a set of internal states that share the same raw sum (the sum of the current column’s digits) but differ in carry state. For a raw sum of 1, the three possible outcomes are:
Input carry 0 → output 1
Input carry 1 → output 2
Input carry 2 → output 3
Geometrically the three points lie on a single continuous trajectory that passes through the basins for 1, 2, and 3. The overall representation can be visualized as a terrain map: digit basins are valleys, IRSTs are ridgelines, and carry potential pushes the representation along these ridgelines toward a basin.
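The relationship between raw sum, input carry, and output digit can be made concrete with a short sketch. This is ordinary column-wise addition, not the paper's code; the function names `column_trace` and `irst_outputs` are hypothetical, and the operands are assumed to be non-negative integers.

```python
def column_trace(operands):
    """Return (raw_sum, input_carry, output_digit) per column, lowest-order column first."""
    width = max(len(str(n)) for n in operands)
    trace, carry = [], 0
    for pos in range(width):
        # Raw sum: the digits of this column only, before any carry is added.
        raw_sum = sum((n // 10**pos) % 10 for n in operands)
        total = raw_sum + carry
        trace.append((raw_sum, carry, total % 10))
        carry = total // 10  # becomes the input carry of the next column
    if carry:
        # A leftover carry produces one extra leading digit.
        trace.append((0, carry, carry))
    return trace


def irst_outputs(raw_sum, max_carry=2):
    """List (input_carry, output_digit) pairs that share one raw sum.

    With three operands the input carry is at most 2, hence the default.
    """
    return [(c, (raw_sum + c) % 10) for c in range(max_carry + 1)]


print(irst_outputs(1))
print(column_trace([199, 299, 399]))
```

For a raw sum of 1, `irst_outputs` lists the same three carry and digit pairs as above, which are the points a single trajectory would pass through.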
Noisy Quantization Model
The paper introduces a Noisy Quantization Model to explain why errors still occur. It defines Carry Potential (CP) as a continuous real‑valued signal that aggregates the “carry pressure” from all lower‑order digits to the right of the current position. Unlike the discrete input carry, CP is not an integer. The formal definition is given as an image in the source and is not reproduced here.
When CP is far from an integer boundary (e.g., 1.50), small internal noise does not change the quantized carry. Near a boundary (e.g., 0.99 or 1.01), tiny perturbations can flip the quantized result, leading to the typical ±1 addition errors. This phenomenon is called geometric slippage: the hidden state drifts slightly along an IRST and crosses a basin boundary, causing the final token to fall into the wrong digit region.
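A small simulation illustrates why proximity to an integer boundary matters. The sketch below is not the paper's model. It assumes one plausible form of CP, in which the raw sum of the k-th column to the right is weighted by 10^-k, so that flooring the result recovers the exact input carry. It also assumes additive Gaussian noise. The names `carry_potential` and `flip_rate` and the noise level `sigma` are illustrative choices.

```python
import math
import random


def carry_potential(lower_raw_sums):
    """Assumed CP: raw sums of the columns to the right, nearest column first,
    each weighted by a decreasing power of ten."""
    return sum(s / 10 ** (k + 1) for k, s in enumerate(lower_raw_sums))


def flip_rate(cp, sigma=0.02, trials=10_000, seed=0):
    """Fraction of noisy samples whose quantized (floored) carry differs from the clean one."""
    rng = random.Random(seed)
    clean = math.floor(cp)
    flips = sum(
        math.floor(cp + rng.gauss(0.0, sigma)) != clean for _ in range(trials)
    )
    return flips / trials


# Two lower-order columns with raw sum 27 each give a CP just below 3.
print(carry_potential([27, 27]))

for cp in (1.50, 0.99, 1.01):
    print(cp, flip_rate(cp))
```

Under these assumptions a CP of 1.50 essentially never flips, while 0.99 and 1.01 sit half a standard deviation from the boundary and flip in roughly 30 percent of samples, each flip shifting the quantized carry by one.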
Double‑Stream Consistency Check
Leveraging the internal signals, a runtime correction method called Double‑Stream Consistency Check was designed. From the same final‑layer hidden state two signals are decoded:
Local signal: the raw sum of the current column.
Global signal: the aggregated Carry Potential from the right‑hand context.
If the model’s predicted digit is consistent with both signals, the output is kept; otherwise the raw sum and the quantized Carry Potential are recombined to produce a corrected candidate. In the reported experiments, this method achieves higher token‑level accuracy than both the model’s original output and several baselines.
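The decision rule described above can be sketched in a few lines. This is a minimal illustration, not the released implementation: it assumes the two probes have already produced an integer raw sum and a real-valued Carry Potential, that quantization means taking the floor, and that the function name `consistency_check` is hypothetical.

```python
import math


def consistency_check(predicted_digit, raw_sum, carry_potential):
    """Return (digit, was_corrected) for one output position.

    predicted_digit: digit the model emitted
    raw_sum: decoded local signal (sum of the current column's digits)
    carry_potential: decoded global signal, quantized here by flooring (an assumption)
    """
    carry = math.floor(carry_potential)
    candidate = (raw_sum + carry) % 10
    if predicted_digit == candidate:
        return predicted_digit, False  # both streams agree with the model: keep
    return candidate, True  # disagreement: recombine the two signals


print(consistency_check(9, 27, 2.97))  # consistent prediction is kept
print(consistency_check(8, 27, 2.97))  # off-by-one prediction is replaced
```

Because the check relies on decoded signals, its quality depends on probe accuracy; a real system would likely need to handle cases where the probes themselves are uncertain.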
Conclusion
The study reframes LLM arithmetic as a geometric problem. Hidden states form a hierarchical manifold composed of digit basins, carry fibers, and IRSTs. Probes reveal not only the presence of arithmetic information but its geometric separability. Errors arise because the continuous representation is quantized near decision boundaries, leading to geometric slippage.
Paper: https://arxiv.org/abs/2606.03645
Code: https://github.com/RL-MIND/Shape-of-Addition
