From Black‑Box Guessing to Quantitative Deconstruction: Unveiling the Mystery Inside Large Language Models

At EMNLP 2025, the BUPT NIRC team presented a paper that introduces the ARR metric to quantitatively separate latent reasoning from factual shortcuts in LLMs, using Logit Lens and Attention Knockout to reveal the two modes' distinct internal pathways. This post summarizes the work and the team's conference experience.

Network Intelligence Research Center (NIRC)

EMNLP (Conference on Empirical Methods in Natural Language Processing) is a top‑tier NLP conference organized by ACL's SIGDAT. The 2025 edition was held in Suzhou, China, receiving over 4,300 submissions and attracting thousands of researchers.

Paper Presentation

The first author presented the BUPT NIRC team's work, titled “Unveiling Internal Reasoning Modes in LLMs.” The study focuses on multi‑hop questions such as “What is the capital of the country where the Eiffel Tower is located?”, where LLMs often produce the correct answer but follow different internal paths.

Latent Reasoning: the model truly understands the logic, first identifying the intermediate entity (France) and then deriving the answer (Paris).

Factual Shortcuts: the model skips logical chaining and directly outputs the answer by exploiting strong co‑occurrence patterns (e.g., “Eiffel Tower” ↔ “Paris”) in the training data.

The goal is to distinguish these two modes and locate their essential differences within the model’s layer hierarchy.
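
To make the contrast concrete, here is a deliberately simplified Python sketch of the two strategies. It is illustrative only, not the paper's code; `retrieve` and `lookup_association` are hypothetical stand‑ins for whatever the network computes internally:

```python
# Illustrative contrast between the two internal strategies; `model`,
# `retrieve`, and `lookup_association` are hypothetical stand-ins.

def latent_reasoning(model):
    # Hop 1: resolve the bridge entity ("country where the Eiffel Tower is").
    bridge = model.retrieve(relation="located_in_country", entity="Eiffel Tower")  # "France"
    # Hop 2: apply the second relation to the bridge entity.
    return model.retrieve(relation="capital_of", entity=bridge)                    # "Paris"

def factual_shortcut(model):
    # No intermediate step: a strong "Eiffel Tower" <-> "Paris" co-occurrence
    # learned from training data maps the question straight to the answer.
    return model.lookup_association("Eiffel Tower")                                # "Paris"
```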

ARR Metric

The authors propose the ARR (Attribute Retrieval Rate) metric to quantify how quickly a model extracts attribute information during reasoning.

Design: Using the Logit Lens technique, each intermediate layer's hidden state is projected into the vocabulary space, making it possible to observe how much that layer contributes to the final prediction.

Finding: Experiments show that ARR can automatically classify the two reasoning modes with 90% accuracy, eliminating the need for manual labeling and greatly improving the efficiency of interpretability research.
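
As a rough sketch of the Logit Lens step that ARR builds on (assuming a GPT‑2‑style HuggingFace model; the exact ARR formula is the paper's and is not reproduced here), one can project every layer's hidden state through the final layer norm and unembedding matrix, then watch when the answer token becomes retrievable:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The capital of the country where the Eiffel Tower is located is"
answer_id = tok.encode(" Paris")[0]

with torch.no_grad():
    out = model(**tok(prompt, return_tensors="pt"), output_hidden_states=True)

# out.hidden_states[0] is the embedding output; the rest are the block outputs.
for layer, h in enumerate(out.hidden_states):
    # Logit Lens: final layer norm + unembedding, applied to the last position.
    logits = model.lm_head(model.transformer.ln_f(h[:, -1, :]))
    rank = int((logits[0] > logits[0, answer_id]).sum()) + 1
    print(f"layer {layer:2d}: rank of ' Paris' = {rank}")
```

In such a trace, an answer token that only reaches top rank in the middle or late layers would be consistent with latent reasoning, while one already at the top in early layers would suggest a shortcut; an ARR‑style metric condenses this layer‑by‑layer trace into a single retrieval‑rate score.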

Model Dissection

Two analysis techniques—Attention Knockout and Logit Lens—were applied:

Latent reasoning path: middle layers significantly encode the “bridge entity” (France), indicating that the model performs a genuine intermediate inference step.

Shortcut mode: early layers already lock onto the final answer, behaving more like memory retrieval than logical inference.
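
The following self‑contained PyTorch sketch shows the core mechanic of Attention Knockout (illustrative only, not the paper's implementation): chosen query‑to‑key edges are severed by setting their pre‑softmax scores to negative infinity, and the effect on the final position's output is measured:

```python
import torch

def attention(q, k, v, block=None):
    # Standard scaled dot-product attention with an optional knockout mask.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    if block is not None:
        scores = scores.masked_fill(block, float("-inf"))  # sever these edges
    return torch.softmax(scores, dim=-1) @ v

seq_len, d = 6, 8
q, k, v = (torch.randn(seq_len, d) for _ in range(3))

# Knock out the edges from the last position (the one that predicts the
# answer) to positions 1-2, standing in for the bridge-entity tokens.
block = torch.zeros(seq_len, seq_len, dtype=torch.bool)
block[-1, 1:3] = True

baseline = attention(q, k, v)
knocked = attention(q, k, v, block)
print("output shift at last position:", (baseline[-1] - knocked[-1]).norm().item())
```

Applied layer by layer inside a real model, this kind of intervention can show where blocking the bridge‑entity edges actually hurts the answer, which is how a middle‑layer versus early‑layer contrast like the one above can be probed.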

Conference Experience

On the morning of November 15, the first author set up the poster. Initially nervous about presenting in English, they soon found that explaining the ARR metric and how LLMs internally separate reasoning from shortcuts sparked lively discussions that broke through the language barrier.

Interactions continued through the poster session, informal gatherings, and a night‑time social event featuring a Chinese lion dance, where researchers exchanged ideas on scaling laws and RAG robustness.

Overall, the presentation not only introduced a novel quantitative tool for probing LLM internals but also highlighted the collaborative and cultural richness of the EMNLP community.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: LLM, Reasoning, NLP, ARR metric, Attention Knockout, EMNLP 2025, Logit Lens
Written by

Network Intelligence Research Center (NIRC)

NIRC is based at the National Key Laboratory of Network and Switching Technology at Beijing University of Posts and Telecommunications. It has built a technology matrix spanning four AI domains—intelligent cloud networking, natural language processing, computer vision, and machine learning systems—and is dedicated to solving real‑world problems, building top‑tier systems, publishing high‑impact papers, and contributing to the advancement of China's network technology.
