PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead
The PEAR framework introduces a position‑embedding‑agnostic attention re‑weighting method that detects and suppresses detrimental attention heads in large language models, dramatically improving retrieval‑augmented generation performance without adding any inference overhead, as demonstrated on multiple RAG benchmarks and LLM families.
With the rapid growth of large language models (LLMs), Retrieval‑Augmented Generation (RAG) has become a key technique for improving model performance, yet LLMs’ limited context awareness hampers RAG effectiveness.
To address this, researchers from Ant Group and Renmin University of China propose the Position‑Embedding‑Agnostic Attention Re‑weighting (PEAR) framework, which was accepted as an oral presentation at WWW 2025.
The PEAR framework consists of two stages: (1) RAG suppression‑head detection using a proxy task and path‑patching to identify attention heads that degrade performance, and (2) re‑weighting these heads with learnable scalar coefficients while keeping the original LLM parameters frozen.
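The second stage can be illustrated with a minimal sketch. The class below is a hypothetical toy (the module name, shapes, and head indices are assumptions, not the paper's implementation): each detected suppression head's output is multiplied by a learnable scalar coefficient, initialized to 1 so training starts from the model's original behavior, while every other parameter stays frozen.

```python
import torch
import torch.nn as nn

class HeadReweighting(nn.Module):
    """Toy sketch of PEAR-style head re-weighting (hypothetical module).

    Multiplies the output of each detected "suppression" head by a
    learnable scalar; all other model parameters remain frozen.
    """

    def __init__(self, n_heads: int, suppression_heads: list[int]):
        super().__init__()
        self.suppression_heads = suppression_heads
        # One learnable coefficient per flagged head, initialized to 1.0
        # so the module starts as an identity transform.
        self.coeffs = nn.Parameter(torch.ones(len(suppression_heads)))

    def forward(self, head_outputs: torch.Tensor) -> torch.Tensor:
        # head_outputs: (batch, n_heads, seq_len, head_dim)
        out = head_outputs.clone()
        for i, h in enumerate(self.suppression_heads):
            out[:, h] = self.coeffs[i] * head_outputs[:, h]
        return out

# Example: 8 heads, with (hypothetical) heads 2 and 5 flagged as suppressive.
rw = HeadReweighting(n_heads=8, suppression_heads=[2, 5])
x = torch.randn(1, 8, 4, 16)
y = rw(x)
```

Because each coefficient is a single scalar applied to a head's output, after training it can in principle be folded into the attention output projection, which is consistent with the paper's claim of zero additional inference overhead.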
Extensive experiments on datasets such as 2WikiMultiHopQA, MuSiQue, and Qasper show that PEAR achieves the highest average scores across baselines, surpassing methods like Ms‑PoE, AB, and MoICE, while incurring zero additional inference cost and minimal GPU memory overhead.
Further evaluations demonstrate that PEAR improves accuracy for models using various positional encodings (RoPE, learned embeddings, Alibi) and does not harm the models’ world‑knowledge abilities, as confirmed by MMLU and ToolAlpaca benchmarks. The approach also generalizes to newer models such as Llama‑3.1‑8B and Qwen‑2.5‑7B‑Instruct.
Overall, PEAR offers a novel, efficient, and broadly applicable solution for enhancing LLM context awareness in RAG tasks. The open‑source codebase is available at https://github.com/TTArch/PEAR-RAG, and the paper at https://arxiv.org/pdf/2409.19745.