Why Retrieval‑Augmented Generation Is Still Fragile: Boosting Generalization and Evidence‑Based Answers
Although modern information access is faster than ever, retrieval‑augmented generation systems remain vulnerable, especially under distribution shift. This makes it crucial both to improve retriever generalization across domains and languages and to ensure that generators produce evidence‑grounded responses, or refuse to answer when evidence is lacking.
Information acquisition has never been as convenient or rapid, yet it has also never been as fragile: as language models come to dominate search and question answering, the line between retrieved and generated content blurs.
Contemporary retrieval‑augmented generation (RAG) systems typically follow a pipeline architecture: a retriever filters candidate documents, and a generator crafts answers based on those documents, tightly coupling retrieval and generation.
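To make the pipeline concrete, here is a minimal retrieve‑then‑generate sketch. It is not the thesis's implementation: the model names, prompt wording, and toy corpus are illustrative assumptions, and any bi‑encoder retriever and generator pair would fit the same shape.

```python
# Minimal RAG pipeline: rank candidates with a dense retriever, then let a
# generator answer conditioned on the top-ranked evidence.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

docs = [
    "Dense retrievers encode queries and documents into a shared vector space.",
    "Generators condition their answers on the retrieved passages.",
]

retriever = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder bi-encoder
generator = pipeline("text2text-generation", model="google/flan-t5-base")

query = "How does a RAG system produce an answer?"
scores = util.cos_sim(retriever.encode(query), retriever.encode(docs))[0]
evidence = docs[int(scores.argmax())]  # keep only the best-scoring passage

prompt = f"Answer using only the evidence.\nEvidence: {evidence}\nQuestion: {query}"
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```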
Reliable performance hinges on two requirements: generalization, where the retriever must remain effective on new datasets, domains, and languages; and evidence grounding, where the generator must base its output on the retrieved evidence and refuse to answer when that evidence is insufficient.
This work combines these requirements in a single study. It investigates how training‑data augmentation and negative sampling influence dense retrievers under distribution shift, proposing techniques that enhance cross‑domain and cross‑language robustness.
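A common ingredient behind such robustness gains is contrastive training with mined hard negatives. The sketch below shows the standard in‑batch InfoNCE setup over toy tensors; it is a generic illustration, not the thesis's exact recipe, and the temperature and dimensions are assumptions.

```python
# Contrastive training with hard negatives: each query is paired with one
# positive and one mined hard-negative passage; the other in-batch passages
# serve as additional negatives.
import torch
import torch.nn.functional as F

def contrastive_loss(q_emb, pos_emb, neg_emb, temperature=0.05):
    """InfoNCE over [positives | hard negatives] for a batch of queries."""
    candidates = torch.cat([pos_emb, neg_emb], dim=0)  # (2B, d)
    logits = q_emb @ candidates.T / temperature        # (B, 2B)
    labels = torch.arange(q_emb.size(0))               # query i's positive is column i
    return F.cross_entropy(logits, labels)

# Toy tensors standing in for encoder outputs (B=4 queries, d=8 dims).
q = F.normalize(torch.randn(4, 8), dim=-1)
pos = F.normalize(torch.randn(4, 8), dim=-1)
neg = F.normalize(torch.randn(4, 8), dim=-1)
print(contrastive_loss(q, pos, neg))
```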
Additionally, the paper explores training compact open‑source language models to reason over retrieved evidence and to decline to answer when evidence is lacking, thereby improving answer reliability.
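One simple way to instill this refusal behavior is to include unanswerable cases in the fine‑tuning data, with an explicit refusal string as the target. The formatting sketch below is a hypothetical illustration; the template and refusal text are assumptions, not the thesis's training setup.

```python
# Answerable examples target the gold answer; examples with insufficient
# evidence target an explicit refusal string.
REFUSAL = "I cannot answer based on the provided evidence."

def make_example(question, evidence, answer=None):
    """Format one (prompt, target) pair for supervised fine-tuning."""
    prompt = (
        "Answer the question using only the evidence. "
        f"If the evidence is insufficient, say: '{REFUSAL}'\n"
        f"Evidence: {evidence}\nQuestion: {question}\nAnswer:"
    )
    return {"prompt": prompt, "target": answer if answer else REFUSAL}

# Answerable case: the target is grounded in the evidence.
print(make_example("Where is the University of Amsterdam?",
                   "The University of Amsterdam is located in Amsterdam.",
                   "Amsterdam"))
# Unanswerable case: the target is the refusal string.
print(make_example("Where is the University of Amsterdam?",
                   "Dense retrieval uses bi-encoders."))
```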
The implementation leverages the open‑source Simple Transformers library to lower the barrier for building and reproducing transformer‑based retrieval and QA systems. The full research is available at https://hdl.handle.net/11245.1/7817d7ad-bcf9-4517-8f18-2b620facd97d.
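For orientation, here is a small example of the library's question‑answering API in its SQuAD‑style data format. The model choice and toy data are placeholders, not the experiments from the thesis.

```python
# Fine-tune and query an extractive QA model with Simple Transformers.
from simpletransformers.question_answering import QuestionAnsweringModel

train_data = [
    {
        "context": "Dense retrievers map text to vectors.",
        "qas": [
            {
                "id": "0",
                "question": "What do dense retrievers map text to?",
                "is_impossible": False,
                "answers": [{"text": "vectors", "answer_start": 29}],
            }
        ],
    }
]

model = QuestionAnsweringModel("bert", "bert-base-cased", use_cuda=False)
model.train_model(train_data)

answers, _ = model.predict(
    [{"context": train_data[0]["context"],
      "qas": [{"id": "0", "question": "What do dense retrievers map text to?"}]}]
)
print(answers)
```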
