Generative Dense Retrieval: Memory Can Be a Burden
The paper introduces Generative Dense Retrieval (GDR), a two‑stage retrieval framework that first maps queries to memory‑efficient document‑cluster identifiers and then uses dense vectors to locate individual documents, achieving higher recall and better scalability than traditional generative retrieval while incurring modest latency and capacity trade‑offs.
Recently, a paper titled Generative Dense Retrieval: Memory Can Be a Burden from the Xiaohongshu Search Algorithm team was accepted as an oral presentation at the international NLP conference EACL 2024 (acceptance rate 11.32%, 144/1271).
Proposed Paradigm
The authors introduce a novel information‑retrieval paradigm called Generative Dense Retrieval (GDR). GDR addresses the limitations of traditional Generative Retrieval (GR) on large‑scale corpora: poor memorization of fine‑grained document features, a corpus size capped by the model’s memory capacity, and costly index updates. GDR adopts a coarse‑to‑fine two‑stage retrieval process:
Stage 1 (coarse): a memory‑based mechanism uses a language model’s limited memory to map a query to a document‑cluster identifier (CID).
Stage 2 (fine): a dense‑vector matching mechanism maps the CID to individual documents.
This combination mitigates the inherent drawbacks of GR while preserving its deep interaction advantage.
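The two stages above can be sketched as follows. This is a minimal illustrative stand‑in, not the paper's implementation: in real GDR the coarse stage is a language model autoregressively generating the CID, while here a nearest‑centroid lookup plays that role; all function and variable names are hypothetical.

```python
import numpy as np

def coarse_stage(query_vec, cid_centroids):
    """Stage 1 (coarse) stand-in: map the query to a document-cluster
    identifier (CID). GDR does this with a language model's constrained
    generation; here we approximate it with a nearest-centroid lookup."""
    sims = cid_centroids @ query_vec
    return int(np.argmax(sims))

def fine_stage(query_vec, cluster_docs):
    """Stage 2 (fine): dense-vector matching restricted to the documents
    inside the selected cluster, ranked by inner-product similarity."""
    doc_ids, doc_vecs = zip(*cluster_docs)
    sims = np.stack(doc_vecs) @ query_vec
    order = np.argsort(-sims)
    return [doc_ids[i] for i in order]

def gdr_retrieve(query_vec, cid_centroids, clusters):
    """Coarse-to-fine retrieval: pick a CID, then rank within its cluster."""
    cid = coarse_stage(query_vec, cid_centroids)
    return cid, fine_stage(query_vec, clusters[cid])
```

Because the fine stage only scores documents within one cluster, the dense index stays small per query while the deep query‑document interaction of the coarse stage is preserved.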
Key Techniques
Memory‑friendly document‑cluster identifier construction that respects the model’s memory capacity.
Adaptive negative‑sampling strategy within document clusters to strengthen intra‑cluster matching.
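A rough sketch of both techniques, under stated assumptions: the paper constructs cluster identifiers so their count respects the model's memory budget (here simplified to one‑level k‑means over document embeddings), and its negative sampling is adaptive (here simplified to uniform draws from the positive's own cluster). All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_cluster_ids(doc_vecs, k, iters=20):
    """Memory-friendly CID construction sketch: cluster document embeddings
    with one-level k-means so each document gets a cluster identifier.
    (The paper additionally budgets the number of clusters against the
    language model's memory capacity.)"""
    centroids = doc_vecs[rng.choice(len(doc_vecs), k, replace=False)]
    for _ in range(iters):
        dists = ((doc_vecs[:, None] - centroids[None]) ** 2).sum(-1)
        assign = np.argmin(dists, axis=1)
        for c in range(k):
            members = doc_vecs[assign == c]
            if len(members):
                centroids[c] = members.mean(0)
    return assign, centroids

def sample_intra_cluster_negative(doc_idx, assign):
    """Intra-cluster negative sampling sketch: draw a negative from the same
    cluster as the positive, so training must separate near neighbours.
    (Uniform draw here; the paper's strategy is adaptive.)"""
    same = np.flatnonzero(assign == assign[doc_idx])
    same = same[same != doc_idx]
    return int(rng.choice(same)) if len(same) else None
```

Sampling hard negatives from within the positive's own cluster is what strengthens the fine‑grained, intra‑cluster matching that Stage 2 relies on.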
Experimental Setup
Experiments are conducted on the Natural Questions (NQ) dataset with various corpus sizes (334K, 1M, 2M, 4M). Baselines include BM25 (sparse retrieval), DPR and AR2 (dense retrieval), and NCI (generative retrieval). Evaluation metrics are Recall@k (R@k) and Accuracy@k (Acc@k).
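For reference, the two metrics can be computed as below. These are the standard textbook definitions; the paper's exact variants (e.g. whether Acc@k checks answer‑string containment on NQ) may differ slightly.

```python
def recall_at_k(retrieved, relevant, k):
    """R@k: fraction of the relevant documents that appear in the top-k
    retrieved results, for one query."""
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def acc_at_k(retrieved, answer_docs, k):
    """Acc@k: 1 if any top-k result is an answer-bearing document, else 0
    (a per-query hit rate, averaged over the query set)."""
    return int(any(d in answer_docs for d in retrieved[:k]))
```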
Main Results
GDR achieves an average improvement of 3.0 points in R@k and ranks second in Acc@k on NQ, demonstrating superior recall and scalability. When the candidate corpus is scaled up, GR’s performance drops by more than 15%, while GDR’s degradation stays around 3.5%, comparable to the sparse and dense baselines.
Scalability & New Document Insertion
GDR can incorporate new documents by assigning them to the nearest cluster centroid and updating the vector index, incurring only a 1.9 % drop in R@100 versus an 18.3 % drop for GR, without retraining the entire model.
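The insertion step described above can be sketched in a few lines, assuming a centroid matrix and a CID‑keyed cluster store (hypothetical names): the new document is routed to its nearest centroid and only the dense index grows, leaving the CID‑generating model untouched.

```python
import numpy as np

def insert_document(doc_id, doc_vec, centroids, clusters):
    """Index-update sketch: assign a new document to the nearest existing
    cluster centroid and append its dense vector to that cluster's index.
    No retraining of the CID-generating model is required."""
    cid = int(np.argmin(((centroids - doc_vec) ** 2).sum(-1)))
    clusters.setdefault(cid, []).append((doc_id, doc_vec))
    return cid
```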
Limitations
Despite improvements, GDR still lags behind pure dense or sparse retrieval in some aspects due to the autoregressive memory mechanism’s latency and capacity constraints.
The full paper is available at https://arxiv.org/abs/2401.10487.
Xiaohongshu Tech REDtech
Official account of the Xiaohongshu tech team, sharing technical innovations and engineering insights.