Tagged articles
8 articles
AI Engineer Programming
Apr 8, 2026 · Artificial Intelligence

TF‑IDF vs BM25: Statistical Foundations of Text Retrieval for RAG

The article explains how TF‑IDF and BM25 compute term importance, compares their strengths and weaknesses, and shows how these sparse retrieval methods integrate with neural retrieval techniques such as DPR, SPLADE, and ColBERT in Retrieval‑Augmented Generation systems, concluding with a hybrid retrieval decision matrix.

BM25 · Hybrid Retrieval · RAG
14 min read
DataFunSummit
Dec 12, 2024 · Artificial Intelligence

Exploring Generative Retrieval: Memory Mechanisms, GDR Paradigm, and Practical Applications

This presentation examines generative retrieval (GDR), compares it with sparse and dense retrieval paradigms, analyzes memory‑mechanism challenges from an EACL 2024 paper, reports experimental findings, proposes a hybrid GDR‑dense approach, and outlines real‑world application scenarios and future directions.

GDR · Generative Retrieval · Memory Mechanism
13 min read
Xiaohongshu Tech REDtech
Jul 29, 2024 · Artificial Intelligence

Scaling Laws for Dense Retrieval: Empirical Study of Model Size, Training Data, and Annotation Quality

The award‑winning study shows that dense retrieval performance follows precise power‑law scaling with model size, training data quantity, and annotation quality, introduces contrast entropy for evaluation, validates joint scaling formulas on MS MARCO and T2Ranking, and uses cost models to guide budget‑optimal resource allocation.

Model Size · annotation quality · contrast entropy
13 min read
DataFunTalk
Aug 13, 2023 · Artificial Intelligence

Model Innovation Forum: Advances in Recommendation Systems and Dense Retrieval

The Model Innovation Forum brings together academic and industry experts to discuss cutting‑edge recommendation system models, including efficient dense retrieval, Baidu ranking architectures, offline reinforcement learning, and large‑model inspirations, offering attendees deep technical insights and practical applications.

Model Innovation · artificial intelligence · dense retrieval
10 min read
Baidu Geek Talk
Mar 13, 2023 · Artificial Intelligence

Recent Advances in Sparse and Dense Retrieval for Search Engines

The article surveys recent academic advances in both sparse inverted‑index and dense semantic retrieval for large‑scale search, highlighting key papers on decision frameworks, benchmarks, sparse lexical models, dual encoders, and hybrid techniques, while discussing challenges such as single‑vector limits and proposing multi‑view and hybrid solutions.

dense retrieval · information retrieval · pretraining
12 min read
DataFunTalk
Sep 13, 2022 · Artificial Intelligence

Intelligent Question Answering in QQ Browser Search: Background, Key Technologies, and Frontier Research

This article presents an in‑depth overview of intelligent question answering in QQ Browser search, covering its background, the core KBQA and DeepQA technologies, system architecture, challenges, recent advances such as end‑to‑end, knowledge‑guided and multimodal QA, and practical Q&A for deployment.

AI · Deep Learning · Knowledge Graph
22 min read
Baidu Geek Talk
Nov 29, 2021 · Artificial Intelligence

Pretrained Models for First-Stage Information Retrieval: A Comprehensive Review

This comprehensive review by Dr. Fan Yixing surveys how pretrained language models have transformed first‑stage information retrieval, tracing the shift from traditional term‑based methods to neural sparse, dense, and hybrid approaches, and discussing key challenges such as hard‑negative mining, joint indexing‑representation learning, and generative‑discriminative training.

Hybrid Retrieval · Neural IR · Sparse Retrieval
15 min read