Tagged articles

Information Retrieval

109 articles · Page 1 of 2

Jun 10, 2026 · Artificial Intelligence

Agent Era Information Retrieval: A Denoising-First Perspective (SIGIR 2026 Review)

The SIGIR 2026 review argues that as large language models become the primary consumers of retrieved results, information retrieval must shift its core objective from pure recall to denoising, presenting a five‑stage pipeline, controlled experiments, and a detailed attribution framework for noise sources.

Agent · Denoising · Information Retrieval

0 likes · 11 min read

AI Engineer Programming

Jun 6, 2026 · Artificial Intelligence

How Query Rewriting Boosts Retrieval in RAG Systems

In RAG applications, ambiguous user queries often hinder retrieval effectiveness, so rewriting queries before search—through normalization, synonym expansion, linguistic rules, LLM‑based generation, query decomposition, and multi‑view strategies—can improve relevance, but must avoid over‑expansion, semantic drift, and added latency.

Information Retrieval · LLM · Prompt engineering

0 likes · 11 min read

DeepHub IMBA

May 27, 2026 · Artificial Intelligence

Testing Four Non‑Vector RAG Approaches: BM25, GraphRAG, Tree Search, and Agentic Search

The article evaluates four non‑vector Retrieval‑Augmented Generation methods—BM25 lexical search, GraphRAG graph traversal, Tree‑Search document navigation, and an Agentic search loop—using a small JSON‑based corpus, showing each method’s strengths, weaknesses, and when to combine them for production‑grade retrieval.

Agentic Search · BM25 · GraphRAG

0 likes · 12 min read

AI Engineer Programming

May 15, 2026 · Artificial Intelligence

Hybrid Retrieval in RAG: Combining BM25 Precision with Dense Vector Semantics

The article examines why pure vector retrieval in RAG lacks lexical precision and traceable relevance scores, explains BM25's strengths, and presents hybrid retrieval architectures—including RRF and linear combination fusion—as well as the trade‑offs of externalizing the fusion process.

BM25 · Hybrid Search · Information Retrieval

0 likes · 9 min read

DeepHub IMBA

Apr 30, 2026 · Artificial Intelligence

Why Real RAG Systems Need Both BM25 and Vector Search

The article analyzes how BM25 excels at exact token matching while vector embeddings capture semantic intent, explains their distinct failure modes, and shows that a hybrid retriever—combined with metadata filtering, proper chunking, and reciprocal rank fusion—delivers the most reliable results for RAG pipelines.

BM25 · Embedding · Hybrid Retrieval

0 likes · 17 min read

PaperAgent

Apr 27, 2026 · Artificial Intelligence

A Comprehensive Review of Modern LLM Agent Memory Frameworks

The article surveys recent LLM‑based agent memory research, presenting a unified framework that breaks memory systems into four components, detailing their design choices, experimental evaluation on LOCOMO and LONGMEMEVAL, key findings, and a new low‑token SOTA architecture.

Agent Memory · Evaluation · Information Retrieval

0 likes · 8 min read

AI Explorer

Apr 22, 2026 · Artificial Intelligence

How AI‑Powered TrendRadar Provides a Private, Automated Info Radar to Cut Through Noise

TrendRadar, an open‑source Python project with over 54,000 GitHub stars, combines multi‑platform aggregation, large‑model AI filtering, sentiment analysis, and multi‑channel push to deliver a private, Docker‑deployable information radar that lets users define keywords and receive concise, translated summaries in seconds.

AI · Docker · Information Retrieval

0 likes · 6 min read

AI Engineer Programming

Apr 8, 2026 · Artificial Intelligence

TF‑IDF vs BM25: Statistical Foundations of Text Retrieval for RAG

The article explains how TF‑IDF and BM25 compute term importance, compares their strengths and weaknesses, and shows how these sparse retrieval methods integrate with dense retrieval techniques such as DPR, SPLADE, and ColBERT in Retrieval‑Augmented Generation systems, concluding with a hybrid retrieval decision matrix.

BM25 · Hybrid Retrieval · Information Retrieval

0 likes · 14 min read

AI Explorer

Apr 2, 2026 · Artificial Intelligence

AI Agent Skill for Global Hot‑Topic Tracking and Data‑Driven Insights

The open‑source /last30days AI skill aggregates and analyzes recent hot content from Reddit, X, YouTube, Hacker News, Bluesky, Polymarket, Instagram Reels and TikTok, applying a multi‑signal quality ranking and data‑driven narrative to deliver structured, citation‑rich briefings that can be integrated into Claude Code or other workflows.

AI Agent · Claude Code · Information Retrieval

0 likes · 7 min read

Wu Shixiong's Large Model Academy

Apr 2, 2026 · Artificial Intelligence

How Smart Chunk Splitting Boosts RAG Recall from 67% to 91%

This article examines the critical role of chunk splitting in Retrieval‑Augmented Generation systems, comparing three generations of methods—from fixed‑size token cuts to sentence‑aware and semantic‑aware strategies—showing how refined chunking, overlap tuning, and metadata design raise Recall@5 from 0.67 to 0.91 while addressing table, list, and long‑section challenges.

Chunking · Information Retrieval · LLM

0 likes · 24 min read

Wu Shixiong's Large Model Academy

Mar 26, 2026 · Artificial Intelligence

Why Hybrid Retrieval Beats Pure Vector Search: BM25, RRF, and Real‑World Gains

This article explains why combining BM25 with dense vector search using Reciprocal Rank Fusion (RRF) improves recall for both exact‑term and semantic queries in a financial‑insurance document corpus, details the underlying algorithms, parameter choices such as k=60, provides Python implementations, and shows measurable performance gains in production.

BM25 · FAISS · Hybrid Retrieval

0 likes · 28 min read

o-ai.tech

Mar 16, 2026 · Industry Insights

Your AI Answers Could Be Shaped by Paid Brand Editing

Brands are increasingly paying to embed favorable content on platforms like Zhihu and Xiaohongshu, a practice dubbed Generative Engine Optimization (GEO), which manipulates the information AI retrieves, making many AI-generated product recommendations subtly biased without any disclosure.

AI bias · GEO · Generative Engine Optimization

0 likes · 8 min read

PaperAgent

Feb 6, 2026 · Artificial Intelligence

How xMemory Cuts Tokens by 30% While Boosting Agent QA Scores Over 10 Points

The paper introduces xMemory, a hierarchical "split‑aggregate‑retrieve" framework that reduces token usage by up to 30% and improves QA performance by more than 10 points in long‑range agent conversations, outperforming traditional RAG across multiple LLMs.

Agent Memory · Hierarchical Retrieval · Information Retrieval

0 likes · 8 min read

JD Cloud Developers

Feb 4, 2026 · Artificial Intelligence

How Deep Research Transforms LLMs into Autonomous AI Researchers

This article examines Deep Research, an AI system that adds autonomous planning and deep reasoning to large language models, enabling them to browse the web, perform long‑chain reasoning, and generate professional, citation‑rich reports for complex tasks such as industry trend analysis and technical competitive research.

AI research · Autonomous Agents · Information Retrieval

0 likes · 22 min read

JD Tech Talk

Feb 4, 2026 · Artificial Intelligence

How Deep Research Turns LLMs into Autonomous AI Researchers

This article explains the background, core features, underlying ReAct‑based architecture, and engineering solutions of Deep Research—a system that equips large language models with autonomous planning, long‑chain reasoning, and professional report generation to tackle complex information‑intensive tasks.

AI research · Autonomous Agents · Information Retrieval

0 likes · 21 min read

PaperAgent

Jan 9, 2026 · Artificial Intelligence

Why Traditional RAG Breaks the Chain and How SentGraph Fixes It

The article explains why traditional retrieval‑augmented generation fails in multi‑hop scenarios due to overly large chunks, introduces SentGraph’s sentence‑level graph that trims retrieval units and encodes logical relations, details offline construction and online inference steps, and shows experimental gains and remaining limitations.

Information Retrieval · LLM · Multi-hop QA

0 likes · 7 min read

JD Cloud Developers

Nov 21, 2025 · Artificial Intelligence

Why Chunking Strategy Makes or Breaks RAG Performance

This article explains how different chunking methods—fixed size, semantic, recursive, document‑based, agent‑driven, sentence‑level, and paragraph‑level—affect Retrieval‑Augmented Generation, offering practical guidelines, metrics, and optimization tips for real‑world deployments.

AI · Chunking · Information Retrieval

0 likes · 9 min read

Xuanwu Backend Tech Stack

Oct 22, 2025 · Artificial Intelligence

How Rerank Transforms Retrieval‑Augmented Generation for Accurate AI Answers

This article explains the limitations of basic Retrieval‑Augmented Generation (RAG), introduces Rerank technology as a two‑step refinement process, compares dual‑encoder and cross‑encoder methods, and reviews popular Rerank models to help developers build more precise AI‑driven retrieval systems.

Artificial Intelligence · Information Retrieval · RAG

0 likes · 10 min read

Alibaba Cloud Developer

Sep 1, 2025 · Artificial Intelligence

Mastering RAG: From Chunking to Hybrid Search for Better AI Retrieval

This article delves into the implementation details and optimization strategies of Retrieval‑Augmented Generation (RAG), covering document chunking, index enhancement, embedding, hybrid search, and re‑ranking, and provides practical code examples to help developers move from quick deployment to deep performance tuning.

AI · Chunking · Embedding

0 likes · 19 min read

Instant Consumer Technology Team

Jun 4, 2025 · Artificial Intelligence

Unlocking Retrieval-Augmented Generation: Theory, Practice, and Future Trends

This comprehensive article examines Retrieval‑Augmented Generation (RAG), covering its historical evolution, core theory, implementation variants, practical code examples, diverse applications, current controversies, and future research directions within the AI and NLP landscape.

Artificial Intelligence · Information Retrieval · RAG

0 likes · 21 min read

Baidu Geek Talk

Apr 7, 2025 · Artificial Intelligence

COBRA: Unified Generative Recommendations with Cascaded Sparse-Dense Representations

COBRA, Baidu’s new generative retrieval framework, unifies sparse ID generation and dense vector encoding through a cascaded architecture that first predicts hierarchical IDs then refines them into dense representations, achieving state‑of‑the‑art recall, NDCG and conversion gains across public benchmarks and large‑scale advertising production.

AI · COBRA · Information Retrieval

0 likes · 13 min read

Alibaba Cloud Developer

Mar 25, 2025 · Artificial Intelligence

Boost Your AI Search Skills: Advanced Prompt & Query Tricks

This guide explains how to leverage AI tools with deep web‑search capabilities, covering site‑specific queries, wildcard operators, date ranges, Boolean logic, and effective prompt engineering techniques—including Socratic questioning and CRISPE framework—to improve information retrieval accuracy and efficiency across various domains.

AI · Information Retrieval · Large Language Models

0 likes · 8 min read

Architect

Mar 22, 2025 · Artificial Intelligence

Understanding and Mitigating Failures in Retrieval‑Augmented Generation (RAG) Systems

Retrieval‑augmented generation (RAG) pairs external knowledge retrieval with large language models to improve answer accuracy, but it is prone to retrieval mismatches, algorithmic flaws, chunking issues, embedding biases, inefficiencies, generation errors, reasoning limits, formatting problems, system‑level failures, and high resource costs. This article analyzes each failure mode and offers solutions.

AI Reliability · Information Retrieval · LLM

0 likes · 32 min read

Baidu Tech Salon

Mar 21, 2025 · Artificial Intelligence

Semantic Embedding with Large Language Models: A Comprehensive Survey

This survey reviews the evolution of semantic embedding—from Word2vec and GloVe to BERT, Sentence‑BERT, and recent contrastive methods—then examines how large language models improve embeddings via synthetic data generation and backbone architectures, detailing techniques such as contrastive prompting, in‑context learning, knowledge distillation, and discussing resource, privacy, and interpretability challenges.

In-Context Learning · Information Retrieval · NLP

0 likes · 27 min read

JD Tech

Feb 5, 2025 · Artificial Intelligence

Tech Insight: Highlights of Ten JD Retail Technology Papers Published in Top AI Conferences (2024)

Tech Insight presents concise overviews of ten JD retail technology papers accepted at top AI conferences in 2024, covering topics such as open‑vocabulary object detection, multi‑scenario ranking, diversity‑aware re‑ranking, a diversified product search dataset, semi‑supervised query classification, plug‑in CTR models, and methods to mitigate LLM hallucinations.

AI · Information Retrieval · Ranking

0 likes · 17 min read

Baidu Tech Salon

Jan 21, 2025 · Artificial Intelligence

How AI Is Transforming Legal Research: Inside the YuanDian WenDa Smart Q&A Engine

Faced with billions of legal documents and the shortcomings of keyword search, Chinese legal professionals are turning to the AI‑powered YuanDian WenDa engine, which leverages Baidu's Wenxin model, a structured legal database, and prompt‑engineering to deliver trustworthy, citation‑rich answers and rapid research reports.

AI · Information Retrieval · LegalTech

0 likes · 10 min read

JD Retail Technology

Jan 21, 2025 · Artificial Intelligence

Tech Insight: Selected JD Retail Technology Papers in Artificial Intelligence (2024)

Tech Insight highlights ten 2024 JD Retail Technology AI papers presented at top conferences—including CVPR, SIGIR, WWW, AAAI and IJCAI—that advance open‑vocabulary object detection, unified search‑recommendation, pre‑ranking consistency, diversity‑aware re‑ranking, a diversified product‑search dataset, graph‑based query classification, plug‑in CTR models, parallel ad‑ranking, trajectory‑based CTR stability, and task‑aware decoding for large language models.

Artificial Intelligence · CTR Prediction · E‑commerce

0 likes · 20 min read

AsiaInfo Technology: New Tech Exploration

Dec 30, 2024 · Artificial Intelligence

How RAG Fusion Revolutionizes Information Retrieval: Mechanisms, Benefits, and Future Directions

This article examines RAG Fusion, a retrieval‑augmented generation technique that combines multi‑query generation, reciprocal rank fusion, and contextual relevance improvements to boost search accuracy, discusses its workflow, mathematical foundation, advantages, challenges, real‑world applications, and emerging research directions.

AI · Information Retrieval · RAG Fusion

0 likes · 15 min read

Baobao Algorithm Notes

Dec 18, 2024 · Artificial Intelligence

How STAR Enables Training‑Free Recommendations with Large Language Models

The article reviews the STAR framework, a training‑free recommendation approach that leverages large language model embeddings and collaborative co‑occurrence scores to retrieve and rank items, and evaluates its performance, hyper‑parameter effects, and ablation studies against existing LLM‑based recommender methods.

Artificial Intelligence · Information Retrieval · LLM

0 likes · 10 min read

Alibaba Cloud Developer

Nov 18, 2024 · Artificial Intelligence

Solving Knowledge Challenges in Retrieval‑Augmented Generation: Practical Optimizations

This article shares a half‑year of hands‑on experience with Retrieval‑Augmented Generation, analyzing why simple RAG setups often feel unintelligent, identifying three core knowledge issues, and presenting concrete optimization strategies—including chunking, knowledge expansion, and tag‑based conflict resolution—to improve retrieval and generation performance in low‑resource environments.

AI · Information Retrieval · Large Language Models

0 likes · 25 min read

Aikesheng Open Source Community

Nov 12, 2024 · Artificial Intelligence

ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Large Language Models

ChatDBA is a conversational AI system built by Shanghai Aikesheng that employs large language models and Retrieval‑Augmented Generation to help database administrators diagnose faults, learn domain knowledge, and generate or optimize SQL. The article describes a redesigned architecture that addresses early‑stage shortcomings and outlines future enhancements.

ChatDBA · Fault diagnosis · Information Retrieval

0 likes · 10 min read

Baobao Algorithm Notes

Sep 10, 2024 · Artificial Intelligence

Do LLMs Silence Human Voices? Unveiling the ‘Spiral of Silence’ in Retrieval‑Augmented Generation

This article reviews the ACL 2024 paper that investigates how large language model‑generated text influences retrieval‑augmented generation pipelines, revealing short‑term retrieval gains but a long‑term “spiral of silence” that marginalizes human‑generated content and homogenizes open‑domain QA results.

AI impact · Information Retrieval · LLM

0 likes · 9 min read

Xiaohongshu Tech REDtech

Jul 29, 2024 · Artificial Intelligence

Scaling Laws for Dense Retrieval: Empirical Study of Model Size, Training Data, and Annotation Quality

The award‑winning study shows that dense retrieval performance follows precise power‑law scaling with model size, training data quantity, and annotation quality, introduces contrast entropy for evaluation, validates joint scaling formulas on MS MARCO and T2Ranking, and uses cost models to guide budget‑optimal resource allocation.

Information Retrieval · Model Size · annotation quality

0 likes · 13 min read

Meituan Technology Team

Jun 27, 2024 · Artificial Intelligence

Meituan Technical Team's Three Papers Accepted at SIGIR 2024: Ad Auction Integration, Federated Recommendation, and POI Recommendation

The article highlights three Meituan research papers accepted at SIGIR 2024—covering deep automated mechanism design for ad auction, a retrieval‑enhanced vertical federated recommendation framework, and disentangled contrastive hypergraph learning for next POI recommendation—and announces an online sharing event where the authors will present their work.

AI research · Ad Auction · Federated Recommendation

0 likes · 9 min read

Xiaohongshu Tech REDtech

Apr 28, 2024 · Artificial Intelligence

Generative Dense Retrieval: Memory Can Be a Burden

The paper introduces Generative Dense Retrieval (GDR), a two‑stage retrieval framework that first maps queries to memory‑efficient document‑cluster identifiers and then uses dense vectors to locate individual documents, achieving higher recall and better scalability than traditional generative retrieval while incurring modest latency and capacity trade‑offs.

Information Retrieval · Memory Mechanism · generative dense retrieval

0 likes · 13 min read

DataFunTalk

Mar 15, 2024 · Artificial Intelligence

Application of Agent Technology in Voice Assistant Scenarios

Senior algorithm engineer Qi Jianwei from Xiaomi presents a comprehensive overview of building a large‑model‑centric Agent framework for voice assistants, covering prompt design, information retrieval, RAG processes, and future optimization directions to enhance performance and stability.

Agent · Information Retrieval · Prompt engineering

0 likes · 2 min read

Ops Development & AI Practice

Mar 14, 2024 · Artificial Intelligence

Do Vector Embeddings Offer the Same Consistency as Hash Functions?

Vectorization and hashing are both essential for handling large datasets. This article examines whether vector embeddings can match the deterministic consistency of hash functions, comparing how each handles collisions, what each implies for data structure design, and how well each suits retrieval and machine‑learning tasks.

AI · Hashing · Information Retrieval

0 likes · 8 min read

Ops Development & AI Practice

Mar 13, 2024 · Artificial Intelligence

How Vector Retrieval Powers AI Model Training and Real-World Applications

Vector retrieval, based on converting data into high‑dimensional vectors and measuring similarity, enables fast, accurate search across massive datasets, supporting AI tasks such as search engines, recommendation, NLP, and computer vision, and plays a crucial role in large‑model training for data selection, anomaly detection, and model optimization.

AI training · Information Retrieval · Recommendation Systems

0 likes · 6 min read

php Courses

Feb 18, 2024 · Backend Development

Implementing Information Retrieval and SEO with PHP

This article explains the fundamentals of information retrieval and search engine optimization and provides practical PHP code examples for keyword search, full‑text search, and common SEO techniques such as keyword, internal, and external link optimization.

Information Retrieval · SEO · Web Optimization

0 likes · 7 min read

Zhengcaiyun Technology

Dec 19, 2023 · Backend Development

Principles and Simple Implementation of a Search Engine in Go

This article explains the fundamental concepts of search engine technology—including forward and inverted indexes, tokenizers, stop words, synonym handling, ranking algorithms, and NLP integration—and provides a concise Go implementation with code examples and performance testing.

Go · Information Retrieval · NLP

0 likes · 21 min read

php Courses

Aug 31, 2023 · Backend Development

Implementing Information Retrieval and SEO with PHP

This article explains the fundamentals of information retrieval and search engine optimization, demonstrating how to implement keyword and full‑text search using PHP and MySQL, and presenting practical PHP techniques for keyword, internal, and external link optimization to improve website visibility.

Backend Development · Information Retrieval · SEO

0 likes · 6 min read

JD Cloud Developers

Aug 22, 2023 · Artificial Intelligence

A Practical Guide to Recommendation System Architecture and Methods

This article provides a concise overview of recommendation systems, covering their definition, core framework of recall, ranking, and re‑ranking, various recall strategies including multi‑path and vector‑based methods, similarity calculations, and practical implementation details such as AB testing and code examples.

AB testing · Information Retrieval · Ranking

0 likes · 14 min read

360 Tech Engineering

Aug 16, 2023 · Artificial Intelligence

Improving ChatGPT Real‑time Accuracy with Document Retrieval: A Practical Approach

This article examines ChatGPT's limitations in real‑time information and answer accuracy, then proposes a retrieval‑augmented method that combines up‑to‑date document search with large language models to deliver more reliable and current responses across various scenarios.

AI · Accuracy · ChatGPT

0 likes · 7 min read

Architect

May 29, 2023 · Artificial Intelligence

Understanding Embeddings and Vector Databases for LLM Applications

This article explains what embeddings and vector databases are, how they are generated with models like OpenAI's Ada, why they enable semantic search and help overcome large language model token limits, and demonstrates a practical workflow for retrieving relevant document chunks using cosine similarity.

Information Retrieval · LLM · embeddings

0 likes · 7 min read

360 Quality & Efficiency

May 26, 2023 · Artificial Intelligence

Enhancing ChatGPT Real‑Time Accuracy through Document Retrieval: A Practical Approach

The article examines ChatGPT's limitations in timeliness and factual accuracy, especially for security‑related queries, and proposes a method that combines external document search with the model to deliver up‑to‑date, reliable answers across intelligent‑assistant scenarios.

Accuracy · Artificial Intelligence · ChatGPT

0 likes · 8 min read

Baidu Geek Talk

Mar 13, 2023 · Artificial Intelligence

Recent Advances in Sparse and Dense Retrieval for Search Engines

The article surveys recent academic advances in both sparse inverted‑index and dense semantic retrieval for large‑scale search, highlighting key papers on decision frameworks, benchmarks, sparse lexical models, dual encoders, and hybrid techniques, while discussing challenges such as single‑vector limits and proposing multi‑view and hybrid solutions.

Information Retrieval · Ranking · dense retrieval

0 likes · 12 min read

DataFunTalk

Jan 18, 2023 · Artificial Intelligence

Search Relevance System Architecture and Practices in QQ Browser

This article presents the QQ Browser search relevance team's experience integrating QQ Browser and Sogou search systems, detailing business overview, relevance system evolution, algorithm architecture, evaluation metrics, deep semantic matching, relevance calibration, and model distillation techniques to improve search relevance performance.

Evaluation Metrics · Information Retrieval · model distillation

0 likes · 31 min read

Tencent Cloud Developer

Jan 9, 2023 · Artificial Intelligence

Search Relevance Architecture and Practices in QQ Browser

The QQ Browser search relevance team describes a unified, billion‑scale architecture that combines a main and vertical subsystem, a pyramid‑shaped ranking pipeline (recall, coarse, fine), a dedicated GPU‑accelerated relevance service, and hybrid semantic‑matching models (dual‑tower, BERT, matrix fusion) evaluated with offline and online metrics to deliver accurate, fresh, and authoritative results for diverse content and long‑tail queries.

Deep Learning · Evaluation Metrics · Information Retrieval

0 likes · 28 min read

IT Services Circle

Jan 9, 2023 · Fundamentals

11 Google Search Techniques to Find Information Faster

This article presents eleven practical Google search tricks—including keyword matching, exact phrases, site‑specific queries, file‑type filters, and time ranges—to help programmers and other users retrieve relevant information more efficiently and improve overall productivity.

Google · Information Retrieval · Search Tips

0 likes · 6 min read

Su San Talks Tech

Jan 8, 2023 · Fundamentals

11 Powerful Google Search Tricks to Find Information Faster

Discover eleven practical Google search techniques—from using spaces, vertical bars, and quotes to applying wildcards, site filters, filetype limits, and time ranges—that help programmers and anyone else locate precise information quickly and efficiently.

Google Search · Information Retrieval · Tips

0 likes · 6 min read

Alimama Tech

Nov 9, 2022 · Artificial Intelligence

Graph-based Weakly Supervised Framework for Semantic Relevance Learning in E-commerce

The paper introduces a graph‑based weakly supervised contrastive learning framework that uses heterogeneous user‑behavior graphs, e‑commerce‑specific augmentations, and a hybrid fine‑tuning/transfer learning strategy to improve semantic relevance matching between queries and product titles, achieving significant gains on a large‑scale Taobao dataset.

Graph Neural Networks · Information Retrieval · Weak Supervision

0 likes · 12 min read

DataFunTalk

Oct 11, 2022 · Artificial Intelligence

Search vs Recommendation vs Advertising: Concepts, Differences, and System Architectures

This article provides an overview of search, recommendation, and advertising as core internet services, comparing their problem definitions, business goals, algorithmic models, and system architectures across web, e‑commerce, and O2O scenarios, while outlining historical development and key industry examples.

AI · Advertising · Information Retrieval

0 likes · 13 min read

Meituan Technology Team

Jul 21, 2022 · Artificial Intelligence

Overview of Meituan Technical Team Papers Featured at ACM SIGIR 2022 and Related Works

The article highlights ten representative Meituan technical papers accepted at ACM SIGIR 2022, spanning personalized opinion tagging, cross‑domain sentiment classification, dialogue summarization transfer, universal retrieval, CTR prediction, image behavior modeling, and topic segmentation, each summarized with abstracts and download links for researchers.

Information Retrieval · Recommendation Systems · cross-domain learning

0 likes · 25 min read

Hulu Beijing

May 26, 2022 · Artificial Intelligence

Why Vector Retrieval Outperforms Keyword Search for Personalized Video Discovery

This article explains how modern video platforms combine traditional keyword retrieval with deep‑learning‑based vector retrieval, detailing model architectures, attention mechanisms, personalization features, offline experiments, and online A/B results that show significant improvements in recall, relevance, and user experience.

Deep Learning · Information Retrieval · Search Engine

0 likes · 18 min read

Hulu Beijing

May 18, 2022 · Artificial Intelligence

How Hulu Optimizes Video Search for TV Remotes and Short Queries

This article examines Hulu's video search engine, highlighting challenges such as ensuring relevance beyond text matching, handling ultra‑short queries on TV remotes, addressing content gaps, and integrating AI‑driven query understanding, retrieval, and ranking to improve user experience.

Hulu · Information Retrieval · media streaming

0 likes · 7 min read

Alimama Tech

Apr 6, 2022 · Artificial Intelligence

Alibaba's Five Papers Accepted at SIGIR 2022

Alibaba’s research team had five papers accepted at the prestigious SIGIR 2022 conference in Madrid, covering innovations such as joint ad‑ranking and creative selection, personalized bundle generation, calibrated neural predictions, disentangled counterfactual regression, and cold‑start user recommendation, showcasing strong expertise in information retrieval and online advertising.

Calibration · Information Retrieval · Online Advertising

0 likes · 8 min read

DataFunTalk

Mar 16, 2022 · Artificial Intelligence

A Survey of Entity Linking: Definitions, Methods, and Applications

This article provides a comprehensive overview of entity linking, detailing its definition, the two-stage pipeline of entity recognition and disambiguation, common methodologies such as candidate generation and ranking, advanced approaches, challenges like unlinkable mentions, and various applications in knowledge graphs, text mining, and question answering.

Information Retrieval · entity linking · natural language processing

0 likes · 15 min read

Baidu Geek Talk

Nov 29, 2021 · Artificial Intelligence

Pretrained Models for First-Stage Information Retrieval: A Comprehensive Review

This comprehensive review by Dr. Fan Yixing surveys how pretrained language models have transformed first‑stage information retrieval, tracing the shift from traditional term‑based methods to neural sparse, dense, and hybrid approaches, and discussing key challenges such as hard‑negative mining, joint indexing‑representation learning, and generative‑discriminative training.

Hybrid Retrieval · Information Retrieval · Neural IR

0 likes · 15 min read

ByteDance SE Lab

Oct 29, 2021 · Artificial Intelligence

What Is a Knowledge Graph? From Basics to Embedding Techniques

This article introduces knowledge graphs, defining them as semantic networks or multi‑relational graphs, explains entities and relations, compares RDF and graph‑database storage, outlines construction steps including entity extraction and ontology building, reviews embedding models like TransE/H/R/D, and explores applications in search, finance, recommendation, and language models.

AI · Information Retrieval · graph embedding

0 likes · 22 min read

DataFunTalk

Sep 24, 2021 · Artificial Intelligence

Intelligent Question Answering in QQ Browser Search Engine: KBQA, DeepQA, and IRQA

This article presents the architecture, techniques, and practical solutions behind intelligent question answering in QQ Browser's search engine, covering knowledge‑graph based QA (KBQA), machine‑reading‑comprehension QA (DeepQA), and information‑retrieval QA (IRQA), and discusses system design, model optimization, and future directions.

AI · Information Retrieval · Search Engine

0 likes · 23 min read

DataFunTalk

Sep 3, 2021 · Artificial Intelligence

Construction and Application of an Interest Point Graph for Content Understanding in Information Feed Recommendation

This article explains how large‑scale UGC data is used to build a multi‑type interest point graph, describes the mining, hierarchical and associative relationship extraction methods, and demonstrates how the graph improves content understanding and recommendation accuracy while mitigating filter‑bubble effects.

Artificial Intelligence · Graph Neural Networks · Information Retrieval

0 likes · 25 min read

DataFunTalk

Aug 2, 2021 · Databases

From Text Search to Vector Search: Generalizing Unstructured Data Retrieval

The article explains why traditional text‑based search engines like ElasticSearch struggle with modern multimodal data, introduces vector databases that store implicit semantic embeddings, and proposes a generalized search architecture that decouples data‑to‑vector mapping from the engine while leveraging clustering or graph indexes for similarity search.

AI · Embedding · Information Retrieval

0 likes · 12 min read

iQIYI Technical Product Team

Jul 30, 2021 · Artificial Intelligence

iQIYI Search Ranking Algorithm Practice – NLP and Search Integration

At iQIYI’s iTech Conference, Zhang Zhigang detailed a full‑stack search ranking system that combines NLP‑driven query analysis, hierarchical indexing, multi‑stage coarse‑to‑fine ranking, Transformer‑based re‑ranking, sparse‑feature DNN enhancements and LIME/SE‑Block explainability, delivering measurable gains in CTR and NDCG for the platform’s video search.

Information Retrieval · NLP · iQIYI

0 likes · 20 min read

We-Design

May 31, 2021 · Product Management

Mastering Search Design: 5 Essential Stages for Better User Experiences

This article breaks down the evolving problem space of search and walks through its five core stages—request acquisition, parsing, matching, ranking, and result presentation—offering practical design decisions and best‑practice tips to create more effective search experiences.

Information Retrieval · Product design · UI/UX

0 likes · 21 min read

DataFunSummit

Apr 8, 2021 · Artificial Intelligence

Evaluation Metrics and Methods for Recommendation Systems

This article explains the purpose, dimensions, and specific quantitative metrics—such as accuracy, surprise, diversity, RMSE, MAE, R‑squared, MAP, MRR, ROC and AUC—used to evaluate recommendation systems, covering user, platform, item, and system perspectives for practical AI deployments.

Evaluation Metrics · Information Retrieval

0 likes · 13 min read

58 Tech

Mar 29, 2021 · Artificial Intelligence

Deep Semantic Model Exploration and Application in 58 Search

This article presents a comprehensive overview of 58 Search's multi‑stage retrieval system, compares term‑match and semantic matching, details the design, training, and optimization of interactive, dual‑tower, and semi‑interactive BERT‑based semantic models, and discusses their practical deployment in ranking and recall stages.

AI · BERT · Information Retrieval

0 likes · 18 min read

DataFunTalk

Dec 14, 2020 · Artificial Intelligence

Query Expansion Techniques: Relevance Modeling vs. Generative Approaches and Future Directions

This article reviews current query expansion methods, contrasting relevance‑based models that rely on terms or entities with generative models that encode whole queries, discusses challenges of handling long and complex queries, and surveys recent research on encoding queries, session modeling, and multi‑task feature integration.

Information Retrieval · NLP · Query Expansion

0 likes · 9 min read

DeWu Technology

Dec 4, 2020 · Fundamentals

Introduction to Search Engine Technology and Information Retrieval

The article surveys core search‑engine technology—document hierarchy, flat and vertical inverted indexes, query operators for building and merging score lists, and ranking models from Boolean and BM25 to language‑model approaches like Indri—providing a foundational overview of information retrieval.

BM25 · Information Retrieval · Search Engine

0 likes · 14 min read

DataFunTalk

Nov 16, 2020 · Artificial Intelligence

Deep Semantic Relevance and Multimodal Video Search at Alibaba Entertainment

The presentation by Alibaba Entertainment's senior algorithm expert details the challenges of video search in the 4G/5G era and describes a comprehensive framework covering business overview, relevance and ranking, multimodal retrieval, deep semantic modeling, dataset construction, and practical deployment techniques.

Deep Learning · Information Retrieval · Multimodal

0 likes · 27 min read

DataFunTalk

Nov 4, 2020 · Artificial Intelligence

Intelligent E‑commerce Search: Architecture, Techniques, and Real‑World Impact

This article explores the evolution of e‑commerce search, detailing why search matters, the technical pipeline—including query preprocessing, entity and intent recognition, knowledge‑graph construction, recall, coarse and fine ranking—and demonstrates substantial performance gains through real‑world case studies.

AI · Information Retrieval · Ranking

0 likes · 16 min read

ITPUB

Oct 23, 2020 · Fundamentals

How General Search Engines Work: From Crawlers to Ranking

This article provides a comprehensive overview of general search engines, covering their classification, core workflow, key modules such as web crawlers, content processing, storage, user query handling, ranking strategies like TF‑IDF and PageRank, as well as anti‑cheat measures and user intent understanding.

Information Retrieval · PageRank · Search Engine

0 likes · 16 min read

DataFunTalk

Oct 13, 2020 · Artificial Intelligence

Query Term Weighting Techniques for Medical Search: Statistical, Supervised, and Neural Approaches

This article reviews the challenges of short‑text query understanding in medical search and surveys a range of term‑weighting methods—including statistical models, supervised weighting, knowledge‑graph‑enhanced extraction, and neural network‑based approaches—highlighting their assumptions, implementations, and practical considerations for improving retrieval relevance.

Information Retrieval · knowledge graph · medical search

0 likes · 18 min read

Meituan Technology Team

Sep 24, 2020 · Artificial Intelligence

Meituan Search Ads Team's Solution for KDD Cup 2020 Multimodalities Recall Track

Meituan’s Search Ads team placed third in the KDD Cup 2020 Multimodalities Recall track by tackling training‑test distribution mismatch with diversified negative sampling and distillation learning, and improving text‑image matching via gated fully‑connected layers, bidirectional attention, and diversified fusion, then ensembling neural and tree models for strong NDCG gains later applied to their ad creative‑selection system.

Distillation · Information Retrieval · KDD Cup

0 likes · 19 min read

DataFunTalk

Sep 16, 2020 · Artificial Intelligence

Hotspot Mining and Event Extraction in Tencent Information Flow: Methods, Framework, and Applications

This article presents Tencent's research on hotspot mining and event extraction for information flow, detailing the challenges of timeliness, comprehensiveness, and heat rationality, the combined use of time‑series analysis, topic detection, clustering, and dynamic‑time‑warping, and the resulting framework and its applications to text, image, and video recommendation.

Event Extraction · Information Retrieval · NLP

0 likes · 17 min read

Swan Home Tech Team

Jul 13, 2020 · Backend Development

Design and Evolution of the DaJia App Search System

This article explains the motivations, requirements, and technical design of the DaJia app's search system, compares relational databases with Lucene‑based solutions, describes the inverted index mechanism, outlines common search workflows, and details the system's three iterative development phases and future improvement plans.

Elasticsearch · Information Retrieval · Lucene

0 likes · 12 min read

58 Tech

Jul 10, 2020 · Artificial Intelligence

Tag Mining for Used‑Car Business: NLP, Word2Vec, and Retrieval Pipeline

This article details the end‑to‑end process of extracting and leveraging tags for used‑car listings, covering data collection, segmentation, NLP‑based tokenization, word‑vector generation, tag‑library construction, and online retrieval flow to improve personalized recall and CTR.

Information Retrieval · NLP · Tagging

0 likes · 19 min read

Programmer DD

Jul 10, 2020 · Fundamentals

How Search Engines Work: Inside Document and Query Processing

This article explains the core components of a search engine—document processing, query processing, and matching—detailing each step from indexing to ranking, and discusses the document features that influence relevance, providing a comprehensive overview of information retrieval fundamentals.

Document processing · Information Retrieval · Query Processing

0 likes · 20 min read

Alibaba Cloud Developer

Jul 1, 2020 · Artificial Intelligence

Optimizing Search Timeliness: From Feature Extraction to Ranking Models

This article explains the concept of timeliness in search ranking, defines content and demand side metrics such as half‑life and time sensitivity, describes evaluation criteria, outlines feature extraction and labeling pipelines, and details the multi‑stage modeling, recall, and indexing strategies used to improve timely search results.

Information Retrieval · Ranking Models · feature engineering

0 likes · 27 min read

Architect

Jun 22, 2020 · Fundamentals

Fundamentals of Search Engine Architecture: Document Processing, Query Processing, Indexing, and Matching

This article explains the core components and processing steps of a search engine—document processor, query processor, indexing, and matching—detailing how documents are normalized, tokenized, filtered, weighted, and stored in an inverted index to support effective information retrieval.

Document processing · Information Retrieval · Query Processing

0 likes · 20 min read

Youku Technology

Jun 8, 2020 · Artificial Intelligence

Video Search Technology and Multi-modal Applications at Alibaba Youku

Alibaba’s Youku video search platform combines six-layer architecture—data extraction, technology integration, recall, relevance, ranking, and intent understanding—leveraging CV, NLP, knowledge graphs, and multi‑modal cues such as face, OCR, and audio recognition to overcome title‑mismatch, entity, and semantic challenges and deliver precise, diverse video retrieval.

Information Retrieval · machine learning · multi-modal learning

0 likes · 15 min read

DataFunTalk

May 21, 2020 · Artificial Intelligence

Query Expansion Techniques for Search Optimization: Models, Data Sources, and Practical Practices

This article reviews the factors influencing search results, explains why query expansion is crucial for improving recall, surveys various sources of expansion terms, describes probabilistic and translation‑based models, and offers practical recommendations for building effective, data‑driven query expansion pipelines.

Information Retrieval · Query Expansion · knowledge graph

0 likes · 11 min read

Meituan Technology Team

May 21, 2020 · Artificial Intelligence

AIS 2020 Conference: Schedule and Speakers for Top NLP/AI/IR Papers

The AIS 2020 Conference, co‑hosted by the Beijing Academy of Artificial Intelligence and Meituan, showcased 74 top ACL, IJCAI and SIGIR papers across 15 sessions on NLP, AI and IR topics, streamed free online on May 23‑24 2020 with keynote speakers from leading Chinese universities.

AI · Information Retrieval · NLP

0 likes · 12 min read

DataFunTalk

May 16, 2020 · Artificial Intelligence

Exploring Search Matching Models and Their Applications in DiDi Food

This article introduces the background of search relevance, reviews three common matching model types—representation‑based, interaction‑based, and hybrid—describes their architectures such as DSSM, CDSSM, DRMM and DUET, and presents experimental results of these models on DiDi Food’s search system.

DiDi Food · Information Retrieval · Ranking

0 likes · 15 min read

Didi Tech

May 15, 2020 · Artificial Intelligence

Search Matching Models and Applications in DiDi Food

The article outlines DiDi Food’s search relevance challenge, defines semantic matching versus traditional keyword methods, describes the recall‑ranking pipeline, and reviews three families of deep matching models—representation‑based (e.g., DSSM), interaction‑based (e.g., DRMM) and hybrid (e.g., DUET)—including experimental results and a recruitment notice.

DiDi Food · Information Retrieval · deep matching

0 likes · 16 min read

DataFunTalk

May 7, 2020 · Artificial Intelligence

Comprehensive Overview of Query Understanding in Search Engines

Query understanding (QU) involves lexical, syntactic, and semantic analysis of user queries to enable effective search recall and ranking, covering modules such as preprocessing, correction, expansion, segmentation, intent detection, term importance, and guidance, with detailed discussion of algorithms, models, and system architecture.

Information Retrieval · NLP · Search Engine

0 likes · 51 min read

Meituan Technology Team

Mar 24, 2020 · Artificial Intelligence

Citation Intent Recognition: Meituan's Winning Solution in WSDM Cup 2020

Meituan’s Search & NLP team, together with two universities, won the WSDM Cup 2020 Citation Intent Recognition task by building a multimodal retrieval‑ranking pipeline that merges semantic, spatial and axiomatic recall models with pairwise BERT and LightGBM ranking, achieving the highest MAP@3 and now powering Meituan’s QA, FAQ and core search systems.

BERT · Citation Intent · Information Retrieval

0 likes · 14 min read

DataFunTalk

Feb 3, 2020 · Artificial Intelligence

Alibaba Entertainment Search Algorithm Practice and Insights – Video Search Case Study with Youku

In this live session, Alibaba Entertainment’s senior algorithm expert discusses Youku’s video search business, relevance and ranking models, multimodal search challenges, and practical AI techniques, giving attendees a comprehensive view of modern video retrieval systems and their implementation.

AI · Information Retrieval · Multimodal

0 likes · 3 min read

Java High-Performance Architecture

Jan 20, 2020 · Fundamentals

How Inverted Indexes Power Fast Full-Text Search

This article explains what an inverted index is, why it’s essential for full‑text search, how it is built and queried, and common token transformations such as stop‑word removal, lemmatization, and stemming.

Full-Text Search · Information Retrieval · Search Engine

0 likes · 4 min read

DataFunTalk

Dec 30, 2019 · Artificial Intelligence

Technical Trends in Recommendation Systems: From Retrieval to Re‑ranking

This article surveys recent advances in recommendation system technology, covering the evolution from a two‑stage recall‑ranking pipeline to a four‑stage architecture, and detailing emerging trends in model‑based recall, user‑behavior sequence modeling, knowledge‑graph integration, graph neural networks, advanced ranking models, multi‑objective optimization, multimodal fusion, and listwise re‑ranking.

Graph Neural Networks · Information Retrieval · Multi-Task Learning

0 likes · 45 min read

Architecture Digest

Nov 15, 2019 · Big Data

Design and Key Technologies of the 360 Search Engine for Billion‑Scale Web Retrieval

This article explains how 360 Search handles billions of daily crawls and hundred‑billion‑scale indexing by describing its overall architecture, core modules such as offline indexing and online retrieval, query analysis, relevance scoring, and the engineering techniques that enable efficient large‑scale web search.

Information Retrieval · Ranking · Search Engine

0 likes · 22 min read

vivo Internet Technology

Nov 12, 2019 · Artificial Intelligence

Elasticsearch Retrieval Optimization in Gitee: Interview with Chen Xin

In an interview, Gitee’s chief architect Chen Xin explains why Elasticsearch was chosen for code search, outlines how combining search with NLP can both aid semantic understanding and enrich repository queries, and shares his views on the platform’s fast‑evolving ecosystem and upcoming community meetup.

Code search · Elasticsearch · Gitee

0 likes · 4 min read

DataFunTalk

Nov 11, 2019 · Artificial Intelligence

Knowledge Graph‑Based Question Answering in Meituan’s Intelligent Interaction Scenarios

This talk presents how Meituan leverages knowledge‑graph QA (KBQA) across restricted and complex smart‑interaction scenarios, compares semantic‑parsing and information‑retrieval approaches, introduces three‑layer concept nodes to handle entity explosion and non‑connected queries, and outlines architectural refinements for multi‑turn dialogue integration.

AI · Dialogue Systems · Information Retrieval

0 likes · 14 min read

JD Tech Talk

Oct 30, 2019 · Artificial Intelligence

Solution Overview for the Scientific Paper Recommendation Matching Competition

This article presents a comprehensive solution to a competition that requires matching description paragraphs with the three most relevant papers from a 200,000‑paper corpus, detailing background, task definition, evaluation metrics, modeling strategy, and core algorithms such as SIF, InferSent, Bi‑LSTM, and BERT.

BERT · Information Retrieval · NLP

0 likes · 9 min read

DataFunTalk

Sep 5, 2019 · Artificial Intelligence

Baidu Semantic Computing: ERNIE, SimNet, and Future Directions in Natural Language Processing

This article reviews Baidu's research on semantic computing, covering the evolution of semantic representation, the development and evaluation of the ERNIE and SimNet models, their industrial applications, model compression techniques, and outlines future research priorities in multilingual and multimodal semantic understanding.

Deep Learning · ERNIE · Information Retrieval

0 likes · 12 min read

Alibaba Cloud Developer

Jul 31, 2019 · Artificial Intelligence

How Alibaba’s Enriched BERT Set a New Record in Open‑Domain QA

Alibaba’s AI team introduced a multi‑stage, document‑ranking and paragraph‑ranking system built on an Enriched BERT model that topped the MS MARCO reading‑comprehension leaderboard, surpassing previous state‑of‑the‑art methods and even human performance on open‑domain QA tasks.

Alibaba AI · Enriched BERT · Information Retrieval

0 likes · 5 min read

AntTech

Jul 21, 2019 · Artificial Intelligence

Alipay’s SIGIR 2019 Papers: Reinforcement Learning for User Intent Prediction and Unsupervised QUEST for Complex Question Answering

At SIGIR 2019 in Paris, Alipay presented two AI research papers—one applying reinforcement learning to predict user intent in customer‑service bots and another introducing the unsupervised QUEST method that builds noisy quasi‑knowledge graphs for answering complex multi‑document questions.

AI · Information Retrieval · knowledge graph

0 likes · 5 min read

DataFunTalk

Jul 30, 2018 · Artificial Intelligence

Enhancing Automated Process Services with Multi‑Turn Dialogue: Insights from Chatopera’s NLP Solutions

The article presents a technical overview of Chatopera’s multi‑turn dialogue platform, covering language model fundamentals, Chinese segmentation, word embeddings, information retrieval, and open‑source tools, while illustrating how these AI techniques enable low‑cost, scalable enterprise chatbot solutions.

Information Retrieval · NLP · multi-turn dialogue

0 likes · 11 min read

IT Xianyu

Apr 17, 2018 · Fundamentals

The Origins of Internet Search: Archie and WAIS

Archie, created in 1989–1990 at McGill University by Alan Emtage with Bill Heelan and Peter Deutsch, was the first internet search tool and indexed FTP sites, while WAIS, developed by Brewster Kahle at Thinking Machines, extended searchable databases worldwide; both exposed early challenges in handling heavy communication traffic and providing user-friendly interfaces.

Archie · Information Retrieval · Internet History

0 likes · 3 min read

Java Captain

Mar 29, 2018 · Fundamentals

Understanding Full‑Text Search and Indexing with Lucene: Core Concepts and Processes

This article explains the fundamentals of full‑text search, describing how Lucene builds and uses inverted indexes, the steps of tokenization, linguistic processing, term weighting, and relevance scoring, and illustrates these concepts with examples, tables, and diagrams.

Full-Text Search · Indexing · Information Retrieval

0 likes · 21 min read

Architecture Digest

Feb 1, 2018 · Fundamentals

How Search Engines Work: Building Inverted Indexes

This article explains the core of search engine technology by describing what an inverted index is, how it is built using single‑pass memory and multi‑way merge methods, how indexes can be partitioned and incrementally updated, and how Hadoop can be used for large‑scale indexing.

Big Data · Hadoop · Indexing

0 likes · 10 min read