Tagged articles
106 articles
Page 1 of 2
DeepHub IMBA
Apr 30, 2026 · Artificial Intelligence

Why Real RAG Systems Need Both BM25 and Vector Search

The article analyzes how BM25 excels at exact token matching while vector embeddings capture semantic intent, explains their distinct failure modes, and shows that a hybrid retriever—combined with metadata filtering, proper chunking, and reciprocal rank fusion—delivers the most reliable results for RAG pipelines.

BM25 · Embedding · Hybrid Retrieval
0 likes · 17 min read
PaperAgent
Apr 27, 2026 · Artificial Intelligence

A Comprehensive Review of Modern LLM Agent Memory Frameworks

The article surveys recent LLM‑based agent memory research, presenting a unified framework that breaks memory systems into four components, detailing their design choices, experimental evaluation on LOCOMO and LONGMEMEVAL, key findings, and a new low‑token SOTA architecture.

Agent Memory · LLM · Memory Management
0 likes · 8 min read
AI Explorer
Apr 22, 2026 · Artificial Intelligence

How AI‑Powered TrendRadar Provides a Private, Automated Info Radar to Cut Through Noise

TrendRadar, an open‑source Python project with over 54,000 GitHub stars, combines multi‑platform aggregation, large‑model AI filtering, sentiment analysis, and multi‑channel push to deliver a private, Docker‑deployable information radar that lets users define keywords and receive concise, translated summaries in seconds.

AI · Docker · Sentiment Analysis
0 likes · 6 min read
AI Engineer Programming
Apr 8, 2026 · Artificial Intelligence

TF‑IDF vs BM25: Statistical Foundations of Text Retrieval for RAG

The article explains how TF‑IDF and BM25 compute term importance, compares their strengths and weaknesses, and shows how these sparse retrieval methods integrate with dense retrieval techniques such as DPR, SPLADE, and ColBERT in Retrieval‑Augmented Generation systems, concluding with a hybrid retrieval decision matrix.

BM25 · Hybrid Retrieval · RAG
0 likes · 14 min read
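The BM25 term weighting named in the summary above can be sketched in a few lines. This is a minimal illustration and not code from the article: the function name `bm25_scores`, the whitespace tokenizer, the sample documents, and the defaults `k1=1.5` and `b=0.75` are all assumptions, and the IDF uses the common "+1 inside the log" variant so that it never goes negative.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a basic BM25 formula.

    Assumptions: lowercase whitespace tokenization, non-negative IDF variant.
    """
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in tokenized for t in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer documents need more hits per term.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical corpus for illustration only.
docs = [
    "error code E1234 in payment service",
    "how to fix payment failures",
    "weather today is sunny",
]
scores = bm25_scores("E1234 payment", docs)
```

The first document matches both query terms, including the rare exact token, so it scores highest. The third document shares no terms with the query and scores zero.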
AI Explorer
Apr 2, 2026 · Artificial Intelligence

AI Agent Skill for Global Hot‑Topic Tracking and Data‑Driven Insights

The open‑source /last30days AI skill aggregates and analyzes recent hot content from Reddit, X, YouTube, Hacker News, Bluesky, Polymarket, Instagram Reels and TikTok, applying a multi‑signal quality ranking and data‑driven narrative to deliver structured, citation‑rich briefings that can be integrated into Claude Code or other workflows.

AI Agent · Claude Code · Polymarket
0 likes · 7 min read
Wu Shixiong's Large Model Academy
Apr 2, 2026 · Artificial Intelligence

How Smart Chunk Splitting Boosts RAG Recall from 67% to 91%

This article examines the critical role of chunk splitting in Retrieval‑Augmented Generation systems, comparing three generations of methods—from fixed‑size token cuts to sentence‑aware and semantic‑aware strategies—showing how refined chunking, overlap tuning, and metadata design raise Recall@5 from 0.67 to 0.91 while addressing table, list, and long‑section challenges.

LLM · RAG · chunking
0 likes · 24 min read
Wu Shixiong's Large Model Academy
Mar 26, 2026 · Artificial Intelligence

Why Hybrid Retrieval Beats Pure Vector Search: BM25, RRF, and Real‑World Gains

This article explains why combining BM25 with dense vector search using Reciprocal Rank Fusion (RRF) improves recall for both exact‑term and semantic queries in a financial‑insurance document corpus, details the underlying algorithms, parameter choices such as k=60, provides Python implementations, and shows measurable performance gains in production.

BM25 · FAISS · Hybrid Retrieval
0 likes · 28 min read
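The Reciprocal Rank Fusion step named in the summary above, with k=60, can be sketched as follows. This is an illustrative sketch rather than the article's implementation: the function name `reciprocal_rank_fusion`, the document IDs, and the two input rankings are hypothetical. Each document receives 1 / (k + rank) from every list it appears in, and the contributions are summed.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each list contributes 1 / (k + rank) per document, with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from a BM25 retriever and a vector retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
# fused == ["doc_a", "doc_c", "doc_b", "doc_d"]
```

Because only ranks are used, the BM25 and cosine scores never need to be normalized onto a common scale, which is the usual argument for RRF over weighted score sums.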
o-ai.tech
Mar 16, 2026 · Industry Insights

Your AI Answers Could Be Shaped by Paid Brand Editing

Brands are increasingly paying to embed favorable content on platforms like Zhihu and Xiaohongshu, a practice dubbed Generative Engine Optimization (GEO), which manipulates the information AI retrieves, making many AI-generated product recommendations subtly biased without any disclosure.

AI bias · GEO · Generative Engine Optimization
0 likes · 8 min read
PaperAgent
Feb 6, 2026 · Artificial Intelligence

How xMemory Cuts Tokens by 30% While Boosting Agent QA Scores Over 10 Points

The paper introduces xMemory, a hierarchical "split‑aggregate‑retrieve" framework that reduces token usage by up to 30% and improves QA performance by more than 10 points in long‑range agent conversations, outperforming traditional RAG across multiple LLMs.

Agent Memory · Hierarchical Retrieval · LLM
0 likes · 8 min read
JD Cloud Developers
Feb 4, 2026 · Artificial Intelligence

How Deep Research Transforms LLMs into Autonomous AI Researchers

This article examines Deep Research, an AI system that adds autonomous planning and deep reasoning to large language models, enabling them to browse the web, perform long‑chain reasoning, and generate professional, citation‑rich reports for complex tasks such as industry trend analysis and technical competitive research.

AI research · Autonomous Agents · LLM
0 likes · 22 min read
JD Tech Talk
Feb 4, 2026 · Artificial Intelligence

How Deep Research Turns LLMs into Autonomous AI Researchers

This article explains the background, core features, underlying ReAct‑based architecture, and engineering solutions of Deep Research—a system that equips large language models with autonomous planning, long‑chain reasoning, and professional report generation to tackle complex information‑intensive tasks.

AI research · Autonomous Agents · LLM
0 likes · 21 min read
PaperAgent
Jan 9, 2026 · Artificial Intelligence

Why Traditional RAG Breaks the Chain and How SentGraph Fixes It

The article explains why traditional retrieval‑augmented generation fails in multi‑hop scenarios due to overly large chunks, introduces SentGraph’s sentence‑level graph that trims retrieval units and encodes logical relations, details offline construction and online inference steps, and shows experimental gains and remaining limitations.

LLM · Multi-hop QA · RAG
0 likes · 7 min read
JD Cloud Developers
Nov 21, 2025 · Artificial Intelligence

Why Chunking Strategy Makes or Breaks RAG Performance

This article explains how different chunking methods—fixed size, semantic, recursive, document‑based, agent‑driven, sentence‑level, and paragraph‑level—affect Retrieval‑Augmented Generation, offering practical guidelines, metrics, and optimization tips for real‑world deployments.

AI · RAG · chunking
0 likes · 9 min read
Xuanwu Backend Tech Stack
Oct 22, 2025 · Artificial Intelligence

How Rerank Transforms Retrieval‑Augmented Generation for Accurate AI Answers

This article explains the limitations of basic Retrieval‑Augmented Generation (RAG), introduces Rerank technology as a two‑step refinement process, compares dual‑encoder and cross‑encoder methods, and reviews popular Rerank models to help developers build more precise AI‑driven retrieval systems.

Artificial Intelligence · RAG · Rerank
0 likes · 10 min read
Alibaba Cloud Developer
Sep 1, 2025 · Artificial Intelligence

Mastering RAG: From Chunking to Hybrid Search for Better AI Retrieval

This article delves into the implementation details and optimization strategies of Retrieval‑Augmented Generation (RAG), covering document chunking, index enhancement, embedding, hybrid search, and re‑ranking, and provides practical code examples to help developers move from quick deployment to deep performance tuning.

AI · Embedding · Hybrid Search
0 likes · 19 min read
Baidu Geek Talk
Apr 7, 2025 · Artificial Intelligence

COBRA: Unified Generative Recommendations with Cascaded Sparse-Dense Representations

COBRA, Baidu’s new generative retrieval framework, unifies sparse ID generation and dense vector encoding through a cascaded architecture that first predicts hierarchical IDs then refines them into dense representations, achieving state‑of‑the‑art recall, NDCG and conversion gains across public benchmarks and large‑scale advertising production.

AI · COBRA · Generative Recommendation
0 likes · 13 min read
Alibaba Cloud Developer
Mar 25, 2025 · Artificial Intelligence

Boost Your AI Search Skills: Advanced Prompt & Query Tricks

This guide explains how to leverage AI tools with deep web‑search capabilities, covering site‑specific queries, wildcard operators, date ranges, Boolean logic, and effective prompt engineering techniques—including Socratic questioning and CRISPE framework—to improve information retrieval accuracy and efficiency across various domains.

AI · Large Language Models · Search Operators
0 likes · 8 min read
Architect
Mar 22, 2025 · Artificial Intelligence

Understanding and Mitigating Failures in Retrieval‑Augmented Generation (RAG) Systems

Retrieval‑augmented generation (RAG) combines external knowledge retrieval with large language models to improve answer accuracy, but it often suffers from retrieval mismatches, algorithmic flaws, chunking issues, embedding biases, inefficiencies, generation errors, reasoning limits, formatting problems, system‑level failures, and high resource costs. This article analyzes these failure modes and offers solutions.

AI reliability · LLM · RAG
0 likes · 32 min read
Baidu Tech Salon
Mar 21, 2025 · Artificial Intelligence

Semantic Embedding with Large Language Models: A Comprehensive Survey

This survey reviews the evolution of semantic embedding, from Word2vec and GloVe to BERT, Sentence‑BERT, and recent contrastive methods, then examines how large language models improve embeddings via synthetic data generation and backbone architectures. It details techniques such as contrastive prompting, in‑context learning, and knowledge distillation, and discusses resource, privacy, and interpretability challenges.

In-Context Learning · NLP · contrastive learning
0 likes · 27 min read
JD Tech
Feb 5, 2025 · Artificial Intelligence

Tech Insight: Highlights of Ten JD Retail Technology Papers Published in Top AI Conferences (2024)

Tech Insight presents concise overviews of ten JD retail technology papers accepted at top AI conferences in 2024, covering topics such as open‑vocabulary object detection, multi‑scenario ranking, diversity‑aware re‑ranking, a diversified product search dataset, semi‑supervised query classification, plug‑in CTR models, and methods to mitigate LLM hallucinations.

AI · Computer Vision · e‑commerce
0 likes · 17 min read
Baidu Tech Salon
Jan 21, 2025 · Artificial Intelligence

How AI Is Transforming Legal Research: Inside the YuanDian WenDa Smart Q&A Engine

Faced with billions of legal documents and the shortcomings of keyword search, Chinese legal professionals are turning to the AI‑powered YuanDian WenDa engine, which leverages Baidu's Wenxin model, a structured legal database, and prompt‑engineering to deliver trustworthy, citation‑rich answers and rapid research reports.

AI · LegalTech · Product Development
0 likes · 10 min read
JD Retail Technology
Jan 21, 2025 · Artificial Intelligence

Tech Insight: Selected JD Retail Technology Papers in Artificial Intelligence (2024)

Tech Insight highlights ten 2024 JD Retail Technology AI papers presented at top conferences—including CVPR, SIGIR, WWW, AAAI and IJCAI—that advance open‑vocabulary object detection, unified search‑recommendation, pre‑ranking consistency, diversity‑aware re‑ranking, a diversified product‑search dataset, graph‑based query classification, plug‑in CTR models, parallel ad‑ranking, trajectory‑based CTR stability, and task‑aware decoding for large language models.

Artificial Intelligence · CTR prediction · Computer Vision
0 likes · 20 min read
AsiaInfo Technology: New Tech Exploration
Dec 30, 2024 · Artificial Intelligence

How RAG Fusion Revolutionizes Information Retrieval: Mechanisms, Benefits, and Future Directions

This article examines RAG Fusion, a retrieval‑augmented generation technique that combines multi‑query generation, reciprocal rank fusion, and contextual relevance improvements to boost search accuracy, discusses its workflow, mathematical foundation, advantages, challenges, real‑world applications, and emerging research directions.

AI · RAG Fusion · Reciprocal Rank Fusion
0 likes · 15 min read
Baobao Algorithm Notes
Dec 18, 2024 · Artificial Intelligence

How STAR Enables Training‑Free Recommendations with Large Language Models

The article reviews the STAR framework, a training‑free recommendation approach that leverages large language model embeddings and collaborative co‑occurrence scores to retrieve and rank items, and evaluates its performance, hyper‑parameter effects, and ablation studies against existing LLM‑based recommender methods.

Artificial Intelligence · LLM · Recommendation Systems
0 likes · 10 min read
Alibaba Cloud Developer
Nov 18, 2024 · Artificial Intelligence

Solving Knowledge Challenges in Retrieval‑Augmented Generation: Practical Optimizations

This article shares a half‑year of hands‑on experience with Retrieval‑Augmented Generation, analyzing why simple RAG setups often feel unintelligent, identifying three core knowledge issues, and presenting concrete optimization strategies—including chunking, knowledge expansion, and tag‑based conflict resolution—to improve retrieval and generation performance in low‑resource environments.

AI · Large Language Models · RAG
0 likes · 25 min read
Aikesheng Open Source Community
Nov 12, 2024 · Artificial Intelligence

ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Large Language Models

ChatDBA is a conversational AI system built by Shanghai Aikesheng that employs large language models and Retrieval‑Augmented Generation to help database administrators diagnose faults, learn domain knowledge, and generate or optimize SQL, with a redesigned architecture that addresses early‑stage shortcomings and outlines future enhancements.

ChatDBA · Fault Diagnosis · Knowledge Base
0 likes · 10 min read
Baobao Algorithm Notes
Sep 10, 2024 · Artificial Intelligence

Do LLMs Silence Human Voices? Unveiling the ‘Spiral of Silence’ in Retrieval‑Augmented Generation

This article reviews the ACL 2024 paper that investigates how large language model‑generated text influences retrieval‑augmented generation pipelines, revealing short‑term retrieval gains but a long‑term “spiral of silence” that marginalizes human‑generated content and homogenizes open‑domain QA results.

AI Impact · LLM · Open Domain QA
0 likes · 9 min read
Xiaohongshu Tech REDtech
Jul 29, 2024 · Artificial Intelligence

Scaling Laws for Dense Retrieval: Empirical Study of Model Size, Training Data, and Annotation Quality

The award‑winning study shows that dense retrieval performance follows precise power‑law scaling with model size, training data quantity, and annotation quality, introduces contrast entropy for evaluation, validates joint scaling formulas on MS MARCO and T2Ranking, and uses cost models to guide budget‑optimal resource allocation.

Model Size · annotation quality · contrast entropy
0 likes · 13 min read
Meituan Technology Team
Jun 27, 2024 · Artificial Intelligence

Meituan Technical Team's Three Papers Accepted at SIGIR 2024: Ad Auction Integration, Federated Recommendation, and POI Recommendation

The article highlights three Meituan research papers accepted at SIGIR 2024—covering deep automated mechanism design for ad auction, a retrieval‑enhanced vertical federated recommendation framework, and disentangled contrastive hypergraph learning for next POI recommendation—and announces an online sharing event where the authors will present their work.

AI research · Ad Auction · Federated Recommendation
0 likes · 9 min read
Xiaohongshu Tech REDtech
Apr 28, 2024 · Artificial Intelligence

Generative Dense Retrieval: Memory Can Be a Burden

The paper introduces Generative Dense Retrieval (GDR), a two‑stage retrieval framework that first maps queries to memory‑efficient document‑cluster identifiers and then uses dense vectors to locate individual documents, achieving higher recall and better scalability than traditional generative retrieval while incurring modest latency and capacity trade‑offs.

Memory Mechanism · generative dense retrieval · information retrieval
0 likes · 13 min read
DataFunTalk
Mar 15, 2024 · Artificial Intelligence

Application of Agent Technology in Voice Assistant Scenarios

Senior algorithm engineer Qi Jianwei from Xiaomi presents a comprehensive overview of building a large‑model‑centric Agent framework for voice assistants, covering prompt design, information retrieval, RAG processes, and future optimization directions to enhance performance and stability.

Prompt engineering · Voice Assistant · agent
0 likes · 2 min read
Ops Development & AI Practice
Mar 14, 2024 · Artificial Intelligence

Do Vector Embeddings Offer the Same Consistency as Hash Functions?

While both vectorization and hashing are essential for handling large datasets, this article examines whether vector embeddings can match the deterministic consistency of hash functions, comparing their collision handling, data structure design implications, and suitability for retrieval and machine‑learning tasks.

AI · Consistency · Hashing
0 likes · 8 min read
Ops Development & AI Practice
Mar 13, 2024 · Artificial Intelligence

How Vector Retrieval Powers AI Model Training and Real-World Applications

Vector retrieval, based on converting data into high‑dimensional vectors and measuring similarity, enables fast, accurate search across massive datasets, supporting AI tasks such as search engines, recommendation, NLP, and computer vision, and plays a crucial role in large‑model training for data selection, anomaly detection, and model optimization.

AI training · Recommendation Systems · Vector Retrieval
0 likes · 6 min read
php Courses
Feb 18, 2024 · Backend Development

Implementing Information Retrieval and SEO with PHP

This article explains the fundamentals of information retrieval and search engine optimization and provides practical PHP code examples for keyword search, full‑text search, and common SEO techniques such as keyword, internal, and external link optimization.

SEO · Web Optimization · information retrieval
0 likes · 7 min read
政采云技术
Dec 19, 2023 · Backend Development

Principles and Simple Implementation of a Search Engine in Go

This article explains the fundamental concepts of search engine technology—including forward and inverted indexes, tokenizers, stop words, synonym handling, ranking algorithms, and NLP integration—and provides a concise Go implementation with code examples and performance testing.

Go · NLP · Tokenizer
0 likes · 21 min read
php Courses
Aug 31, 2023 · Backend Development

Implementing Information Retrieval and SEO with PHP

This article explains the fundamentals of information retrieval and search engine optimization, demonstrating how to implement keyword and full‑text search using PHP and MySQL, and presenting practical PHP techniques for keyword, internal, and external link optimization to improve website visibility.

SEO · backend-development · information retrieval
0 likes · 6 min read
JD Cloud Developers
Aug 22, 2023 · Artificial Intelligence

A Practical Guide to Recommendation System Architecture and Methods

This article provides a concise overview of recommendation systems, covering their definition, core framework of recall, ranking, and re‑ranking, various recall strategies including multi‑path and vector‑based methods, similarity calculations, and practical implementation details such as AB testing and code examples.

AB testing · Vector Embedding · information retrieval
0 likes · 14 min read
Architect
May 29, 2023 · Artificial Intelligence

Understanding Embeddings and Vector Databases for LLM Applications

This article explains what embeddings and vector databases are, how they are generated with models like OpenAI's Ada, why they enable semantic search and help overcome large language model token limits, and demonstrates a practical workflow for retrieving relevant document chunks using cosine similarity.

LLM · embeddings · information retrieval
0 likes · 7 min read
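The chunk-retrieval workflow described in the summary above, ranking document chunks by cosine similarity to a query embedding, can be sketched with plain Python lists. The three-dimensional vectors, chunk names, and the helper `top_k` are hypothetical stand-ins; real embeddings would come from an embedding model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, k=2):
    """Return the names of the k chunks most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Toy embeddings, made up for illustration.
chunks = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "return window": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
result = top_k(query, chunks)
# result == ["refund policy", "return window"]
```

Only the top-ranked chunks are then passed to the language model, which is how this approach works around prompt token limits.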
Baidu Geek Talk
Mar 13, 2023 · Artificial Intelligence

Recent Advances in Sparse and Dense Retrieval for Search Engines

The article surveys recent academic advances in both sparse inverted‑index and dense semantic retrieval for large‑scale search, highlighting key papers on decision frameworks, benchmarks, sparse lexical models, dual encoders, and hybrid techniques, while discussing challenges such as single‑vector limits and proposing multi‑view and hybrid solutions.

dense retrieval · information retrieval · pretraining
0 likes · 12 min read
DataFunTalk
Jan 18, 2023 · Artificial Intelligence

Search Relevance System Architecture and Practices in QQ Browser

This article presents the QQ Browser search relevance team's experience integrating QQ Browser and Sogou search systems, detailing business overview, relevance system evolution, algorithm architecture, evaluation metrics, deep semantic matching, relevance calibration, and model distillation techniques to improve search relevance performance.

Evaluation Metrics · information retrieval · model distillation
0 likes · 31 min read
Tencent Cloud Developer
Jan 9, 2023 · Artificial Intelligence

Search Relevance Architecture and Practices in QQ Browser

The QQ Browser search relevance team describes a unified, billion‑scale architecture that combines a main and vertical subsystem, a pyramid‑shaped ranking pipeline (recall, coarse, fine), a dedicated GPU‑accelerated relevance service, and hybrid semantic‑matching models (dual‑tower, BERT, matrix fusion) evaluated with offline and online metrics to deliver accurate, fresh, and authoritative results for diverse content and long‑tail queries.

Deep Learning · Evaluation Metrics · System Architecture
0 likes · 28 min read
IT Services Circle
Jan 9, 2023 · Fundamentals

11 Google Search Techniques to Find Information Faster

This article presents eleven practical Google search tricks—including keyword matching, exact phrases, site‑specific queries, file‑type filters, and time ranges—to help programmers and other users retrieve relevant information more efficiently and improve overall productivity.

Google · Search Tips · Web
0 likes · 6 min read
Su San Talks Tech
Jan 8, 2023 · Fundamentals

11 Powerful Google Search Tricks to Find Information Faster

Discover eleven practical Google search techniques—from using spaces, vertical bars, and quotes to applying wildcards, site filters, filetype limits, and time ranges—that help programmers and anyone else locate precise information quickly and efficiently.

Tips · google search · information retrieval
0 likes · 6 min read
Alimama Tech
Nov 9, 2022 · Artificial Intelligence

Graph-based Weakly Supervised Framework for Semantic Relevance Learning in E-commerce

The paper introduces a graph‑based weakly supervised contrastive learning framework that uses heterogeneous user‑behavior graphs, e‑commerce‑specific augmentations, and a hybrid fine‑tuning/transfer learning strategy to improve semantic relevance matching between queries and product titles, achieving significant gains on a large‑scale Taobao dataset.

Weak Supervision · contrastive learning · e‑commerce
0 likes · 12 min read
Meituan Technology Team
Jul 21, 2022 · Artificial Intelligence

Overview of Meituan Technical Team Papers Featured at ACM SIGIR 2022 and Related Works

The article highlights ten representative Meituan technical papers accepted at ACM SIGIR 2022, spanning personalized opinion tagging, cross‑domain sentiment classification, dialogue summarization transfer, universal retrieval, CTR prediction, image behavior modeling, and topic segmentation, each summarized with abstracts and download links for researchers.

Recommendation Systems · cross-domain learning · information retrieval
0 likes · 25 min read
Hulu Beijing
May 26, 2022 · Artificial Intelligence

Why Vector Retrieval Outperforms Keyword Search for Personalized Video Discovery

This article explains how modern video platforms combine traditional keyword retrieval with deep‑learning‑based vector retrieval, detailing model architectures, attention mechanisms, personalization features, offline experiments, and online A/B results that show significant improvements in recall, relevance, and user experience.

Deep Learning · Vector Retrieval · information retrieval
0 likes · 18 min read
Hulu Beijing
May 18, 2022 · Artificial Intelligence

How Hulu Optimizes Video Search for TV Remotes and Short Queries

This article examines Hulu's video search engine, highlighting challenges such as ensuring relevance beyond text matching, handling ultra‑short queries on TV remotes, addressing content gaps, and integrating AI‑driven query understanding, retrieval, and ranking to improve user experience.

Hulu · Query Understanding · information retrieval
0 likes · 7 min read
Alimama Tech
Apr 6, 2022 · Artificial Intelligence

Alibaba's Five Papers Accepted at SIGIR 2022

Alibaba’s research team had five papers accepted at the prestigious SIGIR 2022 conference in Madrid, covering innovations such as joint ad‑ranking and creative selection, personalized bundle generation, calibrated neural predictions, disentangled counterfactual regression, and cold‑start user recommendation, showcasing strong expertise in information retrieval and online advertising.

Calibration · Recommendation Systems · SIGIR 2022
0 likes · 8 min read
DataFunTalk
Mar 16, 2022 · Artificial Intelligence

A Survey of Entity Linking: Definitions, Methods, and Applications

This article provides a comprehensive overview of entity linking, detailing its definition, the two-stage pipeline of entity recognition and disambiguation, common methodologies such as candidate generation and ranking, advanced approaches, challenges like unlinkable mentions, and various applications in knowledge graphs, text mining, and question answering.

entity linking · information retrieval · natural language processing
0 likes · 15 min read
Baidu Geek Talk
Nov 29, 2021 · Artificial Intelligence

Pretrained Models for First-Stage Information Retrieval: A Comprehensive Review

This comprehensive review by Dr. Fan Yixing surveys how pretrained language models have transformed first‑stage information retrieval, tracing the shift from traditional term‑based methods to neural sparse, dense, and hybrid approaches, and discussing key challenges such as hard‑negative mining, joint indexing‑representation learning, and generative‑discriminative training.

Hybrid Retrieval · Neural IR · Sparse Retrieval
0 likes · 15 min read
ByteDance SE Lab
Oct 29, 2021 · Artificial Intelligence

What Is a Knowledge Graph? From Basics to Embedding Techniques

This article introduces knowledge graphs, defining them as semantic networks or multi‑relational graphs, explains entities and relations, compares RDF and graph‑database storage, outlines construction steps including entity extraction and ontology building, reviews embedding models like TransE/H/R/D, and explores applications in search, finance, recommendation, and language models.

AI · graph embedding · information retrieval
0 likes · 22 min read
DataFunTalk
Sep 24, 2021 · Artificial Intelligence

Intelligent Question Answering in QQ Browser Search Engine: KBQA, DeepQA, and IRQA

This article presents the architecture, techniques, and practical solutions behind intelligent question answering in QQ Browser's search engine, covering knowledge‑graph based QA (KBQA), machine‑reading‑comprehension QA (DeepQA), and information‑retrieval QA (IRQA), and discusses system design, model optimization, and future directions.

AI · information retrieval · knowledge graph
0 likes · 23 min read
DataFunTalk
Sep 3, 2021 · Artificial Intelligence

Construction and Application of an Interest Point Graph for Content Understanding in Information Feed Recommendation

This article explains how large‑scale UGC data is used to build a multi‑type interest point graph, describes the mining, hierarchical and associative relationship extraction methods, and demonstrates how the graph improves content understanding and recommendation accuracy while mitigating filter‑bubble effects.

Artificial Intelligence · Recommendation Systems · content understanding
0 likes · 25 min read
DataFunTalk
Aug 2, 2021 · Databases

From Text Search to Vector Search: Generalizing Unstructured Data Retrieval

The article explains why traditional text‑based search engines like ElasticSearch struggle with modern multimodal data, introduces vector databases that store implicit semantic embeddings, and proposes a generalized search architecture that decouples data‑to‑vector mapping from the engine while leveraging clustering or graph indexes for similarity search.

AI · Embedding · information retrieval
0 likes · 12 min read
iQIYI Technical Product Team
Jul 30, 2021 · Artificial Intelligence

iQIYI Search Ranking Algorithm Practice – NLP and Search Integration

At iQIYI’s iTech Conference, Zhang Zhigang detailed a full‑stack search ranking system that combines NLP‑driven query analysis, hierarchical indexing, multi‑stage coarse‑to‑fine ranking, Transformer‑based re‑ranking, sparse‑feature DNN enhancements and LIME/SE‑Block explainability, delivering measurable gains in CTR and NDCG for the platform’s video search.

NLP · iQIYI · information retrieval
0 likes · 20 min read
We-Design
May 31, 2021 · Product Management

Mastering Search Design: 5 Essential Stages for Better User Experiences

This article breaks down the evolving problem space of search and walks through its five core stages—request acquisition, parsing, matching, ranking, and result presentation—offering practical design decisions and best‑practice tips to create more effective search experiences.

Product Design · UI/UX · information retrieval
0 likes · 21 min read
DataFunSummit
Apr 8, 2021 · Artificial Intelligence

Evaluation Metrics and Methods for Recommendation Systems

This article explains the purpose, dimensions, and specific quantitative metrics—such as accuracy, surprise, diversity, RMSE, MAE, R‑squared, MAP, MRR, ROC and AUC—used to evaluate recommendation systems, covering user, platform, item, and system perspectives for practical AI deployments.

Evaluation Metrics · information retrieval
0 likes · 13 min read
58 Tech
Mar 29, 2021 · Artificial Intelligence

Deep Semantic Model Exploration and Application in 58 Search

This article presents a comprehensive overview of 58 Search's multi‑stage retrieval system, compares term‑match and semantic matching, details the design, training, and optimization of interactive, dual‑tower, and semi‑interactive BERT‑based semantic models, and discusses their practical deployment in ranking and recall stages.

AI · BERT · dual-tower
0 likes · 18 min read
DataFunTalk
Dec 14, 2020 · Artificial Intelligence

Query Expansion Techniques: Relevance Modeling vs. Generative Approaches and Future Directions

This article reviews current query expansion methods, contrasting relevance‑based models that rely on terms or entities with generative models that encode whole queries, discusses challenges of handling long and complex queries, and surveys recent research on encoding queries, session modeling, and multi‑task feature integration.

Generative Models · NLP · information retrieval
0 likes · 9 min read
DeWu Technology
Dec 4, 2020 · Fundamentals

Introduction to Search Engine Technology and Information Retrieval

The article surveys core search‑engine technology—document hierarchy, flat and vertical inverted indexes, query operators for building and merging score lists, and ranking models from Boolean and BM25 to language‑model approaches like Indri—providing a foundational overview of information retrieval.

BM25 · information retrieval · inverted index
0 likes · 14 min read
DataFunTalk
Nov 16, 2020 · Artificial Intelligence

Deep Semantic Relevance and Multimodal Video Search at Alibaba Entertainment

The presentation by Alibaba Entertainment's senior algorithm expert details the challenges of video search in the 4G/5G era and describes a comprehensive framework covering business overview, relevance and ranking, multimodal retrieval, deep semantic modeling, dataset construction, and practical deployment techniques.

Deep Learning · Multimodal · information retrieval
0 likes · 27 min read
DataFunTalk
Nov 4, 2020 · Artificial Intelligence

Intelligent E‑commerce Search: Architecture, Techniques, and Real‑World Impact

This article explores the evolution of e‑commerce search, detailing why search matters, the technical pipeline—including query preprocessing, entity and intent recognition, knowledge‑graph construction, recall, coarse and fine ranking—and demonstrates substantial performance gains through real‑world case studies.

AI · Search · e‑commerce
0 likes · 16 min read
ITPUB
Oct 23, 2020 · Fundamentals

How General Search Engines Work: From Crawlers to Ranking

This article provides a comprehensive overview of general search engines, covering their classification, core workflow, key modules such as web crawlers, content processing, storage, user query handling, ranking strategies like TF‑IDF and PageRank, as well as anti‑cheat measures and user intent understanding.

PageRank · TF-IDF · Web Crawling
0 likes · 16 min read
DataFunTalk
Oct 13, 2020 · Artificial Intelligence

Query Term Weighting Techniques for Medical Search: Statistical, Supervised, and Neural Approaches

This article reviews the challenges of short‑text query understanding in medical search and surveys a range of term‑weighting methods—including statistical models, supervised weighting, knowledge‑graph‑enhanced extraction, and neural network‑based approaches—highlighting their assumptions, implementations, and practical considerations for improving retrieval relevance.

information retrieval · knowledge graph · medical search
0 likes · 18 min read
Meituan Technology Team
Sep 24, 2020 · Artificial Intelligence

Meituan Search Ads Team's Solution for KDD Cup 2020 Multimodalities Recall Track

Meituan’s Search Ads team placed third in the KDD Cup 2020 Multimodalities Recall track by tackling training‑test distribution mismatch with diversified negative sampling and distillation learning, and improving text‑image matching via gated fully‑connected layers, bidirectional attention, and diversified fusion, then ensembling neural and tree models for strong NDCG gains later applied to their ad creative‑selection system.

Distillation · KDD Cup · Multimodal Learning
0 likes · 19 min read
DataFunTalk
Sep 16, 2020 · Artificial Intelligence

Hotspot Mining and Event Extraction in Tencent Information Flow: Methods, Framework, and Applications

This article presents Tencent's research on hotspot mining and event extraction for information flow, detailing the challenges of timeliness, comprehensiveness, and heat rationality, the combined use of time‑series analysis, topic detection, clustering, and dynamic‑time‑warping, and the resulting framework and its applications to text, image, and video recommendation.

Event Extraction · NLP · Time Series Analysis
0 likes · 17 min read
Swan Home Tech Team
Jul 13, 2020 · Backend Development

Design and Evolution of the DaJia App Search System

This article explains the motivations, requirements, and technical design of the DaJia app's search system, compares relational databases with Lucene‑based solutions, describes the inverted index mechanism, outlines common search workflows, and details the system's three iterative development phases and future improvement plans.

Backend · Elasticsearch · Search
0 likes · 12 min read
58 Tech
Jul 10, 2020 · Artificial Intelligence

Tag Mining for Used‑Car Business: NLP, Word2Vec, and Retrieval Pipeline

This article details the end‑to‑end process of extracting and leveraging tags for used‑car listings, covering data collection, segmentation, NLP‑based tokenization, word‑vector generation, tag‑library construction, and online retrieval flow to improve personalized recall and CTR.

NLP · Tagging · Word2Vec
0 likes · 19 min read
Programmer DD
Jul 10, 2020 · Fundamentals

How Search Engines Work: Inside Document and Query Processing

This article explains the core components of a search engine—document processing, query processing, and matching—detailing each step from indexing to ranking, and discusses the document features that influence relevance, providing a comprehensive overview of information retrieval fundamentals.

Document Processing · Query Processing · information retrieval
0 likes · 20 min read
Alibaba Cloud Developer
Jul 1, 2020 · Artificial Intelligence

Optimizing Search Timeliness: From Feature Extraction to Ranking Models

This article explains the concept of timeliness in search ranking, defines content and demand side metrics such as half‑life and time sensitivity, describes evaluation criteria, outlines feature extraction and labeling pipelines, and details the multi‑stage modeling, recall, and indexing strategies used to improve timely search results.

Ranking Models · feature engineering · information retrieval
0 likes · 27 min read
Architect
Jun 22, 2020 · Fundamentals

Fundamentals of Search Engine Architecture: Document Processing, Query Processing, Indexing, and Matching

This article explains the core components and processing steps of a search engine—document processor, query processor, indexing, and matching—detailing how documents are normalized, tokenized, filtered, weighted, and stored in an inverted index to support effective information retrieval.

Document Processing · Query Processing · information retrieval
0 likes · 20 min read
Youku Technology
Jun 8, 2020 · Artificial Intelligence

Video Search Technology and Multi-modal Applications at Alibaba Youku

Alibaba’s Youku video search platform uses a six-layer architecture (data extraction, technology integration, recall, relevance, ranking, and intent understanding) and leverages CV, NLP, knowledge graphs, and multi‑modal cues such as face, OCR, and audio recognition to overcome title‑mismatch, entity, and semantic challenges and deliver precise, diverse video retrieval.

information retrieval · machine learning · multi-modal learning
0 likes · 15 min read
DataFunTalk
May 21, 2020 · Artificial Intelligence

Query Expansion Techniques for Search Optimization: Models, Data Sources, and Practical Practices

This article reviews the factors influencing search results, explains why query expansion is crucial for improving recall, surveys various sources of expansion terms, describes probabilistic and translation‑based models, and offers practical recommendations for building effective, data‑driven query expansion pipelines.

information retrieval · knowledge graph · machine learning
0 likes · 11 min read
Meituan Technology Team
May 21, 2020 · Artificial Intelligence

AIS 2020 Conference: Schedule and Speakers for Top NLP/AI/IR Papers

The AIS 2020 Conference, co‑hosted by the Beijing Academy of Artificial Intelligence and Meituan, showcased 74 top ACL, IJCAI and SIGIR papers across 15 sessions on NLP, AI and IR topics, streamed free online on May 23‑24, 2020, with keynote speakers from leading Chinese universities.

AI · NLP · conference
0 likes · 12 min read
DataFunTalk
May 16, 2020 · Artificial Intelligence

Exploring Search Matching Models and Their Applications in DiDi Food

This article introduces the background of search relevance, reviews three common matching model types—representation‑based, interaction‑based, and hybrid—describes their architectures such as DSSM, CDSSM, DRMM and DUET, and presents experimental results of these models on DiDi Food’s search system.

DiDi Food · Neural Networks · deep matching
0 likes · 15 min read
Didi Tech
May 15, 2020 · Artificial Intelligence

Search Matching Models and Applications in DiDi Food

The article outlines DiDi Food’s search relevance challenge, defines semantic matching versus traditional keyword methods, describes the recall‑ranking pipeline, and reviews three families of deep matching models—representation‑based (e.g., DSSM), interaction‑based (e.g., DRMM) and hybrid (e.g., DUET)—including experimental results and a recruitment notice.

DiDi Food · deep matching · information retrieval
0 likes · 16 min read
DataFunTalk
May 7, 2020 · Artificial Intelligence

Comprehensive Overview of Query Understanding in Search Engines

Query understanding (QU) involves lexical, syntactic, and semantic analysis of user queries to enable effective search recall and ranking, covering modules such as preprocessing, correction, expansion, segmentation, intent detection, term importance, and guidance, with detailed discussion of algorithms, models, and system architecture.

NLP · Query Understanding · information retrieval
0 likes · 51 min read
Meituan Technology Team
Mar 24, 2020 · Artificial Intelligence

Citation Intent Recognition: Meituan's Winning Solution in WSDM Cup 2020

Meituan’s Search & NLP team, together with two universities, won the WSDM Cup 2020 Citation Intent Recognition task by building a multimodal retrieval‑ranking pipeline that merges semantic, spatial and axiomatic recall models with pairwise BERT and LightGBM ranking, achieving the highest MAP@3 and now powering Meituan’s QA, FAQ and core search systems.

BERT · Citation Intent · LightGBM
0 likes · 14 min read
DataFunTalk
Feb 3, 2020 · Artificial Intelligence

Alibaba Entertainment Search Algorithm Practice and Insights – Video Search Case Study with Youku

In this live session, a senior algorithm expert from Alibaba Entertainment discusses Youku’s video search business, relevance and ranking models, multimodal search challenges, and practical AI techniques, giving attendees a comprehensive view of modern video retrieval systems and their implementation.

AI · Multimodal · Search Algorithms
0 likes · 3 min read
DataFunTalk
Dec 30, 2019 · Artificial Intelligence

Technical Trends in Recommendation Systems: From Retrieval to Re‑ranking

This article surveys recent advances in recommendation system technology, covering the evolution from a two‑stage recall‑ranking pipeline to a four‑stage architecture, and detailing emerging trends in model‑based recall, user‑behavior sequence modeling, knowledge‑graph integration, graph neural networks, advanced ranking models, multi‑objective optimization, multimodal fusion, and listwise re‑ranking.

Recommendation Systems · graph neural networks · information retrieval
0 likes · 45 min read
Architecture Digest
Nov 15, 2019 · Big Data

Design and Key Technologies of the 360 Search Engine for Billion‑Scale Web Retrieval

This article explains how 360 Search handles billions of daily crawls and hundred‑billion‑scale indexing by describing its overall architecture, core modules such as offline indexing and online retrieval, query analysis, relevance scoring, and the engineering techniques that enable efficient large‑scale web search.

information retrieval · large-scale indexing · ranking
0 likes · 22 min read
vivo Internet Technology
Nov 12, 2019 · Artificial Intelligence

Elasticsearch Retrieval Optimization in Gitee: Interview with Chen Xin

In an interview, Gitee’s chief architect Chen Xin explains why Elasticsearch was chosen for code search, outlines how combining search with NLP can both aid semantic understanding and enrich repository queries, and shares his views on the platform’s fast‑evolving ecosystem and upcoming community meetup.

Elasticsearch · Gitee · NLP
0 likes · 4 min read
DataFunTalk
Nov 11, 2019 · Artificial Intelligence

Knowledge Graph‑Based Question Answering in Meituan’s Intelligent Interaction Scenarios

This talk presents how Meituan leverages knowledge‑graph QA (KBQA) across restricted and complex smart‑interaction scenarios, compares semantic‑parsing and information‑retrieval approaches, introduces three‑layer concept nodes to handle entity explosion and non‑connected queries, and outlines architectural refinements for multi‑turn dialogue integration.

AI · Dialogue Systems · Meituan
0 likes · 14 min read
JD Tech Talk
Oct 30, 2019 · Artificial Intelligence

Solution Overview for the Scientific Paper Recommendation Matching Competition

This article presents a comprehensive solution to a competition that requires matching description paragraphs with the three most relevant papers from a 200,000‑paper corpus, detailing background, task definition, evaluation metrics, modeling strategy, and core algorithms such as SIF, InferSent, Bi‑LSTM, and BERT.

BERT · NLP · competition
0 likes · 9 min read
DataFunTalk
Sep 5, 2019 · Artificial Intelligence

Baidu Semantic Computing: ERNIE, SimNet, and Future Directions in Natural Language Processing

This article reviews Baidu's research on semantic computing, covering the evolution of semantic representation, the development and evaluation of the ERNIE and SimNet models, their industrial applications, model compression techniques, and outlines future research priorities in multilingual and multimodal semantic understanding.

Deep Learning · Ernie · Semantic Representation
0 likes · 12 min read
Alibaba Cloud Developer
Jul 31, 2019 · Artificial Intelligence

How Alibaba’s Enriched BERT Set a New Record in Open‑Domain QA

Alibaba’s AI team introduced a multi‑stage, document‑ranking and paragraph‑ranking system built on an Enriched BERT model that topped the MS MARCO reading‑comprehension leaderboard, surpassing previous state‑of‑the‑art methods and even human performance on open‑domain QA tasks.

Alibaba AI · Enriched BERT · Open Domain QA
0 likes · 5 min read
AntTech
Jul 21, 2019 · Artificial Intelligence

Alipay’s SIGIR 2019 Papers: Reinforcement Learning for User Intent Prediction and Unsupervised QUEST for Complex Question Answering

At SIGIR 2019 in Paris, Alipay presented two AI research papers—one applying reinforcement learning to predict user intent in customer‑service bots and another introducing the unsupervised QUEST method that builds noisy quasi‑knowledge graphs for answering complex multi‑document questions.

AI · Reinforcement Learning · Unsupervised Learning
0 likes · 5 min read
DataFunTalk
Jul 30, 2018 · Artificial Intelligence

Enhancing Automated Process Services with Multi‑Turn Dialogue: Insights from Chatopera’s NLP Solutions

The article presents a technical overview of Chatopera’s multi‑turn dialogue platform, covering language model fundamentals, Chinese segmentation, word embeddings, information retrieval, and open‑source tools, while illustrating how these AI techniques enable low‑cost, scalable enterprise chatbot solutions.

NLP · information retrieval · multi-turn dialogue
0 likes · 11 min read
IT Xianyu
Apr 17, 2018 · Fundamentals

The Origins of Internet Search: Archie and WAIS

Archie, created in 1989 by Peter Deutsch at McGill University, was the first internet search tool that indexed FTP sites, while WAIS, developed by Brewster Kahle at Thinking Machines, extended searchable databases worldwide, both highlighting early challenges in managing large communication traffic and user-friendly interfaces.

Archie · Internet History · Networking
0 likes · 3 min read
Architecture Digest
Feb 1, 2018 · Fundamentals

How Search Engines Work: Building Inverted Indexes

This article explains the core of search engine technology by describing what an inverted index is, how it is built using single‑pass memory and multi‑way merge methods, how indexes can be partitioned and incrementally updated, and how Hadoop can be used for large‑scale indexing.

Big Data · Hadoop · indexing
0 likes · 10 min read
21CTO
Oct 12, 2017 · Artificial Intelligence

How Advanced Autocomplete Algorithms Boost Search Experience

This article explains the principles, algorithms, and practical challenges of search autocomplete (query suggestion), covering popularity‑based models, time‑sensitive methods, user‑aware and context‑aware approaches, data pipelines, indexing, ranking, personalization, and evaluation techniques used in e‑commerce search systems.

autocomplete · e‑commerce · information retrieval
0 likes · 15 min read
Baixing.com Technical Team
Sep 11, 2017 · Artificial Intelligence

How Do Search Engines Decode User Intent? Exploring Query Extension Techniques

This article explains how modern search engines identify precise and broad user intents, examines real‑world query examples, and details extension modules such as synonym, pinyin, and correction that enhance query understanding using algorithms like Aho‑Corasick, Hidden Markov Models, and Levenshtein distance.

Search · information retrieval · natural language processing
0 likes · 10 min read
Alibaba Cloud Developer
Aug 7, 2017 · Artificial Intelligence

Probabilistic Pair Recommendations & IRGAN: Boosting E‑commerce Click‑Through

This article summarizes two SIGIR 2017 papers: one introduces a probabilistic latent‑class model for shopping‑pair push recommendations that improves e‑commerce click‑through rates by leveraging co‑purchase and view‑then‑purchase graphs, and the other presents IRGAN, a GAN‑based framework that unifies generative and discriminative information‑retrieval models, achieving state‑of‑the‑art results across web search, recommendation, and QA tasks.

GAN · e‑commerce · information retrieval
0 likes · 9 min read