Tagged articles

Semantic Retrieval

20 articles · Page 1 of 1

Jul 5, 2026 · Artificial Intelligence

Eliminating Fragmented Memory with Mandol: An Open‑Source Lightweight In‑Memory Agent System

Mandol tackles the fragmented memory problem of LLM agents by unifying representation, storage, and retrieval in a memory‑native architecture. Benchmarked on LoCoMo and LongMemEval, it achieves up to 92.21% accuracy and 5× lower latency, and it runs efficiently on consumer‑grade hardware without external databases.

Agent Memory · Hierarchical Memory · LLM

0 likes · 14 min read

AI Architecture Hub

Jun 5, 2026 · Artificial Intelligence

Memory Mechanisms in Agent Harness: Current Landscape and Challenges

The article surveys memory mechanisms across major Agent Harness frameworks, classifies three memory types, evaluates each system’s implementation, highlights benchmark shortcomings, and presents Mem0 as a unified solution that overcomes capacity, retrieval, and isolation limitations.

AI agents · Agent Harness · Benchmark

0 likes · 19 min read

Alibaba Cloud Infrastructure

May 26, 2026 · Cloud Computing

How OSS Vector Bucket Eliminates Needle‑in‑a‑Haystack Searches for Media Asset Platforms

The article examines how Alibaba Cloud OSS Vector Bucket addresses the scattered data, high costs, and inefficient retrieval of massive multimodal media asset platforms by unifying storage, providing semantic vector search, and cutting operational expenses by up to 95%.

Multimodal Data · OSS Vector Bucket · Semantic Retrieval

0 likes · 9 min read

Data Party THU

May 17, 2026 · Artificial Intelligence

Personalizing AI Agents: Memory, Rolling Context, and Advanced Retrieval Techniques

The article explains how AI agents use memory to retain conversation context, why sending the full history to large language models is inefficient, and presents rolling context windows, inverted‑index pruning, semantic embedding retrieval, and GraphRAG as complementary strategies to build more accurate and personalized agents.

AI memory · GraphRAG · LLM Optimization

0 likes · 10 min read

DeepHub IMBA

May 14, 2026 · Artificial Intelligence

How HyDE Transforms RAG Retrieval from Keyword Matching to Intent Understanding

The article explains how Hypothetical Document Embeddings (HyDE) improve Retrieval‑Augmented Generation by generating a synthetic answer before vector search, allowing the system to embed richer semantic intent rather than relying on shallow keyword similarity, and provides a step‑by‑step implementation using LangChain.

HyDE · LLM · LangChain

0 likes · 6 min read

Alibaba Cloud Developer

Apr 23, 2026 · Artificial Intelligence

From Data‑Driven Insights to a Decision Center: Ontological Engineering with PolarDB‑PG

The article explains how Ontology—an abstract model of objects, relationships, and actions—can be built on PolarDB‑PG’s intelligent engine to overcome semantic ambiguity and logical hallucination in enterprise LLM agents, describing a three‑layer architecture, OAG retrieval, automatic modeling, fine‑grained permission control, and real‑world supply‑chain use cases.

AI Agent · Enterprise AI · Knowledge Graph

0 likes · 13 min read

Tech Freedom Circle

Jan 5, 2026 · Artificial Intelligence

A Three‑Step Guide to Mastering RAG Semantic‑Loss Interview Questions

RAG (Retrieval‑Augmented Generation) is a hot interview topic, and many candidates stumble on semantic‑loss issues; this article dissects a real JD interview case, identifies three core shortcomings, and presents a three‑step technical solution—structure restoration, semantic splitting, and hybrid retrieval—plus a ready‑to‑use answer template.

AI interview · Document Parsing · Hybrid Search

0 likes · 25 min read

PaperAgent

Dec 18, 2025 · Artificial Intelligence

Can Ontology‑Aware KG‑RAG Double Table QA Performance on Industrial Standards?

This article presents an ontology‑aware knowledge‑graph RAG framework that transforms complex, hierarchical industrial standard documents into a graph of sections, atomic propositions, and refined triples, achieving nearly double F1 scores on table‑based QA tasks and robust performance on long documents.

Knowledge Graph · LLM · Ontology

0 likes · 6 min read

Yiche Technology

Dec 3, 2025 · Artificial Intelligence

How Milvus Powered a Scalable AI Assistant for Car Queries with Vector Search

This article details how an automotive AI assistant migrated from keyword matching to a Milvus‑based vector retrieval system, overcoming semantic gaps, scaling to millions of daily queries, optimizing indexing, introducing multi‑vector and sparse‑vector search, and building a real‑time RAG pipeline with Flink.

AI assistant · Milvus · RAG

0 likes · 12 min read

Amazon Cloud Developers

Sep 16, 2025 · Artificial Intelligence

Elegant Solution to Prompt Bloat: Semantic Retrieval of Tools for Efficient LLM Inference

The article explains how the limited context window of large language models causes prompt bloat when many tool descriptions are embedded, and presents the RAG‑MCP architecture that stores tool metadata in a vector database, uses semantic retrieval to select only the most relevant tools, dramatically shortens prompts, and improves inference speed and tool‑call accuracy.

Amazon Bedrock · LLM · MCP

0 likes · 25 min read

AI Large Model Application Practice

Jul 29, 2025 · Artificial Intelligence

8 Memory Strategies for AI Agents: From Full Recall to Vector Stores

The article examines eight common AI memory techniques—from simple full‑history retention to sophisticated vector‑store and knowledge‑graph approaches—detailing their principles, Python‑style implementations, advantages, drawbacks, and ideal application scenarios for large‑language‑model agents in production environments.

AI memory · Knowledge Graph · LLM Context Management

0 likes · 23 min read

Data Thinking Notes

Jul 20, 2025 · Artificial Intelligence

Mastering Context Engineering: Boost LLM Performance with Advanced Techniques

The article introduces Context Engineering, a new discipline for optimizing large language model inputs. It discusses expanded context windows, compares the discipline with prompt engineering, outlines core techniques such as information organization, dynamic management, and semantic retrieval, and offers practical applications and recommendations for improving AI performance across domains.

Large Language Models · Prompt engineering · Semantic Retrieval

0 likes · 11 min read

Zhihu Tech Column

Oct 10, 2024 · Artificial Intelligence

Massive Multi-Label Text Classification via Semantic Retrieval and Large AI Model

This article presents a method for massive multi-label text classification on Zhihu content by combining a semantic retrieval model with a proprietary large AI model, detailing the challenges of large label spaces, model architecture, loss optimization, and experimental results showing significant accuracy gains.

BGE · Large Language Model · Semantic Retrieval

0 likes · 16 min read

Xiaohongshu Tech REDtech

Aug 1, 2024 · Artificial Intelligence

Xiaohongshu Search Advertising Recall: Practices, Metrics, and Large‑Model Integration

Xiaohongshu’s search advertising recall system evolves from keyword bidding to BERT‑based vector retrieval and LLM‑enhanced query rewriting, using dual semantic and efficiency models, water‑level metrics, and GPU‑accelerated engineering to achieve 80% click coverage, 60% conversion coverage, and a 5% CPM lift.

Artificial Intelligence · Large Language Models · Semantic Retrieval

0 likes · 33 min read

Baidu Geek Talk

Nov 9, 2023 · Artificial Intelligence

Deep Learning Model Architecture Evolution in Baidu Search

The article chronicles Baidu Search’s Model Architecture Group’s evolution of deep‑learning‑driven search, detailing the shift from inverted‑index to semantic vector indexing, the use of transformer‑based models for text and image queries, large‑scale offline/online pipelines, and extensive GPU‑centric optimizations such as pruning, quantization and distillation, all aimed at delivering precise, cost‑effective results to hundreds of millions of users.

ERNIE · GPU inference · Model Optimization

0 likes · 14 min read

Baidu Geek Talk

Mar 23, 2023 · Artificial Intelligence

Advanced Image Search in Baidu Netdisk: Semantic Vector Retrieval and Multi-Modal Fusion

Baidu Netdisk’s new image search combines ERNIE‑ViL‑based semantic vectors, cross‑modal matching and metadata such as timestamps, GPS and facial tags, using LSH‑optimized indexing to let users find specific photos among billions with natural‑language queries, delivering faster, more accurate results without manual tagging.

ERNIE-ViL · LSH hashing · Multimodal AI

0 likes · 11 min read

Baidu Geek Talk

Oct 11, 2021 · Backend Development

Baidu Search Closed-Door Technical Symposium

The Baidu Search Closed‑Door Technical Symposium, the first core technical forum hosted by Baidu’s Search Architecture Department, brings senior engineers and junior backend developers together to discuss semantic retrieval, data‑driven big‑data processing, and vertical search offline architecture, while offering limited‑capacity sessions, networking gifts, and travel subsidies.

Backend Development · Baidu Search · Semantic Retrieval

0 likes · 6 min read

DataFunTalk

Jul 2, 2021 · Artificial Intelligence

Vector Retrieval for Community Forum Search Using Milvus at Dingxiangyuan

This article describes how Dingxiangyuan's algorithm team adopted Milvus for distributed vector indexing to improve semantic search in their community forum, detailing the background, retrieval workflow, various embedding models—including Bi‑Encoder, Spherical Embedding, and Knowledge Embedding—and summarizing the benefits and future applications.

Embedding · Milvus · NLP

0 likes · 10 min read

DataFunTalk

Jan 15, 2021 · Artificial Intelligence

Zhihu Search Text Relevance Evolution and BERT Knowledge Distillation Practices

This talk by Zhihu search algorithm engineer Shen Zhan details the evolution of text relevance models from TF‑IDF/BM25 to deep semantic matching and BERT, explains the challenges of deploying BERT at scale, and describes practical knowledge‑distillation techniques that improve both online latency and offline storage while maintaining search quality.

BERT · Knowledge Distillation · Semantic Retrieval

0 likes · 14 min read

DataFunTalk

Jan 2, 2020 · Artificial Intelligence

Improving Zhihu Search: Query Understanding, Term Weighting, Synonym Expansion, Query Rewriting, and Semantic Retrieval

This article details Zhihu's search engineering advances over the past year, covering long‑tail query challenges, term‑weight calculation, synonym expansion, query rewriting with translation models and reinforcement learning, and semantic retrieval using BERT‑based embeddings, while outlining future research directions.

NLP · Query Rewriting · Search

0 likes · 14 min read