How Agentic Architectures Power the Next‑Gen Recommendation and Search Systems

This article summarizes a technical ebook that analyzes the evolution of recommendation and search systems—from deep‑learning models to large‑language‑model agents—detailing multi‑agent RAG architectures, Huawei’s KAR knowledge adapters, Baidu’s generative ranking (GRAB), Elasticsearch vector search, and performance results such as a 1.5% AUC lift and GPU‑accelerated throughput gains.

DataFunSummit

The ebook "Agentic Architecture and Practice: Building the Next‑Generation Recommendation and Search System" compiles a series of technical analyses on how modern AI agents reshape recommendation and search pipelines under high‑concurrency, multi‑modal, and multi‑hop query scenarios.

Agentic RAG evolution: The authors trace the architecture from a single agent to a multi‑agent system, describing how planning, retrieval, and generation modules cooperate to interpret complex intents. They detail a mixed‑recall pipeline that combines vector, textual, database, and graph sources to improve coverage and accuracy, and they compare GPU‑accelerated indexing with CPU‑only indexing, reporting quantified throughput improvements.
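As an illustration of how a mixed‑recall pipeline might merge results from several sources, here is a minimal sketch using reciprocal rank fusion (RRF). The summary does not say which fusion method the ebook uses, so RRF is an assumption chosen because it needs no score calibration across sources. The function name, the document ids, and the constant k=60 are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, not a value
    taken from the ebook.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists.values():
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-source recall results (ids are made up for illustration).
recall = {
    "vector": ["d3", "d1", "d7"],
    "text": ["d1", "d3", "d9"],
    "database": ["d1", "d4"],
    "graph": ["d7", "d1"],
}

fused = reciprocal_rank_fusion(recall)
print(fused)
```

Documents returned by several sources accumulate score, so "d1", which appears in all four lists, ranks first even though it is not the top hit of every source.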

Recommendation system progression: The chapter reviews the shift from traditional deep‑learning recommendation models to large‑language‑model (LLM) and AI‑agent‑driven approaches. It identifies three core challenges: noisy implicit feedback, weak semantic understanding, and difficulty mining user intent. Using the KAR project from Huawei Noah’s Ark Lab as a case study, the authors explain factorized prompting and a multi‑expert knowledge adapter that maps semantic knowledge into the recommendation embedding space. They discuss how the multi‑expert network is designed to balance feature dimensionality against real‑time latency, and they cite an AUC increase of 1.5% together with online A/B‑test results.
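To make the idea of a multi‑expert knowledge adapter more concrete, the sketch below shows one generic way to project a high‑dimensional semantic vector into a smaller recommendation feature space: several linear experts are blended by a softmax gate. This is not KAR’s actual implementation. The class name, the dimensions, the number of experts, and the random weights are assumptions made purely for illustration.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class MultiExpertAdapter:
    """Hypothetical adapter: gated mixture of linear experts (untrained)."""

    def __init__(self, in_dim, out_dim, num_experts, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # One projection per expert, plus a gate that scores each expert.
        self.experts = [rand_matrix(out_dim, in_dim) for _ in range(num_experts)]
        self.gate = rand_matrix(num_experts, in_dim)

    def gate_weights(self, semantic_vec):
        return softmax(matvec(self.gate, semantic_vec))

    def __call__(self, semantic_vec):
        weights = self.gate_weights(semantic_vec)
        outputs = [matvec(expert, semantic_vec) for expert in self.experts]
        out_dim = len(outputs[0])
        # Weighted sum of expert outputs gives the compact dense feature.
        return [sum(w * out[i] for w, out in zip(weights, outputs))
                for i in range(out_dim)]


adapter = MultiExpertAdapter(in_dim=8, out_dim=4, num_experts=3)
semantic_vec = [0.5, -0.2, 0.1, 0.0, 0.3, 0.7, -0.4, 0.2]  # made-up LLM embedding
dense_feature = adapter(semantic_vec)
print(len(dense_feature))
```

The output dimension is the knob that trades expressiveness for serving cost: a smaller out_dim means less to store and feed into the downstream ranking model at request time.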

Baidu GRAB model: The generative ranking model for ads (GRAB) replaces large discrete feature sets and manual feature engineering with end‑to‑end sequence generation grounded in the LLM “Scaling Law”. The chapter highlights the Q‑Aware RAB causal attention mechanism, which introduces a query‑aware relative bias for adaptive modeling of complex interactions and temporal signals. It also describes the STS two‑stage training algorithm, heterogeneous token representations, a dual‑loss stacking strategy, and KV‑Cache optimizations that sustain high‑concurrency inference, and it reports quantified business gains after full deployment.
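The summary does not give the formula behind Q‑Aware RAB, so the following is only a generic sketch of what causal attention with a query‑dependent relative bias can look like. It assumes the bias for positions i and j is the dot product of the query at i with an embedding of the distance i - j. That form, the function name, and the toy inputs are assumptions, not GRAB’s published design.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def causal_attention_with_query_bias(queries, keys, values, rel_emb):
    """Single-head causal attention with an additive, query-dependent bias.

    rel_emb[d] is a hypothetical embedding for relative distance d = i - j.
    """
    dim = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        logits = []
        # Causal mask: position i only attends to positions j <= i.
        for j in range(i + 1):
            content = dot(q, keys[j]) / math.sqrt(dim)
            bias = dot(q, rel_emb[i - j])  # depends on the query and the distance
            logits.append(content + bias)
        m = max(logits)
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ])
    return outputs


# Toy sequence of three tokens with 2-dimensional vectors (made-up numbers).
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
rel_emb = [[0.2, 0.0], [0.0, 0.1], [-0.1, -0.1]]  # distances 0, 1, 2

attended = causal_attention_with_query_bias(queries, keys, values, rel_emb)
print(attended[0])
```

Because the bias is computed from the query itself, two different queries at the same position can weight the same history differently, which is the sense in which the bias is "query‑aware" in this sketch.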

Additional chapters cover multi‑agent interaction for AI‑for‑good applications, large‑model agents for knowledge discovery and data‑science tasks, research on the observability of OpenAI Swarm at SF Tech, practical guidance on using Elasticsearch for vector search and building RAG applications, and a forward‑looking discussion of the frontier from big data to big models in search and recommendation.
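For the Elasticsearch vector search topic, a minimal sketch of an approximate kNN request body may help. It uses the top‑level knn option of the Elasticsearch 8.x search API with its field, query_vector, k, and num_candidates parameters. The field name "embedding", the example vector, and the helper function are hypothetical and do not come from the ebook.

```python
import json


def build_knn_search_body(field, query_vector, k=10, num_candidates=100):
    """Build a request body for an Elasticsearch 8.x approximate kNN search.

    The body is meant to be sent to POST /<index>/_search. The target field
    is assumed to be mapped as a dense_vector with indexing enabled.
    """
    if num_candidates < k:
        # Sanity check: each shard must consider at least k candidates.
        raise ValueError("num_candidates must be >= k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        # Skip returning the stored vectors, which are large and rarely needed.
        "_source": {"excludes": [field]},
    }


# Hypothetical 4-dimensional query embedding for illustration only.
body = build_knn_search_body("embedding", [0.12, -0.03, 0.88, 0.41], k=5,
                             num_candidates=50)
print(json.dumps(body, indent=2))
```

In a RAG application, the hits returned by such a request would typically be passed to the generation step as context. Raising num_candidates generally improves recall at the cost of latency.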

The ebook provides architecture diagrams, performance evaluation data, and real‑world product case studies, offering a comprehensive reference for engineers building AI‑driven recommendation and search systems.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
