
Designing Next‑Gen Recommendation and Search Systems with Agentic Architectures

This article reviews three AI search and recommendation systems: Alibaba Cloud’s Agentic RAG, Huawei’s LLM‑enhanced recommendation pipeline, and Baidu’s generative ranking model GRAB. It covers their architectural evolution, multimodal retrieval strategies, reported performance results, and practical deployment lessons.

DataFunSummit

Jul 3, 2026


This piece is a curated overview of the ebook Intelligent Agent Architecture and Practice: Building the Next‑Generation Recommendation and Search Systems. It summarizes three technical case studies that show how AI agents are reshaping high‑concurrency, multimodal search and recommendation workloads.

Alibaba Cloud AI Search: Agentic RAG

Based on a talk by Xing Shaomin, the article explains the challenges of handling massive concurrent queries, multimodal data, and multi‑hop reasoning. It describes the evolution from a single‑agent to a multi‑agent architecture that coordinates planning, retrieval, and generation modules. A multi‑path retrieval layer mixes vector, text, database, and graph recall to boost coverage and accuracy. The author also details GPU‑accelerated indexing and query quantization, and mentions extensions such as NL2SQL and multimodal search, with performance figures provided in the original material.
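The summary does not say how the results of the four recall paths are merged. One common way to combine ranked lists from heterogeneous retrievers is reciprocal rank fusion (RRF), shown here as a minimal sketch. RRF is an assumption chosen for illustration, not the method described in the talk; the function name, the path names, the document IDs, and the constant k=60 are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked lists of document IDs into a single ranking.

    ranked_lists maps a recall path name to its results, best first.
    Each document earns 1 / (k + rank) from every path that returned it,
    so documents found by several paths rise to the top.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists.values():
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the four recall paths named in the text.
recall = {
    "vector": ["d3", "d1", "d7"],
    "text": ["d1", "d4", "d3"],
    "database": ["d9"],
    "graph": ["d1", "d9"],
}
fused = reciprocal_rank_fusion(recall)
print(fused)  # ['d1', 'd9', 'd3', 'd4', 'd7']
```

Because RRF uses only ranks, it needs no score calibration between paths whose raw scores are not comparable, such as vector similarity and a text relevance score.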

Huawei Noah’s Ark Lab: LLM‑Enhanced Recommendation

The author reviews the transition from deep‑learning‑based recommenders to large‑language‑model (LLM) and AI‑Agent approaches. Core challenges include noisy implicit feedback, limited semantic understanding, and difficulty extracting user intent. Using the KAR project as an example, the article outlines how factorized prompting and a multi‑expert knowledge adapter map semantic knowledge into the recommendation embedding space. Design trade‑offs for the multi‑expert network balance text feature dimensionality against real‑time constraints. Experimental results show an AUC lift of 1.5%, and the approach was also validated in online A/B tests.
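To make the adapter idea concrete, the following is a generic mixture-of-experts sketch: several linear experts project a high-dimensional text embedding into a smaller recommendation embedding space, and a softmax gate blends their outputs. It is not KAR's actual adapter, whose internals the summary does not describe. The class name, the dimensions, and the random initialization are illustrative assumptions, and the weights are untrained.

```python
import math
import random


def linear(weights, x):
    """Dense layer without bias: weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class MultiExpertAdapter:
    """Hypothetical adapter: text embedding -> recommendation embedding."""

    def __init__(self, text_dim, rec_dim, num_experts, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Each expert reduces text_dim down to rec_dim.
        self.experts = [init(rec_dim, text_dim) for _ in range(num_experts)]
        # The gate produces one logit per expert from the same input.
        self.gate = init(num_experts, text_dim)

    def gate_weights(self, text_embedding):
        return softmax(linear(self.gate, text_embedding))

    def __call__(self, text_embedding):
        weights = self.gate_weights(text_embedding)
        outputs = [linear(expert, text_embedding) for expert in self.experts]
        rec_dim = len(outputs[0])
        # Gate-weighted sum of the expert outputs, one value per output dimension.
        return [
            sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(rec_dim)
        ]


adapter = MultiExpertAdapter(text_dim=8, rec_dim=3, num_experts=2)
rec_embedding = adapter([0.5] * 8)
print(len(rec_embedding))  # 3
```

The dimensionality trade-off mentioned above shows up directly in this sketch: every expert and the gate cost work proportional to text_dim per request, so a larger text embedding or more experts raises serving latency.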

Baidu GRAB: Generative Ranking for Ads

The Baidu commercial tech team’s GRAB (Generative Ranking for Ads) replaces traditional feature‑engineered DLRM pipelines with an end‑to‑end generative sequence model that embeds user behavior and target ads in a unified space. Inspired by LLM scaling laws and Transformer architecture, the model introduces a Q‑Aware RAB causal attention mechanism to adaptively capture complex interactions and temporal signals. To address training efficiency and over‑fitting, a two‑stage STS training algorithm, heterogeneous token representations, and a dual‑loss stacking strategy are employed. KV‑Cache is used to sustain high‑throughput online inference, and the article reports quantified business gains after full deployment.
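The KV‑Cache point can be illustrated with a small sketch of incremental causal attention. It uses plain single-head scaled dot-product attention with identity projections, which is a deliberate simplification and not GRAB's Q‑Aware RAB mechanism; the class and function names are hypothetical. The idea shown is that keys and values of earlier tokens are stored once, so each new token attends over the cache instead of recomputing the whole sequence.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(query))
    logits = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    dim = len(values[0])
    return [sum((e / total) * v[i] for e, v in zip(exps, values)) for i in range(dim)]


class KVCache:
    """Stores keys and values of past tokens for step-by-step decoding."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Append the new token, then attend over everything seen so far.
        # Causality holds because future tokens are not in the cache yet.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)


def full_causal_attention(queries, keys, values):
    """Reference version: recompute attention over the prefix at every step."""
    return [
        attend(q, keys[: t + 1], values[: t + 1])
        for t, q in enumerate(queries)
    ]


# Toy behavior sequence; each token vector serves as its own query, key, and value.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cache = KVCache()
incremental = [cache.step(t, t, t) for t in tokens]
reference = full_causal_attention(tokens, tokens, tokens)
print(incremental == reference)  # True
```

Both paths produce the same outputs, but the cached path does one attention pass per new token, which is what keeps per-request cost manageable when a long behavior sequence is extended one token at a time.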

The ebook’s table of contents lists eight chapters covering multi‑agent interaction for AI‑for‑good, knowledge discovery with LLM agents, observability of OpenAI Swarm‑style systems, and practical guides for Elasticsearch‑based vector search and RAG applications, providing a comprehensive roadmap for building next‑generation AI‑driven recommendation and search platforms.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.


