Boosting Ad Efficiency with Baidu’s Multi‑Agent AI Architecture
In the AI‑native era, Baidu's advertising platform adopts a multi‑agent architecture that combines large and small LLMs, SOP‑driven workflows, long‑term memory, and a vector database to deliver high query‑parsing accuracy, low latency, and measurable business gains, while addressing challenges in understanding, planning, execution, and personalization.
Background and Motivation
With the rise of the AI‑native era, advertising and marketing platforms are undergoing fundamental changes that demand higher efficiency, precision, and a seamless user experience. Baidu’s commercial advertising platform positions the advertising‑marketing intelligent agent as the primary vehicle for delivering commercial value, with generative AI serving as its core.
Core Capabilities of the Intelligent Agent
The platform defines four essential abilities for an agent:
Understanding : Accurately parse natural‑language queries, extract every slot without loss, and translate them into a machine‑readable representation.
Proactive Planning : Leverage long‑term memory and domain knowledge to reason according to expected logic and flexibly orchestrate actions.
Strong Execution : Integrate numerous business‑system APIs to provide a rich set of functions for complex operations.
Personalized Response : Generate human‑like, context‑aware replies that avoid rigid or irrelevant answers.
Technical Challenges
When rebuilding the advertising platform with LLM agents, the team identified four major hurdles:
Understanding : Ensuring complete slot extraction without loss, despite LLM hallucinations and inconsistent multi‑step reasoning.
Proactive Planning : Preventing agents from entering infinite loops; with fully autonomous planning, success rates are currently below 10%.
Execution : Dealing with heterogeneous, poorly documented business‑system interfaces; the platform has >5,000 APIs and 360+ data tables, far exceeding LLM context windows.
Personalized Response : Translating structured system outputs back into natural language while maintaining domain knowledge.
Multi‑Agent Architecture Overview
The solution adopts a five‑layer architecture that addresses these challenges through the combination of application, agent, model, memory, and data‑tool layers.
1. Application Layer
Uses SOP (Standard Operating Procedure) scripts to assemble multiple vertical agents (e.g., Lightboat, JarvisBot) for diverse business scenarios.
2. Agent Layer
Provides the Agent Framework infrastructure, vertical agents, and SOP‑driven multi‑agent collaboration. A Director agent interprets user queries, selects appropriate SOPs, and orchestrates downstream agents via a conversation state machine.
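As a concrete illustration, the SOP‑driven orchestration described above can be sketched as a small state machine. Everything here, including the `SOP` and `Director` classes and the toy keyword‑based SOP selection, is hypothetical; the production Director interprets the query with an LLM before choosing an SOP.

```python
from dataclasses import dataclass


@dataclass
class SOP:
    """A Standard Operating Procedure: an ordered list of (agent_name, action) steps."""
    name: str
    steps: list


@dataclass
class Director:
    sops: dict    # sop_name -> SOP
    agents: dict  # agent_name -> callable(action, context) -> new context

    def select_sop(self, query: str) -> SOP:
        # Toy keyword routing; the real Director would ask an LLM to choose.
        for keyword, sop_name in [("report", "gbi_report"), ("diagnose", "aiops_diagnose")]:
            if keyword in query.lower() and sop_name in self.sops:
                return self.sops[sop_name]
        raise KeyError("no SOP matched the query")

    def run(self, query: str) -> dict:
        sop = self.select_sop(query)
        context = {"query": query, "state": "START"}
        for agent_name, action in sop.steps:
            # Conversation state machine: each step hands the context to one agent.
            context = self.agents[agent_name](action, context)
            context["state"] = action
        context["state"] = "DONE"
        return context
```

Keeping the SOP as data (rather than hard‑coding the flow) is what lets the application layer assemble different vertical agents from the same framework.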
3. Model Layer
Employs a “large‑model‑small‑model collaboration” strategy. Small models handle simple tasks within 1 s, while large models (e.g., Wenxin 4.0) process complex queries, leveraging long‑term memory to reduce latency and hallucinations.
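One way to read this collaboration is as a latency‑aware router: a small model answers simple queries quickly, and anything complex (or that the small model declines) escalates to the large model. The heuristic below, a word‑count threshold with `None` as a low‑confidence signal, is invented for illustration and is not the platform's actual routing logic.

```python
def route(query: str, small_model, large_model, complexity_threshold: int = 8):
    """Return (answer, which_model). Try the small model first; escalate when needed."""
    # Toy complexity estimate by token count; a production router would use
    # a trained classifier or the small model's own confidence score.
    if len(query.split()) <= complexity_threshold:
        answer = small_model(query)
        if answer is not None:  # small model handled it within the latency budget
            return answer, "small"
    return large_model(query), "large"  # fall back to the large (Wenxin-class) model
```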
4. Memory Layer
All vector data and long‑term memory are stored in BaikalDB, a proprietary distributed database that supports vector, full‑text, and structured search, enabling efficient retrieval for LLM prompts.
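A minimal sketch of that retrieval path, assuming embeddings are plain float lists: top‑k cosine‑similarity search over stored memories, with the hits templated into the LLM prompt. BaikalDB's real interface differs; this only shows the shape of the lookup.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def retrieve(memories, query_vec, k=2):
    """memories: list of (text, embedding) pairs; return the k closest texts."""
    ranked = sorted(memories, key=lambda m: cosine(m[1], query_vec), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, retrieved):
    # Retrieved long-term memory is prepended so the LLM answers in context.
    context = "\n".join(f"- {m}" for m in retrieved)
    return f"Known facts about this advertiser:\n{context}\n\nUser question: {query}"
```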
5. Data Tools Layer
Includes a Prompt‑tuning platform (iEvalue), automated traffic recording/replay, and multi‑model labeling pipelines that accelerate model development and evaluation.
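Automated traffic recording and replay can be sketched as a thin wrapper that logs production (query, response) pairs, plus a harness that replays them against a candidate model to measure agreement. The `record`/`replay` helpers below are hypothetical stand‑ins, not the platform's actual tooling.

```python
import functools


def record(log):
    """Decorator: capture every (query, response) pair the wrapped model produces."""
    def decorator(model_fn):
        @functools.wraps(model_fn)
        def wrapper(query):
            response = model_fn(query)
            log.append({"query": query, "response": response})
            return response
        return wrapper
    return decorator


def replay(log, candidate_fn):
    """Replay recorded queries against a candidate; return the match rate."""
    if not log:
        return 0.0
    matches = sum(1 for rec in log if candidate_fn(rec["query"]) == rec["response"])
    return matches / len(log)
```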
Case Studies
Lightboat GBI Agent
The first generative‑BI product for advertising, allowing users to query any metric, time range, or audience segment via natural language. The workflow parses the user's query, dispatches sub‑tasks to the appropriate models, and validates results with a separate verification model.
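The parse → execute → verify flow can be sketched as three small stages. The slot rules and the recompute‑based checker below are made‑up stand‑ins; in Lightboat each stage is backed by a model rather than hand‑written rules.

```python
def parse_query(question: str) -> dict:
    """Toy slot extraction: map a natural-language question to a structured spec."""
    return {
        "metric": "clicks" if "click" in question else "spend",
        "days": 7 if "week" in question else 1,
    }


def execute(spec: dict, data: dict) -> float:
    """Run the structured query against a metric store (here, a dict of lists)."""
    return sum(data[spec["metric"]][-spec["days"]:])


def verify(spec: dict, result: float, data: dict) -> bool:
    # Verification-model stand-in: independently recompute and bound-check,
    # so a hallucinated or corrupted result is rejected before reaching the user.
    expected = sum(data[spec["metric"]][-spec["days"]:])
    return result == expected and result >= 0
```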
JarvisBot
Combines LGUI and AIOps to automate micro‑service governance. It reduces fault‑diagnosis time from >30 minutes to ~1 minute and streamlines approval processes, traffic recording, and replay, delivering substantial productivity gains.
Results and Impact
Query parsing accuracy improved from 85 % to 96 %.
Average response latency dropped to 1.5 s (95th percentile 3.3 s).
Human effort for handling predefined sentence patterns reduced from 3 person‑days to 1 person‑day.
Business consumption increased due to “Lightboat + generative recall” outperforming traditional keyword ads.
Future Reflections
The team emphasizes that LLM hallucination remains a barrier for low‑latency UI scenarios; thus, breaking tasks into smaller steps and augmenting LLMs with memory and planning modules is essential. They also advocate for unified vector‑plus‑full‑text databases (as realized in BaikalDB) to meet the multi‑modal retrieval needs of future AI‑native advertising.
Baidu Tech Salon
Baidu Tech Salon, organized by Baidu's Technology Management Department, is a monthly offline event that shares cutting‑edge tech trends from Baidu and the industry, providing a free platform for mid‑to‑senior engineers to exchange ideas.