Boosting Ad Efficiency with Baidu’s Multi‑Agent AI Architecture
In the AI‑native era, Baidu's advertising platform adopts a multi‑agent architecture that combines large and small LLMs, SOP‑driven workflows, long‑term memory, and a vector database to deliver high query‑parsing accuracy, low latency, and measurable business gains, while addressing challenges in understanding, planning, execution, and personalization.
Background and Motivation
With the rise of the AI‑native era, advertising and marketing platforms are undergoing fundamental changes that demand higher efficiency, precision, and a seamless user experience. Baidu’s commercial advertising platform positions the advertising‑marketing intelligent agent as the primary vehicle for delivering commercial value, with generative AI serving as its core.
Core Capabilities of the Intelligent Agent
The platform defines four essential abilities for an agent:
Understanding : Accurately parse natural‑language queries, extract every slot without loss, and translate them into a machine‑readable representation.
Proactive Planning : Leverage long‑term memory and domain knowledge to reason according to expected logic and flexibly orchestrate actions.
Strong Execution : Integrate numerous business‑system APIs to provide a rich set of functions for complex operations.
Personalized Response : Generate human‑like, context‑aware replies that avoid rigid or irrelevant answers.
Technical Challenges
When rebuilding the advertising platform with LLM agents, the team identified four major hurdles:
Understanding : Ensuring complete slot extraction without loss, despite LLM hallucinations and inconsistent multi‑step reasoning.
Proactive Planning : Preventing agents from entering infinite loops; with fully autonomous planning, success rates are currently below 10%.
Execution : Dealing with heterogeneous, poorly documented business‑system interfaces; the platform has >5,000 APIs and 360+ data tables, far exceeding LLM context windows.
Personalized Response : Translating structured system outputs back into natural language while maintaining domain knowledge.
Multi‑Agent Architecture Overview
The solution adopts a five‑layer architecture that addresses these challenges through the combination of application, agent, model, memory, and data‑tool layers.
1. Application Layer
Uses SOP (Standard Operating Procedure) scripts to assemble multiple vertical agents (e.g., Lightboat, JarvisBot) for diverse business scenarios.
2. Agent Layer
Provides the Agent Framework infrastructure, vertical agents, and SOP‑driven multi‑agent collaboration. A Director agent interprets user queries, selects appropriate SOPs, and orchestrates downstream agents via a conversation state machine.
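As a concrete illustration, the SOP‑driven orchestration described above can be sketched as a small state machine. Everything here, including the `SOP` and `Director` classes and the toy keyword‑based SOP selection, is hypothetical; the production Director interprets the query with an LLM before choosing an SOP.

```python
from dataclasses import dataclass


@dataclass
class SOP:
    """A Standard Operating Procedure: an ordered list of (agent_name, action) steps."""
    name: str
    steps: list


@dataclass
class Director:
    sops: dict    # sop_name -> SOP
    agents: dict  # agent_name -> callable(action, context) -> new context

    def select_sop(self, query: str) -> SOP:
        # Toy keyword routing; the real Director would ask an LLM to choose.
        for keyword, sop_name in [("report", "gbi_report"), ("diagnose", "aiops_diagnose")]:
            if keyword in query.lower() and sop_name in self.sops:
                return self.sops[sop_name]
        raise KeyError("no SOP matched the query")

    def run(self, query: str) -> dict:
        sop = self.select_sop(query)
        context = {"query": query, "state": "START"}
        for agent_name, action in sop.steps:
            # Conversation state machine: each step hands the context to one agent.
            context = self.agents[agent_name](action, context)
            context["state"] = action
        context["state"] = "DONE"
        return context
```

Keeping the SOP as data (rather than hard‑coding the flow) is what lets the application layer assemble different vertical agents from the same framework.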
3. Model Layer
Employs a “large‑model‑small‑model collaboration” strategy. Small models handle simple tasks within 1 s, while large models (e.g., Wenxin 4.0) process complex queries, leveraging long‑term memory to reduce latency and hallucinations.
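One way to read this collaboration is as a latency‑aware router: a small model answers simple queries quickly, and anything complex (or that the small model declines) escalates to the large model. The heuristic below, a word‑count threshold with `None` as a low‑confidence signal, is invented for illustration and is not the platform's actual routing logic.

```python
def route(query: str, small_model, large_model, complexity_threshold: int = 8):
    """Return (answer, which_model). Try the small model first; escalate when needed."""
    # Toy complexity estimate by token count; a production router would use
    # a trained classifier or the small model's own confidence score.
    if len(query.split()) <= complexity_threshold:
        answer = small_model(query)
        if answer is not None:  # small model handled it within the latency budget
            return answer, "small"
    return large_model(query), "large"  # fall back to the large (Wenxin-class) model
```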
4. Memory Layer
All vector data and long‑term memory are stored in BaikalDB, a proprietary distributed database that supports vector, full‑text, and structured search, enabling efficient retrieval for LLM prompts.
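A minimal sketch of that retrieval path, assuming embeddings are plain float lists: top‑k cosine‑similarity search over stored memories, with the hits templated into the LLM prompt. BaikalDB's real interface differs; this only shows the shape of the lookup.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def retrieve(memories, query_vec, k=2):
    """memories: list of (text, embedding) pairs; return the k closest texts."""
    ranked = sorted(memories, key=lambda m: cosine(m[1], query_vec), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, retrieved):
    # Retrieved long-term memory is prepended so the LLM answers in context.
    context = "\n".join(f"- {m}" for m in retrieved)
    return f"Known facts about this advertiser:\n{context}\n\nUser question: {query}"
```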
5. Data Tools Layer
Includes a Prompt‑tuning platform (iEvalue), automated traffic recording/replay, and multi‑model labeling pipelines that accelerate model development and evaluation.
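Automated traffic recording and replay can be sketched as a thin wrapper that logs production (query, response) pairs, plus a harness that replays them against a candidate model to measure agreement. The `record`/`replay` helpers below are hypothetical stand‑ins, not the platform's actual tooling.

```python
import functools


def record(log):
    """Decorator: capture every (query, response) pair the wrapped model produces."""
    def decorator(model_fn):
        @functools.wraps(model_fn)
        def wrapper(query):
            response = model_fn(query)
            log.append({"query": query, "response": response})
            return response
        return wrapper
    return decorator


def replay(log, candidate_fn):
    """Replay recorded queries against a candidate; return the match rate."""
    if not log:
        return 0.0
    matches = sum(1 for rec in log if candidate_fn(rec["query"]) == rec["response"])
    return matches / len(log)
```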
Case Studies
Lightboat GBI Agent
The first generative‑BI product for advertising, allowing users to query any metric, time range, or audience segment via natural language. The workflow parses the user's query, dispatches sub‑tasks to the appropriate models, and validates results with a separate verification model.
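The parse → execute → verify flow can be sketched as three small stages. The slot rules and the recompute‑based checker below are made‑up stand‑ins; in Lightboat each stage is backed by a model rather than hand‑written rules.

```python
def parse_query(question: str) -> dict:
    """Toy slot extraction: map a natural-language question to a structured spec."""
    return {
        "metric": "clicks" if "click" in question else "spend",
        "days": 7 if "week" in question else 1,
    }


def execute(spec: dict, data: dict) -> float:
    """Run the structured query against a metric store (here, a dict of lists)."""
    return sum(data[spec["metric"]][-spec["days"]:])


def verify(spec: dict, result: float, data: dict) -> bool:
    # Verification-model stand-in: independently recompute and bound-check,
    # so a hallucinated or corrupted result is rejected before reaching the user.
    expected = sum(data[spec["metric"]][-spec["days"]:])
    return result == expected and result >= 0
```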
JarvisBot
Combines LGUI and AIOps to automate micro‑service governance. It reduces fault‑diagnosis time from >30 minutes to ~1 minute and streamlines approval processes, traffic recording, and replay, delivering substantial productivity gains.
Results and Impact
Query parsing accuracy improved from 85 % to 96 %.
Average response latency dropped to 1.5 s (95th percentile 3.3 s).
Human effort for handling predefined sentence patterns reduced from 3 person‑days to 1 person‑day.
Business consumption increased due to “Lightboat + generative recall” outperforming traditional keyword ads.
Future Reflections
The team emphasizes that LLM hallucination remains a barrier for low‑latency UI scenarios; thus, breaking tasks into smaller steps and augmenting LLMs with memory and planning modules is essential. They also advocate for unified vector‑plus‑full‑text databases (as realized in BaikalDB) to meet the multi‑modal retrieval needs of future AI‑native advertising.
Baidu Tech Salon
Baidu Tech Salon, organized by Baidu's Technology Management Department, is a monthly offline event that shares cutting‑edge tech trends from Baidu and the industry, providing a free platform for mid‑to‑senior engineers to exchange ideas.