
Underlying Logic and Multi‑Agent Architecture of AI Agents in Baidu's Commercial Advertising Platform

The article explains how Baidu's commercial advertising platform uses generative AI agents, built on four core capabilities (understanding, planning, execution, and persona), to overcome challenges such as hallucination and API integration. It describes a multi‑layer architecture, key technologies, real‑world case studies, and the resulting performance and operational benefits.

DataFunTalk

In the AI‑native era, AI agents have reshaped Baidu's commercial advertising platform: they improve the efficiency and precision of advertising operations, with generative AI serving as the core engine behind the user experience.

The agents are built on four core capabilities: (1) Listening – parsing natural‑language queries, extracting all slots, and translating them into machine‑readable language; (2) Proactive Planning – leveraging long‑term memory, domain knowledge, and LLM‑driven reasoning to flexibly orchestrate actions; (3) Strong Execution – interfacing with thousands of business APIs to perform complex tasks; and (4) Persona‑Driven Responses – generating human‑like, context‑aware replies.
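The "Listening" step can be pictured as turning a free‑text request into a typed, validated structure. The following is a minimal sketch, not the platform's actual implementation: the slot names (`metric`, `date_range`, `campaign`), the allowed vocabularies, and the `validate_slots` helper are all hypothetical. It assumes an LLM has already proposed a slot dictionary and shows only the check against closed vocabularies, which is one common way to catch hallucinated slot values before they reach a business API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical closed vocabularies; a real platform would derive these
# from its own API catalog rather than hard-coding them.
ALLOWED_METRICS = {"clicks", "impressions", "cost"}
ALLOWED_RANGES = {"today", "yesterday", "last_7_days"}


@dataclass(frozen=True)
class ReportQuery:
    """Machine-readable form of a reporting request (illustrative only)."""
    metric: str
    date_range: str
    campaign: str


def validate_slots(raw: dict[str, Any]) -> tuple[Optional[ReportQuery], list[str]]:
    """Check model-proposed slots; return (query, []) or (None, problems)."""
    problems: list[str] = []
    metric = raw.get("metric")
    date_range = raw.get("date_range")
    campaign = raw.get("campaign")

    if not isinstance(metric, str) or metric not in ALLOWED_METRICS:
        problems.append(f"unknown metric: {metric!r}")
    if not isinstance(date_range, str) or date_range not in ALLOWED_RANGES:
        problems.append(f"unknown date_range: {date_range!r}")
    if not isinstance(campaign, str) or not campaign.strip():
        problems.append("missing campaign")

    if problems:
        return None, problems
    return ReportQuery(metric, date_range, campaign.strip()), []
```

When validation fails, the list of problems can be fed back to the model or turned into a clarifying question for the user instead of calling an API with a made‑up value.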

Two primary application scenarios are highlighted: natural‑language UI control for complex advertising functions, and AI‑driven reasoning for problem diagnosis and solution generation, both requiring sophisticated agent technology rather than single‑step LLM interactions.

The platform faces four technical challenges: filling slots accurately despite hallucination, low success rates for autonomous planning (which often ends in infinite loops), difficulty invoking diverse business APIs reliably, and translating structured system outputs back into natural language with an appropriate persona.
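The planning failure mode mentioned above, where an agent keeps proposing the same step, is often contained with a step budget plus a repeat check. The sketch below illustrates that general guard under stated assumptions: `propose` stands in for an LLM‑backed planner, and `run_plan`, `PlanAborted`, `max_steps`, and `max_repeats` are hypothetical names, not part of Baidu's framework.

```python
from __future__ import annotations

from typing import Callable, Optional


class PlanAborted(Exception):
    """Raised when the planner exceeds its budget or keeps repeating itself."""


def run_plan(
    propose: Callable[[list[str]], Optional[str]],
    max_steps: int = 8,
    max_repeats: int = 2,
) -> list[str]:
    """Ask the planner for one action at a time until it returns None.

    `propose` receives the actions taken so far and returns the next
    action name, or None when it considers the task finished.
    """
    history: list[str] = []
    for _ in range(max_steps):
        action = propose(history)
        if action is None:
            return history
        # Abort if the same action has already been taken max_repeats times.
        if history.count(action) >= max_repeats:
            raise PlanAborted(f"action {action!r} repeated too often")
        history.append(action)
    raise PlanAborted(f"no result after {max_steps} steps")
```

A caller can catch `PlanAborted` and fall back to a fixed procedure or a human hand‑off rather than letting the agent spin.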

To address these, Baidu adopts a five‑layer multi‑agent architecture: (1) Application Layer – SOP‑driven vertical agents (e.g., Light‑boat, JarvisBot) assemble domain‑specific capabilities; (2) Agent Layer – an Agent Framework provides infrastructure, vertical agents, and SOP‑based multi‑agent collaboration; (3) Model Layer – a mix of large and small models with tools for training, fine‑tuning, and evaluation; (4) Memory Layer – vector data and long‑term memory stored in BaikalDB, supporting vector, full‑text, and structured retrieval; (5) Data Toolset – platforms for prompt optimization, traffic recording/replay, and automated labeling.

Key technical innovations include a "large‑small model collaboration" approach that routes simple queries to small models and sends the remaining queries to large models backed by long‑term memory; SOP‑based multi‑agent collaboration that decomposes complex tasks across specialized vertical agents; and a long‑term memory plus self‑learning loop that continuously improves agent performance.
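The routing idea can be sketched in a few lines. This is an illustration under assumptions, not the platform's router: the source does not say how "simple" is decided, so `is_simple` uses a made‑up word‑count and keyword heuristic, the two models are plain callables, and long‑term memory is reduced to a list of strings prepended to the large model's prompt.

```python
from __future__ import annotations

from typing import Callable

# A "model" here is any callable from prompt text to reply text.
Model = Callable[[str], str]

# Hypothetical markers of queries that need multi-step reasoning.
HARD_WORDS = {"why", "diagnose", "compare", "explain"}


def is_simple(query: str, max_words: int = 8) -> bool:
    """Assumed heuristic: short queries without reasoning words are simple."""
    words = query.lower().split()
    return len(words) <= max_words and not HARD_WORDS.intersection(words)


def route(query: str, small: Model, large: Model, memory: list[str]) -> str:
    """Send simple queries to the small model, the rest to the large one.

    Only the large-model path is given long-term memory as extra context.
    """
    if is_simple(query):
        return small(query)
    context = "\n".join(memory)
    prompt = f"{context}\n{query}" if context else query
    return large(prompt)
```

In practice the heuristic would more likely be a trained classifier, but the control flow stays the same: a cheap decision first, the expensive model only when needed.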

Real‑world case studies demonstrate the architecture: the Light‑boat GBI agent enables natural‑language BI analysis with 98.5% slot‑parsing accuracy and sub‑2‑second latency; JarvisBot combines multiple agents for automated diagnosis, operation, and traffic replay, sharply reducing effort in person‑days (PD) and response times; and the Director‑Diagnosis‑Operation workflow showcases end‑to‑end multi‑agent orchestration.
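A Director‑Diagnosis‑Operation flow can be pictured as a director that walks a fixed standard operating procedure (SOP) and hands a shared context from one vertical agent to the next. The sketch below is only a guess at the shape of such a workflow: the agent functions, the `budget_left` field, the `finding` and `action` values, and the `SOP` list are all hypothetical stand‑ins, since the source does not describe the internals.

```python
from __future__ import annotations

from typing import Callable

# Each agent takes the shared context and returns an updated copy.
Agent = Callable[[dict], dict]


def diagnosis_agent(ctx: dict) -> dict:
    """Stand-in diagnosis: flag campaigns whose budget has run out."""
    out = dict(ctx)
    out["finding"] = "budget_exhausted" if ctx["budget_left"] <= 0 else "healthy"
    return out


def operation_agent(ctx: dict) -> dict:
    """Stand-in operation: choose an action based on the diagnosis."""
    out = dict(ctx)
    out["action"] = "raise_budget" if ctx["finding"] == "budget_exhausted" else "none"
    return out


AGENTS: dict[str, Agent] = {
    "diagnosis": diagnosis_agent,
    "operation": operation_agent,
}

# The SOP is an ordered list of agent names the director must follow.
SOP = ["diagnosis", "operation"]


def director(ctx: dict, sop: list[str] = SOP) -> dict:
    """Run each SOP step in order and record which agents were invoked."""
    trace = []
    for step in sop:
        ctx = AGENTS[step](ctx)
        trace.append(step)
    result = dict(ctx)
    result["trace"] = trace
    return result
```

Fixing the order in an SOP trades flexibility for predictability: the model no longer has to plan the sequence itself, which is where the planning failures described earlier tend to occur.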

Business and operational benefits include higher query accuracy (85%→96%), significant consumption growth from AI‑native marketing, reduced development effort (3 person‑days→1 person‑day), faster fault localization (30 min→1 min), and streamlined approval and testing processes.

Reflecting on challenges, the authors note that LLM hallucination remains a barrier for low‑latency UI interactions and advocate task decomposition and auxiliary tools as mitigations. They also emphasize the need for a unified database that supports vector, full‑text, and structured search, which BaikalDB provides with a dedicated vector index.
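To make the case for a unified store concrete, the sketch below combines the three retrieval styles in one query over an in‑memory list: a structured filter, a keyword match, and a vector similarity ranking. It is plain Python for illustration and does not use or imitate BaikalDB's interface; the `Doc` fields, `hybrid_search`, and the toy two‑dimensional embeddings are assumptions.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    account_id: int
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(
    docs: list[Doc],
    account_id: int,
    keyword: str,
    query_vec: list[float],
    k: int = 2,
) -> list[Doc]:
    """Filter by structured field and keyword, then rank by similarity."""
    candidates = [
        d for d in docs
        if d.account_id == account_id            # structured filter
        and keyword.lower() in d.text.lower()    # naive full-text match
    ]
    candidates.sort(key=lambda d: cosine(d.embedding, query_vec), reverse=True)
    return candidates[:k]
```

With three separate systems, the same query would need three round trips and a merge step in application code; a single engine that supports all three lets the filter, the text match, and the ranking run together.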

Tags: AI agents, Large Language Models, SOP, multi‑agent architecture, advertising platform, LLM memory
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
