Unlocking Precise AI Data Generation with Multi‑Agent Architecture
This article explains how a multi‑agent system—comprising intent‑recognition, tool‑engine, and inference agents—solves the challenges of AI‑driven data generation (AI‑造数) by improving accuracy, speed, and scalability through modular design, prompt engineering, and sophisticated tool governance.
Introduction
In the context of joint‑testing data generation (referred to as "AI‑造数"), we initially used a single‑agent approach. As more tools and scenarios were added, we evolved to a multi‑agent architecture that separates intent recognition, tool engine, and inference execution.
Challenges
Challenge: Accurately extract commands from rich queries, filter the right tool from thousands, and assemble toolchains for complex instructions.
Fundamental difficulty: Semantic and functional gaps between users and tool authors.
Single‑Agent Solution
All responsibilities are handed to a reasoning agent, with careful tool governance and prompt engineering.
Multi‑Agent Solution
We split the system into multiple agents, each focused on a single goal, and replace agent reasoning with deterministic engineering components wherever possible, which improves response time.
AI‑造数 1.0 – Single‑Agent Mode
Data generation here means feeding the user's query and the system's tool set to an LLM and letting it decide which tools to invoke. Once the MCP (Model Context Protocol) standard became widely adopted, AI‑造数 followed as a natural extension.
In this mode, the agent handles memory management, prompt engineering, and LLM interaction.
Architecture Diagram
Tool Governance
Tool descriptions must include basic info, functionality, output, and troubleshooting details. Proper description quality directly impacts model decision quality.
We categorize tools into public (stable, strict) and private (flexible) domains to manage the growing tool pool.
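A well-governed tool entry covers the four description fields named above. The sketch below shows one way such an entry might look; the field names and the example tool are illustrative assumptions, not the production schema.

```python
# Hypothetical tool-registry entry covering the required description fields:
# basic info, functionality, output, and troubleshooting. Illustrative only.

def describe_tool(name, domain, functionality, output, troubleshooting):
    """Assemble a structured tool description for the registry."""
    return {
        "name": name,
        "domain": domain,  # "public" (stable, strict) or "private" (flexible)
        "functionality": functionality,
        "output": output,
        "troubleshooting": troubleshooting,
    }

order_tool = describe_tool(
    name="create_test_order",
    domain="public",
    functionality="Creates a test order for a given user in the staging environment.",
    output="The new order ID as a string.",
    troubleshooting="Fails if the user has no default address; create one first.",
)
```

Because the model reads these descriptions verbatim when deciding which tool to call, the troubleshooting field doubles as guidance for recovering from failed invocations.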
Prompt Engineering
Prompts define the LLM’s workspace as "find and execute tools", fill necessary context, set principles, and provide examples. Key lessons:
Do not enforce output style on reasoning agents; it harms performance.
When principles have little effect, add examples.
Examples may introduce hidden attributes; be aware of implicit fields.
Abstracting models, capabilities, and processes improves accuracy (expanded in version 2.0).
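The prompt layout the lessons above describe (workspace definition, context, principles, then few-shot examples) can be sketched as a simple template. All of the wording below is illustrative, not the production prompt.

```python
# Minimal sketch of the prompt layout: workspace, context, principles, examples.
# Note there is no output-format constraint, per the first lesson above.

PROMPT_TEMPLATE = """You are a tool-using agent. Your workspace: find and execute tools.

Context:
{context}

Principles:
- Prefer tools whose output directly satisfies the user's goal.
- If a required input is missing, look for a tool that produces it.

Examples:
{examples}

User query: {query}
"""

def build_prompt(context: str, examples: str, query: str) -> str:
    return PROMPT_TEMPLATE.format(context=context, examples=examples, query=query)
```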
Intent Recognition
We abstract eight intent types (data creation, data operation, data query, data validation, tool inquiry, tool operation, project operation, other) and define an IntentResult model to standardize downstream processing.
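The eight intent types and the IntentResult model might look like the following sketch. The enum values mirror the article's list; the IntentResult field names are assumptions for illustration.

```python
# Hedged sketch of the intent taxonomy and an IntentResult-style model.
from dataclasses import dataclass, field
from enum import Enum

class IntentType(Enum):
    DATA_CREATION = "data_creation"
    DATA_OPERATION = "data_operation"
    DATA_QUERY = "data_query"
    DATA_VALIDATION = "data_validation"
    TOOL_INQUIRY = "tool_inquiry"
    TOOL_OPERATION = "tool_operation"
    PROJECT_OPERATION = "project_operation"
    OTHER = "other"

@dataclass
class IntentResult:
    intent: IntentType
    raw_query: str
    # Structured details extracted from the query, e.g. entities or counts.
    details: dict = field(default_factory=dict)

result = IntentResult(
    IntentType.DATA_CREATION,
    "create 3 test orders for user 42",
    {"entity": "order", "count": 3, "user_id": "42"},
)
```

Standardizing on a model like this is what lets downstream stages (tool filtering, inference) consume intents without re-parsing the raw query.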
Typical intents in joint‑testing include data creation, data operation, data query, and data validation, each requiring different query details and tool filtering.
Examples illustrate how the system parses user queries into structured intent models.
Tool Engine
The engine filters thousands of tools in memory to a handful of relevant candidates for the LLM. It consists of a real‑time filtering module and a backend tool‑parsing agent.
Tools are abstracted into a ToolEssentialModel with fields such as function type, environment, domain, dependent entities, and output entities.
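A ToolEssentialModel with the fields listed above might be sketched as follows; the exact field types and the example instance are assumptions.

```python
# Hedged sketch of a ToolEssentialModel with the fields named in the article.
from dataclasses import dataclass, field

@dataclass
class ToolEssentialModel:
    name: str
    function_type: str  # e.g. "create", "query", "validate"
    environment: str    # e.g. "staging", "production"
    domain: str         # "public" or "private"
    dependent_entities: list = field(default_factory=list)  # inputs the tool needs
    output_entities: list = field(default_factory=list)     # entities the tool yields

order_query = ToolEssentialModel(
    name="query_order",
    function_type="query",
    environment="staging",
    domain="public",
    dependent_entities=["order_id"],
    output_entities=["order"],
)
```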
To bridge semantic gaps, we combine text similarity (with synonym tables) and embedding similarity. To address functional gaps, we use primary and secondary tool tracks.
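Combining text similarity (with a synonym table) and embedding similarity could be sketched as a weighted score. The synonym groups, weights, and Jaccard/cosine choices below are illustrative assumptions; real embeddings would come from an embedding model rather than hand-written vectors.

```python
# Sketch: rank tools by a weighted blend of synonym-expanded token overlap
# (text similarity) and cosine similarity over embeddings. Illustrative only.
import math

_GROUPS = [{"order", "purchase"}, {"create", "generate", "make"}]
SYNONYMS = {word: group for group in _GROUPS for word in group}

def expand(tokens):
    """Map each token to its full synonym group (or itself)."""
    out = set()
    for t in tokens:
        out |= SYNONYMS.get(t, {t})
    return out

def text_score(query, tool_desc):
    q = expand(query.lower().split())
    d = expand(tool_desc.lower().split())
    return len(q & d) / max(len(q | d), 1)  # Jaccard over expanded tokens

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def combined_score(query, tool_desc, q_vec, t_vec, w_text=0.4, w_embed=0.6):
    return w_text * text_score(query, tool_desc) + w_embed * cosine(q_vec, t_vec)
```

The synonym expansion makes "generate order" and "create purchase order" score as near-identical text, which is exactly the user-versus-tool-author vocabulary gap the engine has to bridge.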
Inference Execution
The inference agent receives high‑quality requests and guides the LLM through a reverse‑reasoning and forward‑execution process:
Identify the final tool that satisfies the goal.
Recursively find prerequisite tools for missing inputs.
Construct a tool chain (tool n → … → tool a) and execute it forward.
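The reverse-reasoning, forward-execution steps above can be sketched as a recursive chain builder. The tool records (dicts with inputs/outputs) and the example data-creation tools are illustrative assumptions, not the production format.

```python
# Sketch of reverse reasoning: start from the tool whose output satisfies the
# goal, recursively prepend tools that produce its missing inputs, and return
# a chain ordered for forward execution. Illustrative tool format.

def build_chain(goal_entity, tools, available, chain=None):
    """Return tools ordered so each one's inputs are produced before it runs."""
    chain = chain if chain is not None else []
    tool = next(t for t in tools if goal_entity in t["outputs"])
    for needed in tool["inputs"]:
        if needed not in available:  # recurse to satisfy missing prerequisites
            build_chain(needed, tools, available, chain)
    chain.append(tool)
    available.update(tool["outputs"])
    return chain

tools = [
    {"name": "create_user", "inputs": [], "outputs": {"user_id"}},
    {"name": "create_address", "inputs": ["user_id"], "outputs": {"address_id"}},
    {"name": "create_order", "inputs": ["user_id", "address_id"], "outputs": {"order_id"}},
]
chain = build_chain("order_id", tools, available=set())
```

Asking for an order forces the builder to discover the user and address prerequisites first, so the chain executes forward as create_user, then create_address, then create_order.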
We use Qwen‑max for deep reasoning.
Overall Effect
By filtering the candidate set from hundreds of tools down to about five, we shrink the LLM's decision space by roughly two orders of magnitude, improving both accuracy and latency even as the tool pool grows.
Solution Recommendations
Single‑agent is simple, fast to implement, and works well when the tool set is focused and users and tool authors share language.
Multi‑agent suits open platforms with many tools and users but adds complexity and debugging difficulty.
Final Thoughts
Building AI-powered products involves inherent uncertainty; clear principles, standards, and processes are essential. When results fall short, tightening user query specifications often yields significant improvements.
Key Takeaways
Creating AI‑driven products requires deep abstraction to handle uncertainty. Ambiguous areas will cause LLMs to waver, so explicit rules and guidance are crucial.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
