How Multi‑Agent AI Architecture Solves Complex Data Generation Challenges
This article details the design and evolution of a multi‑agent AI system for automated data generation in integration testing, covering challenges, single‑ versus multi‑agent approaches, prompt engineering, tool governance, intent recognition, tool filtering, reasoning execution, performance gains, and practical recommendations.
Introduction
In the context of integration testing, "智造" (intelligent manufacturing) specifically refers to data generation (造数) using multiple cooperating agents. This scenario exemplifies a typical AI application where rich user queries, complex business logic, and precise command execution must be coordinated.
Challenges
Accurately extracting commands from diverse queries.
Filtering the appropriate tool from a large pool of hundreds of tools.
Composing tool chains to satisfy complex instructions.
The core difficulty lies in the semantic and functional gap between users and tool authors.
Single‑Agent Mode
All reasoning, tool governance, and prompt engineering are delegated to a single inference agent. This mode works for simple data‑generation scenarios but struggles with accuracy and speed as tool count grows.
Multi‑Agent Mode (智造2.0)
To address the limitations of the single‑agent approach, the architecture is split into several specialized agents:
Intent‑Recognition Agent: Classifies user queries into eight intent types (data creation, data operation, data query, data validation, tool consultation, tool operation, project operation, other) and produces a standardized IntentResult model.
Tool Engine: Consists of a real‑time filtering engine and a backend tool‑parsing agent. It abstracts each tool into a ToolEssentialModel with fields such as function type, environment, domain, dependent entities, and output entities.
Reasoning‑Execution Agent: Performs reverse reasoning to identify the final tool needed, then recursively resolves upstream dependencies, forming a tool chain that is finally executed forward.
Summary‑Interaction Agent: Evaluates the final result against the original query, handles retries, and formats the output.
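The division of labor among the four agents can be sketched as a simple pipeline. Everything below is illustrative: the function names, stub behaviors, and the IntentResult fields are assumptions, not the system's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class IntentResult:
    intent_type: str                      # one of the eight intent types
    entities: dict = field(default_factory=dict)

def recognize_intent(query: str) -> IntentResult:
    # Intent-Recognition Agent: classify the query (stubbed heuristic).
    kind = "data_creation" if "create" in query or "coupon" in query else "other"
    return IntentResult(intent_type=kind, entities={"query": query})

def filter_tools(intent: IntentResult) -> list[str]:
    # Tool Engine: shrink hundreds of tools to a handful of candidates (stubbed).
    return ["issue_coupon"] if intent.intent_type == "data_creation" else []

def plan_tool_chain(intent: IntentResult, candidates: list[str]) -> list[str]:
    # Reasoning-Execution Agent: order candidates into an executable chain (stubbed).
    return candidates

def handle_query(query: str) -> str:
    intent = recognize_intent(query)
    chain = plan_tool_chain(intent, filter_tools(intent))
    result = " -> ".join(chain) if chain else "no tool found"
    # Summary-Interaction Agent: check the result against the query and format it.
    return f"[{intent.intent_type}] executed: {result}"
```

The key architectural point is that each stage has a narrow contract (query → intent → candidates → chain → summary), so each agent's prompt and model can be tuned independently.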
Prompt Engineering
Key practices include avoiding custom output styles for inference agents, adding concrete examples to improve performance, being aware of hidden attributes in examples, and abstracting the problem space to guide the model.
#Role:
Your name is Xiaozhi (小智), an intelligent assistant for software R&D. You are familiar with food‑delivery e‑commerce knowledge and terminology, and can construct test data and execute specific tools.
#Background knowledge:
##Environment knowledge
Data environments are divided into offline and pre‑release. The offline environment is also commonly called the daily environment, as noted in tool descriptions. The environment is extremely important and must be strictly distinguished.
##Tool platforms
1. The 蓝海 platform: tool names contain "蓝海"; it only provides data‑generation functions.
2. Other tool platforms: provide diverse data‑generation, query, and operation functions.
#Responsibilities:
Core responsibility: understand the user's intent through communication, find the appropriate tool or tool chain, execute it, and return the result.
#Steps:
1. Analyze the user's input and identify the intent (create, query, or operate on data).
2. Print the intent back to the user.
3. Invoke the corresponding tool based on the intent.
4. Return the execution result.
#Principles:
1. Clarify the user's intent and match it to tool capabilities.
2. If the request is neither data generation nor tool consultation, decline politely.
3. Determine the data environment explicitly.
4. Strictly check which environments each tool supports.
5. If no suitable tool is found, tell the user directly.
6. When parameters are unclear, interact with the user to obtain them.
7. Be rigorous during data generation; never use fabricated data.
#Return requirements:
1. Keep interactions with the user friendly, concise, and unambiguous.
2. State the tool to be used before invoking it.
3. After invoking a tool, thank the tool's author in small text at the end of the message.
4. If multiple tools are needed, return them in order.
5. If the tool library lacks some of the required tools, let the user decide whether to proceed.
6. If the user is joking, respond in kind with a lighthearted reminder.
7. MCP is irrelevant to users; do not display MCP‑related information.
8. Parameters prefixed with _mcp_ are hidden parameters; do not display them.
Tool Governance
Tool descriptions must include basic information, functional details, output specifications, and troubleshooting aids. Tools are categorized into public (stable, strict entry/exit) and private (flexible, project‑ or personal‑specific) pools to prevent explosion of tool candidates during LLM interaction.
Intent Recognition
The intent‑recognition module abstracts user queries into intent types and populates an IntentResult model, which standardizes the downstream processing. For example, a query "Give user 1234 a 50‑off coupon in pre‑release environment" is broken down into who, what, where, conditions, and actions.
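A hedged sketch of how that coupon query might populate an IntentResult. The field names (who, what, where, conditions, actions) follow the breakdown above but are assumptions; the real model's schema is not shown in the source.

```python
from dataclasses import dataclass, field

@dataclass
class IntentResult:
    intent_type: str                       # e.g. one of the eight intent types
    who: str = ""                          # the subject the data is created for
    what: str = ""                         # the entity to create or act on
    where: str = ""                        # target data environment
    conditions: dict = field(default_factory=dict)
    actions: list = field(default_factory=list)

# "Give user 1234 a 50-off coupon in pre-release environment"
intent = IntentResult(
    intent_type="data_creation",
    who="user 1234",
    what="coupon",
    where="pre-release",
    conditions={"discount": 50},
    actions=["issue_coupon"],   # hypothetical downstream action name
)
```

Standardizing on a structured model like this is what lets every downstream agent consume the same fields instead of re-parsing free text.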
Tool Engine Details
Tools are abstracted as ToolEssentialModel with core fields: function type, environment, domain, dependent entity, target entity, and output entity. Filtering combines text similarity (with synonym tables) and embedding similarity to reduce hundreds of tools to a handful of candidates, typically around five, regardless of total tool count.
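A minimal sketch of the ToolEssentialModel and a hybrid filter combining synonym-expanded lexical overlap with an embedding score. The scoring weights, helper names, and the pluggable `embed_score` callback are all assumptions, not the system's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class ToolEssentialModel:
    name: str
    function_type: str          # create / query / operate ...
    environment: str            # offline ("daily") or pre-release
    domain: str
    dependent_entities: list
    output_entities: list

def text_score(query_terms: set, tool: ToolEssentialModel, synonyms: dict) -> float:
    # Expand query terms through the synonym table, then measure overlap
    # with the tool's domain and output entities.
    expanded = set(query_terms)
    for term in query_terms:
        expanded.update(synonyms.get(term, []))
    tool_terms = {tool.domain, *tool.output_entities}
    return len(expanded & tool_terms) / max(len(tool_terms), 1)

def filter_tools(query_terms, env, tools, synonyms, embed_score, top_k=5):
    # Hard-filter on environment first, then rank by a blended score;
    # only the top_k (around five) candidates ever reach the LLM.
    candidates = [t for t in tools if t.environment == env]
    ranked = sorted(
        candidates,
        key=lambda t: 0.5 * text_score(set(query_terms), t, synonyms)
                    + 0.5 * embed_score(query_terms, t),
        reverse=True,
    )
    return ranked[:top_k]
```

Because the filter runs in memory before any model call, adding more tools to the pool grows only the ranking step, not the prompt the LLM has to reason over.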
Reasoning Execution
Execution follows a reverse‑reasoning then forward‑execution workflow:
Traverse the execution plan to locate the final tool that satisfies the desired result.
Identify required input entities for that tool.
If an input is not provided by the user, find a tool that can generate it.
Recursively repeat until all dependencies are resolved.
Produce an ordered tool chain (tool a → tool b → … → tool n) and execute it forward.
The reasoning agent uses the more capable qwen‑max model to handle this deep inference.
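The reverse-reasoning step above can be sketched as a recursive dependency resolution. The data structures here (a dict of tools mapping inputs to outputs) are illustrative assumptions; the real planner operates on ToolEssentialModel instances and an LLM, not a lookup table.

```python
def resolve_chain(tool_name, tools, provided, chain=None):
    """Reverse reasoning: start from the final tool and recursively pull in
    tools that produce each missing input entity, emitting the chain in
    forward execution order (tool a -> tool b -> ... -> tool n).

    tools:    {name: {"inputs": [entities], "outputs": [entities]}}
    provided: entities the user already supplied in the query
    """
    if chain is None:
        chain = []
    for entity in tools[tool_name]["inputs"]:
        if entity in provided:
            continue
        # Find a tool whose outputs cover the missing entity.
        producer = next(n for n, t in tools.items() if entity in t["outputs"])
        resolve_chain(producer, tools, provided, chain)
        provided.add(entity)
    if tool_name not in chain:
        chain.append(tool_name)
    return chain
```

Because dependencies are appended before the tool that needs them, simply executing the returned list front to back performs the forward pass.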
Overall Effect
By filtering the tool set in memory, the system consistently presents only a few stable candidates to the LLM, reducing the candidate count by an order of magnitude and improving latency even as the total number of tools grows.
Recommendations
Use the single‑agent solution when the tool set is focused, the user and tool authors overlap heavily, and language matching is straightforward. Opt for the multi‑agent architecture for large, open platforms with many tools and diverse users, but be aware of the added complexity and debugging difficulty.
Conclusion
The evolution from 智造 1.0 to 智造 2.0 illustrates an iterative process of addressing semantic gaps, tool explosion, and execution reliability. Continuous refinement of intents, tool models, and reasoning strategies is essential for robust AI‑driven data generation.