Unlocking Multi‑Agent AI: Architecture and Context‑Engineering Lessons from Alibaba Cloud’s Aivis
The article presents Alibaba Cloud’s Aivis digital‑employee architecture, explains how context engineering and multi‑agent design improve enterprise AI agents, and shares ten practical optimization tips drawn from real‑world deployments and a recent Agentic AI Summit session.
Overview
Alibaba Cloud’s digital employee Aivis demonstrates a shift from traditional rule‑based chatbots to an end‑to‑end, multi‑agent system. The platform combines large‑language‑model (LLM) reasoning with Model Context Protocol (MCP) tools, browser automation, and compute calls, enabling problem‑solving that more closely resembles human reasoning.
Agentic AI Summit Presentation
At the Agentic AI Summit in Beijing (16‑17 January 2026), Jiang Jian presented “Aivis Architecture Practice: Context Engineering and Multi‑Agent Autonomous Services”, describing the concrete implementation steps for building enterprise‑grade agents.
Root Causes of Unsatisfactory Agent Output
Two primary factors lead to poor performance:
Unclear expectations – goals such as “more intelligent” or “correct answers” are vague and cannot be measured.
Insufficient technical implementation – missing or sub‑optimal context handling, tool integration, and orchestration.
1. Define Measurable Expectations
Translate high‑level objectives into concrete, quantifiable criteria (e.g., response latency < 2 s, accuracy ≥ 90 %). This allows developers to pinpoint whether failures stem from prompt design, model misunderstanding, or system bugs.
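Such criteria can be checked automatically. The sketch below shows one way to score an agent against a latency budget and an accuracy target; `run_agent` and the test case are hypothetical stand‑ins, not the Aivis implementation.

```python
import time

LATENCY_BUDGET_S = 2.0   # target: response latency < 2 s
ACCURACY_TARGET = 0.90   # target: accuracy >= 90 %

def run_agent(query: str) -> str:
    # Placeholder agent; a real deployment would call the LLM service here.
    return {"order status for user X": "shipped"}.get(query, "unknown")

def evaluate(cases):
    """Return (accuracy, worst-case latency) over a list of (query, expected)."""
    correct, latencies = 0, []
    for query, expected in cases:
        start = time.perf_counter()
        answer = run_agent(query)
        latencies.append(time.perf_counter() - start)
        correct += int(answer == expected)
    return correct / len(cases), max(latencies)

accuracy, worst_latency = evaluate([("order status for user X", "shipped")])
```

Failing the accuracy check points to prompt design or model misunderstanding; failing the latency check points to system or orchestration issues.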
2. Optimize Context and Architecture
Improving agents can follow two routes: (a) prompt/context engineering and (b) model fine‑tuning (SFT, DPO, RLHF). Because fine‑tuning large models is expensive and requires expertise, the focus is on maximizing performance through context engineering and multi‑agent orchestration without retraining the base model.
Prompt Engineering vs. Context Engineering
In production, a prompt is not a static instruction; it is a dynamic composition of system instructions, dialogue history, and external knowledge assembled at each turn. Context engineering expands this idea to include:
System‑level instruction assembly.
Efficient management of short‑term dialogue history.
Long‑term memory retrieval and compression.
Tool‑protocol definition and invocation.
These practices treat the LLM as a context‑driven learner rather than a simple prompt responder.
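The four practices above can be sketched as a per‑turn context assembler. All names here are illustrative assumptions, not the Aivis API: the point is that system instructions, a bounded dialogue window, retrieved memory, and tool definitions are composed dynamically rather than hard‑coded into one prompt.

```python
def assemble_context(system_prompt, history, memory_hits, tool_schemas,
                     max_history_turns=6):
    """Compose the context fed to the LLM for the current turn."""
    recent = history[-max_history_turns:]   # short-term dialogue window
    memory = "\n".join(memory_hits)          # compressed long-term recall
    tools = "\n".join(tool_schemas)          # tool-protocol definitions
    turns = "\n".join(f"{role}: {text}" for role, text in recent)
    return (f"{system_prompt}\n\n[Memory]\n{memory}\n\n"
            f"[Tools]\n{tools}\n\n[Dialogue]\n{turns}")

ctx = assemble_context(
    "You are the data-retrieval agent.",
    [("user", "Where is invoice #1234?")],
    ["User prefers concise answers."],
    ['{"action":"search","query":"<string>"}'],
)
```

Each component can then be tuned independently: widen or narrow the history window, swap the memory retriever, or change the tool schema without touching the rest.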
Ten Practical Optimization Practices
Define Task Goals – Convert vague expectations into explicit commands (e.g., “retrieve the latest order status for user X”).
Organize Context Precisely – Feed only the information required for the current step; discard irrelevant data to keep the context “on‑demand”.
Specify Identity and Execution State – Clearly indicate the role of each agent, completed steps, and the current phase (e.g., “You are the data‑retrieval agent, step 2 of 3”).
Use Structured Expressions – Prefer JSON, YAML, or other formal schemas for complex parameters instead of free‑form natural language.
Customize Tool Protocols – Design domain‑specific tool interfaces (e.g., {"action":"search","query":"invoice #1234"}) and enforce stable command contracts.
Apply Example Learning Wisely – Use few‑shot examples only when they add clear value; provide diverse demonstrations for tasks with high variability.
Keep Context Concise – Compress or summarize prior turns without losing essential information to stay within token limits.
Strengthen Memory Management – Reinforce key facts across turns, compress historical logs, and optionally store long‑term state in external databases.
Design Multi‑Agent Collaboration – Employ a workflow engine that routes sub‑tasks to specialized agents, allowing the LLM to make autonomous decisions while preserving overall controllability.
Maintain Human‑in‑the‑Loop – Continuously collect business feedback and incorporate human corrections to keep the agent aligned with real‑world requirements.
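Practices 5 and 9 can be illustrated together: a stable, structured tool contract that a dispatcher validates before routing to a specialized handler. The schema and handlers below are hypothetical, chosen to match the `{"action":"search","query":...}` example above, not the actual Aivis protocol.

```python
import json

def handle_search(query: str) -> str:
    # Stand-in for a real domain tool (e.g., an invoice lookup service).
    return f"results for {query}"

# One specialized handler per action keeps the command contract stable.
HANDLERS = {"search": handle_search}

def dispatch(raw_call: str) -> str:
    """Validate a structured tool call and route it to its handler."""
    call = json.loads(raw_call)              # enforce structured expression
    action = call.get("action")
    if action not in HANDLERS:
        raise ValueError(f"unknown action: {action}")
    return HANDLERS[action](call["query"])

print(dispatch('{"action":"search","query":"invoice #1234"}'))
# → results for invoice #1234
```

Rejecting unknown actions at the dispatch layer is what preserves controllability while still letting the LLM decide autonomously which tool call to emit.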
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
