From Chain‑of‑Thought to Self‑Evolving Agents: Lessons from Alibaba’s Intelligent Ops
This article traces the evolution of Alibaba’s intelligent agents from the initial chain‑of‑thought design through instantiation, structuring, self‑evolution, and middleware integration, highlighting practical challenges, architectural refinements, and open‑source tools for large‑scale AI operations.
Agent 1.0: Engineering Chain‑of‑Thought
The first version treats the workflow as three core steps—prompt input, LLM response parsing, and tool‑output stitching—forming a simple chain‑of‑thought that lets the model invoke tools without manually specifying each tool’s I/O schema.
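The three-step loop can be sketched in a few lines. This is a minimal illustration, not Alibaba's actual implementation: the `llm` stand-in, the `ping_host` tool, and the JSON call format are all assumptions.

```python
import json

# Hypothetical tool registry; the tool name and output are illustrative only.
TOOLS = {
    "ping_host": lambda host: f"64 bytes from {host}: icmp_seq=1 ttl=64",
}

def llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned tool invocation."""
    return json.dumps({"tool": "ping_host", "args": {"host": "10.0.0.1"}})

def run_step(task: str) -> str:
    # 1. Prompt input: describe the task and the available tools.
    prompt = f"Task: {task}\nTools: {list(TOOLS)}\nRespond with JSON."
    # 2. Parse the LLM response into a concrete tool call.
    call = json.loads(llm(prompt))
    # 3. Stitch the tool output back in as the observation for the next step.
    result = TOOLS[call["tool"]](**call["args"])
    return f"Observation: {result}"

print(run_step("check connectivity to 10.0.0.1"))
```

The appeal is that the model never needs each tool's full I/O schema spelled out; the downside, as noted below, is that nothing guarantees the parsed parameters are precise.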
While powerful, this approach suffers from imprecise tool parameters and occasional low‑level errors, prompting a redesign of tool handling.
Agent 2.0: Instantiating Agents
Beyond tool instantiation, the agents themselves can be instantiated, allowing each instance to focus on a narrow domain (e.g., a specific network switch). Instantiation lifts parameter handling out of the reasoning loop, enabling a CRUD‑style URI for each agent and a /chat endpoint for interaction.
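A rough sketch of what instantiation buys: parameters are bound when the instance is created, so the reasoning loop never has to fill them in. The `SwitchAgent` class, URI scheme, and registry below are assumptions made for illustration.

```python
# Illustrative sketch: instantiated agents addressed by CRUD-style URIs.
# The class name and URI layout are assumptions, not Alibaba's API.

class SwitchAgent:
    """One instance per network switch: parameters are bound at creation,
    so the reasoning loop never has to guess or fill them in."""
    def __init__(self, switch_id: str):
        self.switch_id = switch_id

    def chat(self, message: str) -> str:
        # switch_id comes from the instance, not from the prompt.
        return f"[{self.switch_id}] handling: {message}"

REGISTRY: dict = {}

def create(switch_id: str) -> str:          # create: POST /agents
    uri = f"/agents/switch/{switch_id}"
    REGISTRY[uri] = SwitchAgent(switch_id)
    return uri

def chat(uri: str, message: str) -> str:    # interact: POST {uri}/chat
    return REGISTRY[uri].chat(message)

uri = create("sw-42")
print(chat(uri, "why is port 3 flapping?"))
```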
Two practical issues emerge: unreliable reasoning (missing steps or hallucinations) and the heavy engineering effort required to adapt tools for LLM consumption.
Agent 3.0: Structured Agents
Inspired by frameworks such as AutoGen, CrewAI, OpenAI Swarm, and LangGraph, the team explored hierarchical, role‑based topologies. The “PEER” pattern (Planning, Executing, Expressing, Reviewing) demonstrates how multiple specialized agents can decompose complex tasks and iteratively improve results.
Embedding both fixed tool chains and free-form reasoning within a single agent makes workflow optimization equivalent to agent optimization.
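The PEER loop can be sketched as four role functions with a review gate. The role behaviors below are placeholders, not agentUniverse's actual API; only the Planning → Executing → Expressing → Reviewing topology comes from the article.

```python
# Minimal sketch of the PEER topology (Planning, Executing, Expressing,
# Reviewing). Each role would be its own specialized agent in practice.

def plan(task):       return [f"step: analyze '{task}'", "step: summarize"]
def execute(steps):   return [f"done {s}" for s in steps]
def express(results): return " | ".join(results)
def review(answer):
    # Placeholder acceptance criterion; a real reviewer agent would judge quality.
    return ("ok", answer) if "done" in answer else ("retry", answer)

def peer(task, max_rounds=3):
    for _ in range(max_rounds):
        answer = express(execute(plan(task)))
        verdict, answer = review(answer)
        if verdict == "ok":       # reviewer accepts; otherwise iterate
            return answer
    return answer

print(peer("diagnose packet loss"))
```

The reviewing stage is what enables iterative improvement: a rejected answer re-enters the pipeline rather than being returned as-is.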
Agent 4.0: Self‑Evolving Agents
Drawing on historical ideas of program self‑evolution, the authors argue that mutation granularity matters. Large‑scale models can now mutate at the level of prompts, parameters, or entire workflows, enabling continuous capability growth.
| Programming object | Random mutation object | Mutation granularity | New-function evolution |
| --- | --- | --- | --- |
| Computer virus | assembly instructions | moderate | none |
| Ordinary software | source-code strings | too small | none |
| Large model | parameters | too small | few |
| Chain-of-thought agent | prompt strings | tiny | few |
| Structured agent | agents & workflows | moderate | many |
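The point about granularity can be made concrete with a toy mutate-and-select loop operating on whole agents within a workflow, rather than on characters of a prompt. The agent pool and the scoring function below are invented for illustration.

```python
import random

# Hedged sketch of self-evolution at workflow granularity: mutate which
# agents appear in a workflow and keep the variant that scores best.

AGENT_POOL = ["planner", "executor", "reviewer", "expresser"]

def score(workflow):
    # Placeholder fitness: reward workflows that both execute and review.
    return int("executor" in workflow) + int("reviewer" in workflow)

def mutate(workflow):
    # The mutation unit is a whole agent, not a prompt character or a
    # model parameter -- this is the "moderate" granularity in the table.
    child = workflow.copy()
    child[random.randrange(len(child))] = random.choice(AGENT_POOL)
    return child

def evolve(workflow, generations=20):
    best = workflow
    for _ in range(generations):
        child = mutate(best)
        if score(child) >= score(best):
            best = child
    return best

random.seed(0)
print(evolve(["planner", "planner", "planner"]))
```

Mutations at too fine a granularity (single characters, single parameters) almost never yield a working variant; mutating whole agents keeps every candidate structurally valid.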
Distinguishing between reasoning (language‑level decomposition) and computation (mathematical execution) guides the self‑evolution pipeline.
Agent 5.0: Open‑Source Middleware
The final layer separates the large‑model business platform (ABM‑Mind) from a middleware layer (runnable‑hub) that assembles workers such as prerun, postrun, and chain operations. This design mitigates frequent platform updates and simplifies asynchronous calls.
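A sketch of the assembly idea: workers are composed into a runnable and executed asynchronously, so the business platform above never blocks on a long-running stage. The worker names `prerun`, `postrun`, and `chain` come from the article; their behavior here is an assumption.

```python
import asyncio

# Sketch of the middleware pattern: small workers assembled into one
# asynchronous runnable. Worker bodies are placeholders for illustration.

async def prerun(ctx):
    ctx["validated"] = True          # e.g. input validation before the task
    return ctx

async def postrun(ctx):
    ctx["logged"] = True             # e.g. audit logging after the task
    return ctx

def chain(*workers):
    async def run(ctx):
        for w in workers:
            ctx = await w(ctx)       # each stage may await I/O without blocking
        return ctx
    return run

async def diagnose(ctx):
    ctx["result"] = f"checked {ctx['target']}"
    return ctx

runnable = chain(prerun, diagnose, postrun)
print(asyncio.run(runnable({"target": "sw-42"})))
```

Because the chain is assembled in the middleware rather than hard-coded in the platform, swapping a worker does not require a platform release.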
Compared with Anthropic’s Model Context Protocol (MCP), runnable‑hub offers event‑driven, unbounded reasoning, whereas MCP is a synchronous request‑response protocol with limited depth.
- MCP standardizes communication but treats agents and tools as peers.
- MCP is blocking, making it unsuitable for long-running reasoning.
- Runnable-hub is event-driven, enabling deep, time-unconstrained inference.
Both aim to simplify model‑tool coordination, but with different architectural trade‑offs.
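The trade-off can be illustrated generically, without either protocol's real API: a blocking request-response call pins the caller until the tool returns, while an event queue lets reasoning continue and collects the result later.

```python
import queue
import threading
import time

# Illustrative contrast only -- neither MCP's nor runnable-hub's real API.

def blocking_call(tool, arg):
    return tool(arg)                 # caller is stuck until the tool returns

def event_driven_call(tool, arg, events):
    def worker():
        events.put(("tool_done", tool(arg)))
    threading.Thread(target=worker, daemon=True).start()
    events.put(("reasoning_continues", None))   # caller is free immediately

def slow_tool(arg):
    time.sleep(0.1)                  # stands in for a long-running operation
    return f"report for {arg}"

events = queue.Queue()
event_driven_call(slow_tool, "sw-42", events)
print(events.get())                  # the reasoning event arrives first
print(events.get())                  # the tool result arrives later
```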
All referenced components—including agentUniverse, ReAct, AutoGen, and LangChain Runnable—are open‑source, and the middleware has been released as part of Alibaba’s SREWorks platform.