How WaterFlow Uses AI Agents to Automate Taobao’s Recommendation Development
The article describes WaterFlow, an AI‑driven end‑to‑end development platform at Taobao that turns natural‑language requirements into PRDs, multi‑platform code, tests and releases, cutting iteration time from a week to two days and shipping over 30 features with more than 54,000 lines of generated code.
Introduction
WaterFlow is an AI‑driven, end‑to‑end development practice created for Taobao’s recommendation flow. It addresses frequent requirement changes, a complex multi‑platform tech stack, and low collaboration efficiency by using large‑model agents such as a Central Agent and a Code Agent.
Problems Faced
High iteration demand : each requirement used to take about a week to complete.
Multiple tech stacks : a single requirement needs coordinated changes across five platforms (iOS, Android, HarmonyOS, Weex, and DX).
Poor collaboration : frequent product‑manager turnover and a large knowledge base caused repeated clarification cycles.
Why Existing AI Tools Were Insufficient
Current AI‑coding tools (autonomous agents, AI editors, rapid‑prototype platforms) either focus on single‑platform tasks or only support prototype validation, so they cannot handle complex, multi‑platform, full‑pipeline delivery.
WaterFlow Solution
The system builds a full pipeline:
Central Agent converts natural‑language requirements into a PRD and a set of development tasks.
Code Agent (Codex) executes those tasks in a cloud sandbox, generating code for frontend, backend, client, and DX.
Generated artifacts (PRD, technical design, test reports, data reports) are stored for future reference.
The workflow reduces the requirement‑to‑deployment cycle from one week to two days, with more than 30 requirements shipped and over 54,000 lines of code produced.
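The hand-off between the two agents can be sketched as below. This is only an illustration under stated assumptions: `DevTask`, `Plan`, `plan_requirement`, `run_task`, and `run_pipeline` are hypothetical names, and both agents are replaced by deterministic stand-ins where a real system would call a model and a sandbox.

```python
from dataclasses import dataclass, field

# The four stacks named in the article; the identifiers are illustrative.
STACKS = ("frontend", "backend", "client", "dx")


@dataclass
class DevTask:
    stack: str
    description: str


@dataclass
class Plan:
    prd: str
    tasks: list[DevTask] = field(default_factory=list)


def plan_requirement(requirement: str, stacks: tuple[str, ...] = STACKS) -> Plan:
    """Stand-in for the Central Agent (assumption: a real one would call an LLM)."""
    text = requirement.strip()
    tasks = [DevTask(stack=s, description=f"Implement '{text}' for {s}") for s in stacks]
    return Plan(prd=f"PRD: {text}", tasks=tasks)


def run_task(task: DevTask) -> dict:
    """Stand-in for the Code Agent: returns an artifact record, not real code."""
    return {"stack": task.stack, "status": "generated", "summary": task.description}


def run_pipeline(requirement: str) -> dict:
    """Requirement -> PRD and tasks -> one result per task, kept as stored artifacts."""
    plan = plan_requirement(requirement)
    return {"prd": plan.prd, "results": [run_task(t) for t in plan.tasks]}
```

Calling `run_pipeline("Add a badge to recommendation cards")` returns one PRD string and one result record per stack, which mirrors the stored artifacts described above.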
Architecture
WaterFlow relies on a cloud‑based coding sandbox (Codex) that runs in isolated Docker containers. It integrates a LangGraph‑based agent stack, a specialized code LLM, and tool adapters for Git, MCP, and other services.
Context Layers
Three context layers guide the agents:
System context : immutable rules such as Git operations and output formats.
User context : customizable preferences (coding style, user profile).
Code context : repository‑specific markdown files describing directory structure, workflow, and tech stack.
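One way to picture the three layers is as separate objects merged into a single prompt in a fixed order. The sketch below is an assumption about how such layering could look, not WaterFlow's actual implementation; all class names, field names, and section headings are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SystemContext:
    """Immutable rules; frozen so callers cannot reassign them."""
    rules: tuple[str, ...]


@dataclass
class UserContext:
    """Customizable preferences such as coding style."""
    preferences: dict[str, str] = field(default_factory=dict)


@dataclass
class CodeContext:
    """Repository-specific markdown documents, keyed by file name."""
    repo: str
    docs: dict[str, str] = field(default_factory=dict)


def build_prompt(system: SystemContext, user: UserContext,
                 code: CodeContext, task: str) -> str:
    """Concatenate the layers in a fixed order: system, user, code, then the task."""
    sections = [
        "## System rules\n" + "\n".join(f"- {r}" for r in system.rules),
        "## User preferences\n"
        + "\n".join(f"- {k}: {v}" for k, v in sorted(user.preferences.items())),
        f"## Repository: {code.repo}\n"
        + "\n\n".join(code.docs[name] for name in sorted(code.docs)),
        "## Task\n" + task,
    ]
    return "\n\n".join(sections)
```

Keeping the system layer frozen reflects the article's description of it as immutable, while the user and code layers stay editable so they can be refined over time.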
Workflow per Tech Stack
For each stack (frontend, backend, client, DX) the process includes:
Create a Codex container.
Pull the main branch.
Create a new feature branch.
Generate and apply code changes.
Push the branch, trigger deployment, preview, and code review.
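The Git portion of these steps can be sketched as a command plan that a sandbox would run in order. This is a minimal sketch under assumptions: the `feature/<stack>-<slug>` branch naming and the function names are hypothetical, container creation and deployment triggers are left out, and the commands are only listed rather than executed.

```python
import re


def branch_name(stack: str, title: str) -> str:
    """Build a branch name from the stack and a slug of the task title (illustrative scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"feature/{stack}-{slug}"


def git_plan(stack: str, title: str) -> list[list[str]]:
    """Return the Git commands for one stack, in execution order."""
    branch = branch_name(stack, title)
    return [
        ["git", "checkout", "main"],
        ["git", "pull", "origin", "main"],          # pull the main branch
        ["git", "checkout", "-b", branch],          # create the feature branch
        # The Code Agent would generate and apply its changes at this point.
        ["git", "add", "--all"],
        ["git", "commit", "-m", f"{stack}: {title}"],
        ["git", "push", "--set-upstream", "origin", branch],
    ]
```

Each inner list could be passed to `subprocess.run(cmd, check=True)` inside the container; returning the plan as data keeps it easy to log and review before anything is executed.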
Results
Collaboration efficiency : Central Agent generated over 30 PRDs and task sets (≈30% of all requirements) with an average handling time of 10 minutes, collapsing what used to be several rounds of clarification (“N handshakes”) into a single handshake.
Development efficiency : Code Agent completed about 90% of tasks automatically; several features were 100% AI‑generated, saving environment setup and manual coding time.
Code output : More than 54,000 lines of code in Java, JavaScript, XML, and other languages were generated across multiple projects.
Context building : Detailed documentation of tech stacks, directory structures, and DX templates was added to the code context, continuously improving agent performance.
Future Directions
Establish robust evaluation and scoring mechanisms to continuously fine‑tune prompts, contexts, and generated code.
Enable persistent learning so agents can remember past requirements and adapt to user preferences.
Conclusion
WaterFlow demonstrates that AI can serve as a high‑level programming language for product teams, automating the full lifecycle from requirement to release and delivering measurable efficiency gains.
