Inside Ant Group’s Ragent: Building Scalable AI Agents on Ray
This article introduces Ragent, Ant Group’s latest distributed agent framework, which builds on Ray to create scalable, multi-tenant AI agents. It covers the framework’s background, motivation, and design and implementation, and outlines the four core modules (Profile, Memory, Planning, and Action) that power large-language-model agents.
Background
Ray is the underlying distributed framework used by OpenAI for large-model training. Ant Group joined the Ray project early and has contributed over 26% of its core code, making it the world’s second-largest contributor. It now operates more than 1.5 million CPU cores and maintains the Ray community in China.
Ant formed its Ray team in 2017 and in 2018 released Geaflow, a flow-graph engine and its first engine built for a business scenario. Between 2018 and 2022, it built several more compute engines on Ray, including Realtime, the open-source Mobius engine, and Mars, an engine for inference and scientific computing. It also pioneered a multi-tenant architecture that the Ray community is now considering.
In the large-model era of 2023 and 2024, Ant delivered Unified AI Serving, which integrates offline, online, AI inference, and AI deployment capabilities for its massive workloads. Its newest effort is Ragent, an AI agent framework built on Ray.
Motivation
LLM‑based agents require four core modules:
Profile: defines the agent’s persona, e.g., a gentle travel assistant handling travel management and data analysis.
Memory: consists of Knowledge (domain and prior knowledge) and Experience (recorded dialogues, user queries, reasoning steps, and action outcomes) to improve future behavior.
Planning: breaks complex tasks into manageable subtasks using algorithms such as Chain-of-Thought or Tree-of-Thought, akin to flowcharts in programming.
Action: executes tasks based on experience and plans. A key feature is Function Calling, which lets the model invoke external functions or interact with physical devices.
These four modules constitute the essential components of a large‑language‑model‑based agent.
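To make the division of labor concrete, the following is a minimal Python sketch of how the four modules might fit together. It is not Ragent’s API: every class, field, and function name here is hypothetical, the planner is a trivial string splitter standing in for an LLM-driven method such as Chain-of-Thought, and "function calling" is reduced to a lookup in a tool registry.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Profile:
    """Persona of the agent (hypothetical fields, not Ragent's schema)."""
    name: str
    persona: str


@dataclass
class Memory:
    """Knowledge is given up front; experience grows as the agent acts."""
    knowledge: Dict[str, str] = field(default_factory=dict)
    experience: List[str] = field(default_factory=list)

    def record(self, event: str) -> None:
        self.experience.append(event)


def plan(task: str) -> List[str]:
    """Stand-in planner: split a task into subtasks on the word ' then '.

    A real agent would ask an LLM to decompose the task instead.
    """
    return [step.strip() for step in task.split(" then ") if step.strip()]


class Action:
    """Registry of callable tools, a toy version of function calling."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        self._tools[name] = fn

    def call(self, name: str, argument: str) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](argument)


class Agent:
    """Wires the four modules together (assumed structure)."""

    def __init__(self, profile: Profile, memory: Memory, action: Action) -> None:
        self.profile = profile
        self.memory = memory
        self.action = action

    def run(self, task: str) -> List[str]:
        results = []
        for step in plan(task):
            # Assumption: the first word of each subtask names the tool to call.
            tool, _, argument = step.partition(" ")
            result = self.action.call(tool, argument)
            # Store the outcome as experience for later use.
            self.memory.record(f"{step} -> {result}")
            results.append(result)
        return results


# Example wiring with two made-up tools.
action = Action()
action.register("search", lambda query: f"results for {query}")
action.register("book", lambda item: f"booked {item}")
agent = Agent(Profile("Trip", "gentle travel assistant"), Memory(), action)
outputs = agent.run("search flights to Tokyo then book hotel")
```

Keeping each module behind its own small interface is one possible design choice: it would let the planner or the tool registry be swapped out, or placed in separate processes, without touching the rest of the agent.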
Design & Implementation
Ragent implements the above modules on top of Ray, leveraging its scalable distributed execution to support massive workloads and multi‑tenant isolation.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
