Inside Ant Group’s Ragent: Building Scalable AI Agents on Ray
This article introduces Ant Group’s Ragent, a Ray‑based distributed AI Agent framework, detailing its background, motivations, and design, and explains the four essential modules—Profile, Memory, Planning, and Action—that enable scalable large‑language‑model agents for real‑world applications.
Background
Ray is an open‑source distributed computing framework that originated at UC Berkeley's RISELab; OpenAI has used it for large‑model training. Ant Group joined the Ray project early, contributed over 26% of its core code (the second‑largest contributor worldwide), and now runs more than 1.5 million CPU cores in production while maintaining the Ray community in China.
Motivation
Since 2017, Ant's Ray team has built several engines on top of Ray, including the GeaFlow streaming engine (2018), the Realtime and Mobius open‑source engines, and the Mars scientific‑computing engine. The team also pioneered a multi‑tenant architecture that the Ray community only began to consider recently.
In 2023 and 2024, as large models took off, Ant delivered Unified AI Serving, a framework that unifies offline, online, inference, and deployment workloads across its 1.5‑million‑core production footprint.
Design & Implementation of Ragent
Ragent is a Ray‑based AI Agent framework composed of four core modules:
Profile: defines the agent's persona and role, e.g., a gentle travel assistant.
Memory: includes Knowledge (domain and prior knowledge) and Experience (past dialogues, user queries, reasoning steps, and action results) to help the agent improve over time.
Planning: breaks complex tasks into manageable subtasks using algorithms such as Chain‑of‑Thought or Tree‑of‑Thought.
Action: executes real‑world tasks based on experience and plans, featuring function calling and, in some scenarios, interaction with physical devices.
These components constitute the essential building blocks of a large‑language‑model‑based agent.
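To make the division of labor between the four modules concrete, here is a minimal plain‑Python sketch. It is not Ragent's API: the source does not show any Ragent code, so every class, function, and tool name below (`Profile`, `Memory`, `plan`, `Agent`, `book_flight`, `book_hotel`) is hypothetical, and the planner is a trivial string splitter standing in for an LLM‑driven method such as Chain‑of‑Thought. It also runs in a single process and does not use Ray.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Profile:
    """Persona and role of the agent (hypothetical structure)."""
    name: str
    role: str
    tone: str


@dataclass
class Memory:
    """Knowledge (static facts) plus Experience (records of past steps)."""
    knowledge: Dict[str, str] = field(default_factory=dict)
    experience: List[str] = field(default_factory=list)

    def remember(self, event: str) -> None:
        self.experience.append(event)


def plan(task: str) -> List[str]:
    """Stand-in planner: split a task on ' and ' into subtasks.

    A real planner would prompt an LLM; this only illustrates
    the decomposition step.
    """
    return [part.strip() for part in task.split(" and ") if part.strip()]


class Agent:
    def __init__(self, profile: Profile, memory: Memory,
                 tools: Dict[str, Callable[[str], str]]) -> None:
        self.profile = profile
        self.memory = memory
        self.tools = tools

    def act(self, subtask: str) -> str:
        """Action: call the first tool whose name appears in the subtask."""
        for name, tool in self.tools.items():
            if name in subtask:
                result = tool(subtask)
                break
        else:
            result = f"no tool for: {subtask}"
        # Record the outcome so later steps can draw on it (Experience).
        self.memory.remember(f"{subtask} -> {result}")
        return result

    def run(self, task: str) -> List[str]:
        return [self.act(subtask) for subtask in plan(task)]


# Hypothetical tools; real ones would call external services.
def book_flight(request: str) -> str:
    return "flight booked"


def book_hotel(request: str) -> str:
    return "hotel booked"


agent = Agent(
    profile=Profile(name="Trip", role="travel assistant", tone="gentle"),
    memory=Memory(),
    tools={"flight": book_flight, "hotel": book_hotel},
)
results = agent.run("book a flight to Beijing and reserve a hotel near the venue")
print(results)
print(agent.memory.experience)
```

Keeping the tools in a dictionary of plain callables mirrors the function‑calling idea: the agent decides which function to invoke, and each result is written back to Memory as Experience. In a distributed setting each piece could presumably live in its own Ray task or actor, but that mapping is an assumption, not something the source describes.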
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
