How AEnvironment Powers Scalable Agentic RL with a Unified MCP Protocol
AEnvironment is an open‑source, unified environment platform for Agentic Reinforcement Learning that abstracts all resources as services via the MCP protocol, enabling trillion‑scale model training, rapid app generation, benchmark integration, and seamless deployment through a high‑performance ASandbox runtime.
Introduction
AEnvironment (AEnv) is an infrastructure platform for large‑scale Agentic Reinforcement Learning (RL). It exposes benchmarks, tools, and agents as standardized environment services through the Model Context Protocol (MCP) and runs them on the high‑performance ASandbox runtime.
Key Features
Ultra‑large‑scale support: Handles trillion‑level model training and massive parallel sampling for long‑context RL.
Agent as Environment: Wraps agents as environments, enabling multi‑machine coordination and hierarchical training.
Rapid application generation: Pre‑bundled toolchains allow fast construction and launch of small‑scale applications.
High‑quality data synthesis: Automates large‑scale environment data and trajectory generation for training.
Built‑in benchmarks: Provides out‑of‑the‑box integration with industry‑standard evaluation suites.
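The "Agent as Environment" feature can be pictured as an adapter that puts an agent's reply function behind the same tool-call interface an environment offers, so a higher-level agent can call it like any other environment. The sketch below is illustrative only: `AgentEnvironment`, `worker_reply`, and `coordinator` are hypothetical names, and the stub reply function stands in for a real model call. It is not AEnv's implementation.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

ReplyFn = Callable[[str], Awaitable[str]]


class AgentEnvironment:
    """Hypothetical adapter: exposes an agent's reply function as a `chat` tool."""

    def __init__(self, name: str, reply_fn: ReplyFn) -> None:
        self.name = name
        self._reply_fn = reply_fn

    async def call_tool(self, tool: str, args: Dict[str, Any]) -> Dict[str, Any]:
        # Only one tool is exposed in this sketch.
        if tool != "chat":
            raise KeyError(f"unknown tool: {tool}")
        reply = await self._reply_fn(args["message"])
        return {"agent": self.name, "reply": reply}


async def worker_reply(message: str) -> str:
    """Stand-in for the model call made by the wrapped agent."""
    return message.upper()


async def coordinator(task: str, worker: AgentEnvironment) -> str:
    """A higher-level agent that treats the worker as just another environment."""
    result = await worker.call_tool("chat", {"message": task})
    return f"{result['agent']} said: {result['reply']}"


worker_env = AgentEnvironment("agent-b", worker_reply)
summary = asyncio.run(coordinator("hello", worker_env))
print(summary)
```

Because the coordinator only depends on `call_tool`, the worker could be swapped for a tool server or a benchmark without changing the calling code, which is the property hierarchical training relies on.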
Architecture
AEnvironment adopts a layered design that separates the Development Side (environment definition and metadata management) from the Traffic Side (runtime execution). The Development Side uses the AEnv CLI to push configurations to EnvHub, which stores metadata in Redis. The Traffic Side creates sandboxed runtime instances via the AEnv SDK, an API service, and selectable sandbox engines such as Kubernetes.
This metadata‑driven approach enables version control, rapid iteration, and a consistent interface across agents, tools, and benchmarks.
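To make the split between the two sides concrete, here is a minimal in-memory sketch. It is not AEnv's code: a dictionary stands in for the Redis-backed EnvHub, and `EnvSpec`, `EnvHub`, `push`, `resolve`, `create_instance`, the `name@version` key format, and the image name are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class EnvSpec:
    """Metadata describing one published environment version (hypothetical)."""
    name: str
    version: str
    image: str
    tools: Tuple[str, ...] = ()


class EnvHub:
    """Toy metadata store; a dict stands in for the Redis backend."""

    def __init__(self) -> None:
        self._specs: Dict[str, EnvSpec] = {}

    def push(self, spec: EnvSpec) -> str:
        # Development side: publish immutable metadata under name@version.
        key = f"{spec.name}@{spec.version}"
        if key in self._specs:
            raise ValueError(f"{key} is already published")
        self._specs[key] = spec
        return key

    def resolve(self, key: str) -> EnvSpec:
        return self._specs[key]


def create_instance(hub: EnvHub, key: str, engine: str = "kubernetes") -> dict:
    """Traffic side: turn stored metadata into a runtime instance description."""
    spec = hub.resolve(key)
    return {"env": key, "image": spec.image, "engine": engine, "tools": list(spec.tools)}


hub = EnvHub()
key = hub.push(EnvSpec("my-env", "1.0.0", "registry.example/my-env:1.0.0", ("search_code",)))
instance = create_instance(hub, key)
print(instance["env"], instance["engine"])
```

Keeping the runtime side dependent only on the stored metadata is what lets a new version be published and launched without changing the calling code.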
Python Example
The following snippet shows how to create an environment, initialise an OpenAI‑style agent, and run a loop that interacts with the environment:
import asyncio

from agents import Agent as OpenAIAgent, Runner  # Runner drives the agent loop
from aenv.core.environment import Environment


async def main():
    # 1. Create and initialise the environment
    env = Environment(
        env_name="[email protected]",
        environment_variables=dict(
            TAU2_DOMAIN="airline",
            TAU2_TASK_ID="1",
        ),
    )
    await env.initialize()

    # 2. Create the agent; call_function is a coroutine, so it must be awaited
    agent = OpenAIAgent(
        name="Tau2 Agent",
        instructions=await env.call_function("tau2_get_system_prompt", {}),
        tools=await env.list_openai_tools(),
    )

    # 3. Interaction loop, capped at 100 steps
    step = 0
    while step < 100:
        step += 1
        status = await env.call_function("tau2_get_status", {})
        if status.get("done", False):
            break
        result = await Runner.run(
            agent,
            input=status.get("last_observation", ""),
        )
        await env.call_function("tau2_send_message", {"message": result.final_output})

    # 4. Retrieve the final reward
    return await env.call_reward({})


reward = asyncio.run(main())

Full runnable code: https://github.com/inclusionAI/AEnvironment/blob/main/aenv/examples/tau2_rl/agent.py
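Running the loop above requires an AEnv deployment and model credentials. To exercise the control flow in isolation, the following sketch swaps in a stub environment and a stub policy. `StubEnv`, `echo_policy`, and `run_episode` are hypothetical stand-ins that mirror only the method names used in the snippet, and the reward rule is invented for the demo.

```python
import asyncio
from typing import Any, Dict, List


class StubEnv:
    """Hypothetical stand-in that ends the episode after a fixed number of messages."""

    def __init__(self, max_turns: int = 3) -> None:
        self.max_turns = max_turns
        self.messages: List[str] = []

    async def call_function(self, name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        if name == "tau2_get_status":
            return {
                "done": len(self.messages) >= self.max_turns,
                "last_observation": f"turn {len(self.messages)}",
            }
        if name == "tau2_send_message":
            self.messages.append(args["message"])
            return {}
        raise KeyError(name)

    async def call_reward(self, args: Dict[str, Any]) -> float:
        # Toy reward: 1.0 if the episode ran to completion, else 0.0.
        return 1.0 if len(self.messages) >= self.max_turns else 0.0


async def echo_policy(observation: str) -> str:
    """Stub policy standing in for the agent call."""
    return f"ack: {observation}"


async def run_episode(env: StubEnv, max_steps: int = 100) -> float:
    # Same shape as the loop above: poll status, stop when done, otherwise reply.
    for _ in range(max_steps):
        status = await env.call_function("tau2_get_status", {})
        if status.get("done", False):
            break
        reply = await echo_policy(status.get("last_observation", ""))
        await env.call_function("tau2_send_message", {"message": reply})
    return await env.call_reward({})


env = StubEnv(max_turns=3)
reward = asyncio.run(run_episode(env))
print(reward, env.messages)
```

The step cap matters as much as the `done` check: with `max_steps` smaller than the number of turns the episode needs, the loop exits early and the stub reports a reward of 0.0.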
Benchmark Integration
AEnvironment ships with built‑in benchmark environments such as TAU2‑Bench, SWE‑Bench, and Terminal‑Bench, which can be accessed without additional configuration:
from aenv import Environment

async with Environment("[email protected]") as env:
    tools = await env.list_tools()
    result = await env.call_tool("tau2_get_task_info", {})

Supported benchmark repositories:
TAU2‑Bench: https://github.com/sierra-research/tau2-bench
SWE‑Bench: https://github.com/SWE-bench/SWE-bench
Terminal‑Bench: https://github.com/laude-institute/terminal-bench
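The `async with` form in the benchmark snippet above suggests that the environment acquires its runtime on entry and releases it on exit. The sketch below shows that lifecycle pattern in plain Python under that assumption. `SketchEnvironment` and its `events` log are hypothetical and say nothing about how the AEnv SDK manages sandboxes internally.

```python
import asyncio
from typing import Any, Callable, Dict, List


class SketchEnvironment:
    """Hypothetical environment illustrating an acquire/release lifecycle."""

    def __init__(self, name: str, tools: Dict[str, Callable[..., Any]]) -> None:
        self.name = name
        self._tools = tools
        self._active = False
        self.events: List[str] = []

    async def __aenter__(self) -> "SketchEnvironment":
        self._active = True
        self.events.append("initialized")  # e.g. a sandbox would start here
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs even if the body raised, so the runtime is always released.
        self._active = False
        self.events.append("released")

    async def list_tools(self) -> List[str]:
        return sorted(self._tools)

    async def call_tool(self, name: str, args: Dict[str, Any]) -> Any:
        if not self._active:
            raise RuntimeError("environment is not active")
        return self._tools[name](**args)


async def demo() -> tuple:
    env = SketchEnvironment("bench", {"get_task_info": lambda: {"task_id": "1"}})
    async with env:
        tools = await env.list_tools()
        info = await env.call_tool("get_task_info", {})
    return tools, info, env.events


tools, info, events = asyncio.run(demo())
print(tools, info, events)
```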
Environment Definition Workflow
Define environment:
aenv init my-env

Register a tool:
from aenv import register_tool

@register_tool
def search_code(query: str, path: str = ".") -> dict:
    """Search for code patterns in files."""
    return {"matches": [...]}

Publish environment:
aenv init my-env && aenv build && aenv push

Invoke environment to generate trajectories:
async with Environment("[email protected]") as env:
    result = await env.call_tool("search_code", {"query": "def main"})

Agent‑as‑Environment
Any agent can be wrapped as an environment with minimal code:
# Agent A calls Agent B as an environment
async with Environment("[email protected]") as agent_b:
    response = await agent_b.call_tool("chat", {"message": "Hello!"})

Future Roadmap
GUI‑based agent environments
Model‑driven simulation environments
Richer front‑end application scaffolding
Additional mainstream benchmarks
More detailed observation metrics
Dedicated environment‑center UI
Higher‑performance sandbox solutions
Broader integration with popular agent frameworks
Repository
GitHub: https://github.com/inclusionAI/AEnvironment
