How AEnvironment Powers Scalable Agentic RL with a Unified MCP Protocol

AEnvironment is an open‑source, unified environment platform for Agentic Reinforcement Learning that abstracts all resources as services via the MCP protocol, enabling trillion‑scale model training, rapid app generation, benchmark integration, and seamless deployment through a high‑performance ASandbox runtime.

AntTech

Introduction

AEnvironment (AEnv) is an infrastructure for large‑scale Agentic Reinforcement Learning (RL). It standardizes benchmarks, tools, and agents as environment services using the MCP protocol and a high‑performance ASandbox runtime.

Key Features

Ultra‑large‑scale support : Handles trillion‑level model training and massive parallel sampling for long‑context RL.

Agent as Environment : Wraps agents as environments, enabling multi‑machine coordination and hierarchical training.

Rapid application generation : Pre‑bundled toolchains allow fast construction and launch of small‑scale applications.

High‑quality data synthesis : Automates large‑scale environment data and trajectory generation for training.

Built‑in benchmarks : Provides out‑of‑the‑box integration with industry‑standard evaluation suites.
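To make the data-synthesis feature concrete, here is a minimal sketch of what a sampled trajectory record might look like. The field names (`observation`, `action`, `final_reward`) are illustrative, not AEnvironment's actual schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    observation: str   # what the environment reported to the agent
    action: str        # what the agent did, e.g. a tool call
    reward: float = 0.0  # per-step reward, often sparse in agentic RL

@dataclass
class Trajectory:
    env_name: str                          # e.g. "[email protected]"
    steps: List[Step] = field(default_factory=list)
    final_reward: float = 0.0              # terminal reward from the environment

    def total_reward(self) -> float:
        # Sum per-step rewards plus the terminal reward.
        return sum(s.reward for s in self.steps) + self.final_reward

traj = Trajectory(env_name="[email protected]")
traj.steps.append(Step(observation="task started", action="tau2_send_message"))
traj.final_reward = 1.0
print(traj.total_reward())  # 1.0
```

Large batches of such records, generated in parallel sandboxes, are what feed the RL training loop.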

Architecture

AEnvironment adopts a layered design that separates the Development Side (environment definition and metadata management) from the Traffic Side (runtime execution). The Development Side uses the AEnv CLI to push configurations to EnvHub, which stores metadata in Redis. The Traffic Side creates sandboxed runtime instances via the AEnv SDK, an API service, and selectable sandbox engines such as Kubernetes.

This metadata‑driven approach enables version control, rapid iteration, and a consistent interface across agents, tools, and benchmarks.

Python Example

The following snippet shows how to create an environment, initialise an OpenAI‑style agent, and run a loop that interacts with the environment:

from agents import Agent as OpenAIAgent, Runner
from aenv.core.environment import Environment

# 1. Create and initialise environment
env = Environment(
    env_name="[email protected]",
    environment_variables=dict(
        TAU2_DOMAIN="airline",
        TAU2_TASK_ID="1",
    ),
)
await env.initialize()

# 2. Create Agent
agent = OpenAIAgent(
    name="Tau2 Agent",
    instructions=await env.call_function("tau2_get_system_prompt", {}),
    tools=await env.list_openai_tools(),
)

# 3. Interaction loop
step = 0
while step < 100:
    step += 1
    status = await env.call_function("tau2_get_status", {})
    if status.get("done", False):
        break
    result = await Runner.run(
        agent,
        input=status.get("last_observation", "")
    )
    await env.call_function("tau2_send_message", {"message": result.final_output})

# 4. Retrieve final reward
reward = await env.call_reward({})

Full runnable code: https://github.com/inclusionAI/AEnvironment/blob/main/aenv/examples/tau2_rl/agent.py

Benchmark Integration

AEnvironment ships with built‑in benchmark environments such as TAU2‑Bench, SWE‑Bench, and Terminal‑Bench, which can be accessed without additional configuration:

from aenv import Environment
async with Environment("[email protected]") as env:
    tools = await env.list_tools()
    result = await env.call_tool("tau2_get_task_info", {})

Supported benchmark repositories:

TAU2‑Bench: https://github.com/sierra-research/tau2-bench

SWE‑Bench: https://github.com/SWE-bench/SWE-bench

Terminal‑Bench: https://github.com/laude-institute/terminal-bench

Environment Definition Workflow

Define environment : aenv init my-env

Register a tool :

from aenv import register_tool

@register_tool
def search_code(query: str, path: str = ".") -> dict:
    """Search for code patterns in files."""
    return {"matches": [...]}

Publish environment : aenv build && aenv push

Invoke environment to generate trajectories :

async with Environment("[email protected]") as env:
    result = await env.call_tool("search_code", {"query": "def main"})

Agent‑as‑Environment

Any agent can be wrapped as an environment with minimal code:

# Agent A calls Agent B as an environment
async with Environment("[email protected]") as agent_b:
    response = await agent_b.call_tool("chat", {"message": "Hello!"})
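Conceptually, wrapping an agent as an environment means putting the agent's reply function behind the same call_tool interface an ordinary environment exposes. The sketch below shows that shape with a trivial echo agent; the class and names are illustrative, not AEnv API:

```python
import asyncio
from typing import Any, Callable, Dict

class AgentAsEnvironment:
    """Illustrative wrapper: exposes an agent's reply function through the
    same call_tool interface that ordinary environments provide, so a
    parent agent can drive it like any other environment."""

    def __init__(self, reply_fn: Callable[[str], str]):
        self._reply_fn = reply_fn

    async def call_tool(self, name: str, args: Dict[str, Any]) -> str:
        if name != "chat":
            raise ValueError(f"unknown tool: {name}")
        return self._reply_fn(args["message"])

async def main() -> str:
    # Wrap a trivial "agent" and call it the way Agent A calls Agent B above.
    agent_b = AgentAsEnvironment(lambda msg: f"echo: {msg}")
    return await agent_b.call_tool("chat", {"message": "Hello!"})

print(asyncio.run(main()))  # echo: Hello!
```

Because the wrapped agent is just another environment, the same sandboxing, versioning, and trajectory collection apply to it, which is what enables hierarchical training.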

Future Roadmap

GUI‑based agent environments

Model‑driven simulation environments

Richer front‑end application scaffolding

Additional mainstream benchmarks

More detailed observation metrics

Dedicated environment‑center UI

Higher‑performance sandbox solutions

Broader integration with popular agent frameworks

Repository

GitHub: https://github.com/inclusionAI/AEnvironment

Tags: MCP protocol, Scalable, AEnvironment, Agentic RL, Environment Platform, Open-source