Deploy Multi‑Agent AI Applications on Alibaba Cloud with AgentScope

This guide explains how to build, containerise, and deploy multi‑agent AI applications using the open‑source AgentScope framework on Alibaba Cloud's ACK Pro and ACS services, covering architecture, key features, step‑by‑step deployment, sandbox usage, and testing procedures.

Alibaba Cloud Infrastructure

Background

Since the release of ChatGPT, large language models (LLMs) have evolved from auxiliary components to the core engines of intelligent systems, shifting system design from simple function stacking to smart orchestration. AI agents—autonomous decision‑making entities—are emerging as the next generation of intelligent services.

AgentScope Overview

AgentScope is an open‑source multi‑agent framework from Alibaba that helps developers create, coordinate, and deploy AI agents with tool‑calling capabilities. It offers modular components for perception, reasoning, memory, and execution, along with built‑in observability, state management, and seamless integration with various LLMs and multimodal models.

Transparent and observable: full logging of dialogue state, message flow, tool calls and model interactions.

Controllable execution: supports ReAct‑style loops with runtime interruption and human‑in‑the‑loop.

Tool and knowledge augmentation: unified tool interface, native RAG support.

Model‑agnostic and multimodal: adapter layer for multiple LLM providers.

Modular multi‑agent orchestration: pipeline of perception, reasoning, memory, execution, supporting "master + workers" patterns.

Production‑ready engineering: integrates with logging, monitoring, alerting, and permission systems for end‑to‑end lifecycle management.
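To make the "controllable execution" point concrete, here is a minimal, framework-free sketch of a ReAct-style loop that can be interrupted between steps. Every name in it (ToyReActLoop, Step, scripted_policy, add) is hypothetical and chosen for illustration; it is not AgentScope's implementation, and a scripted function stands in for the LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One reasoning step: either call a tool or give the final answer."""
    tool: Optional[str] = None
    tool_input: str = ""
    answer: Optional[str] = None


@dataclass
class ToyReActLoop:
    """Hypothetical reason/act loop; `policy` stands in for the LLM."""
    policy: Callable[[List[str]], Step]
    tools: Dict[str, Callable[[str], str]]
    max_steps: int = 5
    interrupted: bool = False
    trace: List[str] = field(default_factory=list)

    def interrupt(self) -> None:
        # A human or supervisor can set this flag between steps.
        self.interrupted = True

    def run(self, question: str) -> str:
        self.trace.append(f"user: {question}")
        for _ in range(self.max_steps):
            if self.interrupted:
                return "interrupted"
            step = self.policy(self.trace)       # reason
            if step.answer is not None:
                self.trace.append(f"answer: {step.answer}")
                return step.answer
            observation = self.tools[step.tool](step.tool_input)  # act
            self.trace.append(f"{step.tool}({step.tool_input}) -> {observation}")
        return "step limit reached"


def scripted_policy(trace: List[str]) -> Step:
    # Call the tool once, then answer with the observation it produced.
    if len(trace) == 1:
        return Step(tool="add", tool_input="2 3")
    return Step(answer=trace[-1].split(" -> ")[1])


def add(args: str) -> str:
    a, b = args.split()
    return str(int(a) + int(b))
```

Because every step is appended to trace, the same structure also illustrates the observability claim: the full message and tool-call history is available for inspection after the run.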

Container Service Capabilities

Alibaba Cloud Container Service for Kubernetes (ACK) and Container Compute Service (ACS) provide a cloud‑native runtime that is highly available, elastically scalable, and secure. Key features include multi‑AZ redundancy, automatic pod migration, serverless container compute, fine‑grained resource increments, and sandbox isolation based on lightweight VMs.

Deployment Guide

Step 1 – Prepare ACK Pro Cluster

Create or reuse an ACK Pro cluster and store the kubeconfig file locally.

Step 2 – Set Up Local Environment

python3 -m venv demo
source demo/bin/activate
pip install agentscope-runtime==1.0.1
# Environment variables
export REGISTRY_URL="your-acr-registry"
export REGISTRY_NAMESPACE="your-registry-namespace"
export REGISTRY_USERNAME="your-registry-username"
export REGISTRY_PASSWORD="your-registry-password"
export RUNTIME_SANDBOX_REGISTRY="your-acr-registry"
export KUBECONFIG_PATH="/path-to-your-kubeconfig"
export DASHSCOPE_API_KEY="your-api-key"
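Since the later steps read all of these values through os.getenv, a missing variable tends to surface late, as a failed image push or an authentication error. A small preflight check along the following lines can catch that earlier. This is a sketch, not part of AgentScope; the helper name missing_vars is made up, and the variable list simply mirrors the exports above.

```python
import os
from typing import List, Mapping

# Mirrors the variables exported in Step 2.
REQUIRED_VARS = (
    "REGISTRY_URL",
    "REGISTRY_NAMESPACE",
    "REGISTRY_USERNAME",
    "REGISTRY_PASSWORD",
    "RUNTIME_SANDBOX_REGISTRY",
    "KUBECONFIG_PATH",
    "DASHSCOPE_API_KEY",
)


def missing_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]
```

Calling missing_vars() before deploying and aborting when the returned list is non-empty keeps a long image build from failing halfway through.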

Step 3 – Build and Run an AgentScope Application

Define an AgentApp in Python, register tools, create a ReAct agent and expose HTTP endpoints.

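# Imports follow the agentscope / agentscope-runtime 1.0 package layout;
# module paths may differ in other versions.
import os

from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.model import DashScopeChatModel
from agentscope.pipeline import stream_printing_messages
from agentscope.tool import Toolkit, execute_python_code

from agentscope_runtime.adapters.agentscope.memory import AgentScopeSessionHistoryMemory
from agentscope_runtime.engine import AgentApp
from agentscope_runtime.engine.schemas.agent_schemas import AgentRequest
from agentscope_runtime.engine.services.agent_state import InMemoryStateService
from agentscope_runtime.engine.services.session_history import InMemorySessionHistoryService

agent_app = AgentApp(app_name="Friday", app_description="A helpful assistant")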

@agent_app.init
async def init_func(self):
    self.state_service = InMemoryStateService()
    self.session_service = InMemorySessionHistoryService()
    await self.state_service.start()
    await self.session_service.start()

@agent_app.shutdown
async def shutdown_func(self):
    await self.state_service.stop()
    await self.session_service.stop()

@agent_app.query(framework="agentscope")
async def query_func(self, msgs, request: AgentRequest = None, **kwargs):
    assert kwargs is not None, "kwargs is Required for query_func"
    session_id = request.session_id
    user_id = request.user_id
    state = await self.state_service.export_state(session_id=session_id, user_id=user_id)
    toolkit = Toolkit()
    toolkit.register_tool_function(execute_python_code)
    agent = ReActAgent(
        name="Friday",
        model=DashScopeChatModel("qwen-turbo", api_key=os.getenv("DASHSCOPE_API_KEY"), enable_thinking=True, stream=True),
        sys_prompt="You're a helpful assistant named Friday.",
        toolkit=toolkit,
        memory=AgentScopeSessionHistoryMemory(service=self.session_service, session_id=session_id, user_id=user_id),
        formatter=DashScopeChatFormatter(),
    )
    if state:
        agent.load_state_dict(state)
    async for msg, last in stream_printing_messages(agents=[agent], coroutine_task=agent(msgs)):
        yield msg, last
    new_state = agent.state_dict()
    await self.state_service.save_state(user_id=user_id, session_id=session_id, state=new_state)

@agent_app.endpoint("/sync")
def sync_handler(request: AgentRequest):
    # Non-streaming endpoint: return a single response.
    return {"status": "ok", "payload": request}

@agent_app.endpoint("/async")
async def async_handler(request: AgentRequest):
    # Async generator: each yield emits one chunk, five in total.
    for _ in range(5):
        yield {"status": "ok", "payload": request}
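The query handler in this step follows a load, run, save pattern: it looks up state by user and session, restores it if present, and writes the updated state back after the turn. The toy below shows that round trip in isolation. ToyStateStore and handle_turn are hypothetical stand-ins, not the InMemoryStateService implementation; only the keyword arguments mirror the calls in the handler.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple


class ToyStateStore:
    """Hypothetical in-memory store keyed by (user_id, session_id)."""

    def __init__(self) -> None:
        self._states: Dict[Tuple[str, str], Dict[str, Any]] = {}

    async def export_state(self, session_id: str, user_id: str) -> Optional[Dict[str, Any]]:
        return self._states.get((user_id, session_id))

    async def save_state(self, user_id: str, session_id: str, state: Dict[str, Any]) -> None:
        self._states[(user_id, session_id)] = dict(state)


async def handle_turn(store: ToyStateStore, user_id: str, session_id: str) -> int:
    # Load previous state (None on the first turn), update it, save it back.
    state = await store.export_state(session_id=session_id, user_id=user_id)
    turns = (state or {}).get("turns", 0) + 1
    await store.save_state(user_id=user_id, session_id=session_id, state={"turns": turns})
    return turns


async def demo() -> List[int]:
    store = ToyStateStore()
    first = await handle_turn(store, "alice", "s1")
    second = await handle_turn(store, "alice", "s1")
    other_user = await handle_turn(store, "bob", "s1")
    return [first, second, other_user]
```

Keying on both identifiers means two users who happen to reuse the same session ID do not see each other's state. An in-memory store like this loses everything on restart, so a persistent backend would presumably be needed for multi-replica deployments.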

Step 4 – Configure Deployment

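# Import path follows the agentscope-runtime 1.0 layout; it may differ in other versions.
import os

from agentscope_runtime.engine.deployers.kubernetes_deployer import (
    K8sConfig,
    KubernetesDeployManager,
    RegistryConfig,
)

# Registry configuration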
registry_config = RegistryConfig(
    registry_url=os.getenv("REGISTRY_URL"),
    namespace=os.getenv("REGISTRY_NAMESPACE"),
    username=os.getenv("REGISTRY_USERNAME"),
    password=os.getenv("REGISTRY_PASSWORD")
)
# K8s configuration
k8s_config = K8sConfig(k8s_namespace="default", kubeconfig_path=os.getenv("KUBECONFIG_PATH"))
# Deploy manager
deployer = KubernetesDeployManager(kube_config=k8s_config, registry_config=registry_config, use_deployment=True)
# Runtime resources
runtime_config = {
    "resources": {
        "requests": {"cpu": "200m", "memory": "512Mi"},
        "limits": {"cpu": "1000m", "memory": "2Gi"}
    },
    "image_pull_policy": "IfNotPresent",
    "image_pull_secrets": ["demo-credential"]
}
# Deployment specifics
deployment_config = {
    "port": "8080",
    "replicas": 1,
    "image_name": "agent_app",
    "image_tag": "linux-amd64",
    "requirements": ["agentscope", "fastapi", "uvicorn"],
    "base_image": "python:3.10-slim-bookworm",
    "environment": {
        "PYTHONPATH": "/app",
        "LOG_LEVEL": "INFO",
        "DASHSCOPE_API_KEY": os.getenv("DASHSCOPE_API_KEY")
    },
    "resources": runtime_config["resources"],
    "image_pull_policy": runtime_config["image_pull_policy"],
    "image_pull_secrets": runtime_config["image_pull_secrets"],
    "health_check": True,
    "deploy_timeout": 300,
    "platform": "linux/amd64",
    "push_to_registry": True
}
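# Execute deployment. `await` is only valid inside a coroutine, so wrap the call.
import asyncio  # normally placed with the other imports at the top of the file

async def main():
    result = await agent_app.deploy(deployer, **deployment_config)
    print("Deployment successful! URL:", result["url"])

asyncio.run(main())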
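Kubernetes rejects a pod spec whose resource requests exceed its limits, so it can be worth validating the resources block before building and pushing an image. The helpers below are a sketch that parses only the common quantity forms (millicores for CPU, binary and decimal suffixes for memory), not the full Kubernetes quantity grammar. The function names are made up for this example.

```python
from typing import Dict

# Subset of Kubernetes memory suffixes; exponent and milli forms are not handled.
_MEMORY_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_cpu(quantity: str) -> float:
    """Convert '200m' or '2' to a number of CPU cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert '512Mi' or '2Gi' to bytes."""
    # Try two-letter suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_MEMORY_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * _MEMORY_UNITS[suffix]
    return int(quantity)


def requests_fit_limits(resources: Dict[str, Dict[str, str]]) -> bool:
    """True when every request is less than or equal to its limit."""
    req, lim = resources["requests"], resources["limits"]
    return (
        parse_cpu(req["cpu"]) <= parse_cpu(lim["cpu"])
        and parse_memory(req["memory"]) <= parse_memory(lim["memory"])
    )
```

With the values from this step, requests of 200m CPU and 512Mi memory sit comfortably under the limits of 1000m and 2Gi, so requests_fit_limits(runtime_config["resources"]) would return True.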

Step 5 – Test the Service

Use aiohttp or curl to call the /sync, /async and streaming endpoints, verifying JSON responses and status codes.

# Example curl
curl -X POST http://your-service-url:8080/async \
    -H "Content-Type: application/json" \
    -d '{"input":[{"role":"user","content":[{"type":"text","text":"Hello, how are you?"}]}],"session_id":"123"}'
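The same request can be issued from Python. The sketch below uses only the standard library's urllib rather than aiohttp so that it runs without extra dependencies. The payload shape is copied from the curl example; the helper names (build_payload, build_request, call_endpoint) are hypothetical, and the URL is the same placeholder as above.

```python
import json
import urllib.request
from typing import Any, Dict, Tuple


def build_payload(text: str, session_id: str) -> Dict[str, Any]:
    """Build the request body used in the curl example."""
    return {
        "input": [{"role": "user", "content": [{"type": "text", "text": text}]}],
        "session_id": session_id,
    }


def build_request(url: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Wrap the payload in a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_endpoint(url: str, payload: Dict[str, Any], timeout: float = 30.0) -> Tuple[int, str]:
    """Send the request and return (HTTP status, response body as text)."""
    with urllib.request.urlopen(build_request(url, payload), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

A call such as call_endpoint("http://your-service-url:8080/async", build_payload("Hello, how are you?", "123")) should return status 200 once the service is up. For streaming endpoints the body arrives in chunks, so reading the response incrementally instead of calling read() once would be the more natural fit.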

Sandbox Deployment

AgentScope‑Runtime ships with Base, Browser, and FileSystem sandboxes that can be launched as containers on ACK Pro. Configure a .env file and start the sandbox server.

# .env example
HOST="0.0.0.0"
PORT=8000
DEFAULT_SANDBOX_TYPE=base
POOL_SIZE=10
AUTO_CLEANUP=True
CONTAINER_DEPLOYMENT=k8s
K8S_NAMESPACE=demo
KUBECONFIG_PATH=/path-to-your-kubeconfig

Run the server:

runtime-sandbox-server --config custom.env

The command creates sandbox pods that appear in the ACK console and can be used as plugins in downstream AI workflows.
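Before starting the server, it may help to sanity-check the config file. The sketch below parses simple KEY=VALUE lines and flags a few likely mistakes. The validation rules (that K8S_NAMESPACE and KUBECONFIG_PATH are needed when CONTAINER_DEPLOYMENT is k8s, and that POOL_SIZE is a non-negative integer) are assumptions inferred from the example file, not documented requirements of runtime-sandbox-server.

```python
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and comments, stripping quotes."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_sandbox_config(values: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems (empty when none are found)."""
    problems: List[str] = []
    # Assumed rule: Kubernetes deployment needs a namespace and a kubeconfig.
    if values.get("CONTAINER_DEPLOYMENT") == "k8s":
        for key in ("K8S_NAMESPACE", "KUBECONFIG_PATH"):
            if not values.get(key):
                problems.append(f"{key} is required when CONTAINER_DEPLOYMENT=k8s")
    # Assumed rule: the pool size is a non-negative integer.
    if not values.get("POOL_SIZE", "0").isdigit():
        problems.append("POOL_SIZE must be a non-negative integer")
    return problems
```

Reading custom.env, passing its text through parse_env, and printing whatever check_sandbox_config returns gives a quick pass/fail signal before any pods are created.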

Conclusion

Deploying AgentScope on Alibaba Cloud’s ACK/ACS provides a cloud‑native, observable, secure, and elastically scalable foundation for enterprise‑grade AI agents. This aligns with industry forecasts that the majority of new AI workloads will run on Kubernetes by 2028, making the solution a forward‑looking choice for production AI services.
