Deploy Multi‑Agent AI Apps with AgentScope on Alibaba Cloud Kubernetes

This guide explains how to build, orchestrate, and deploy enterprise‑grade AI agents with Alibaba Cloud's AgentScope framework and Container Service. It covers background, core features, step‑by‑step deployment, sandbox integration, and best‑practice recommendations for cloud‑native AI workloads.

Alibaba Cloud Developer

Background

Large language models (LLMs) have evolved from auxiliary components to the central decision‑making engine of intelligent systems. Modern AI agents use LLMs to understand goals, decompose tasks, and orchestrate tool execution, enabling autonomous operation in real‑world scenarios.

AgentScope Overview

AgentScope is a multi‑agent application framework and runtime that supports collaborative agents, tool integration, conversation state management, observability, and debugging. It is designed for cloud‑native deployment via containerization and is model‑agnostic, supporting both text and multimodal models.

Transparent and Observable: Full traceability of dialogue state, message passing, tool calls, and model interactions.

Controllable Execution: Implements ReAct‑style reasoning with support for interruption and human‑in‑the‑loop control.

Tool and Knowledge Augmentation: Unified tool management and native Retrieval‑Augmented Generation (RAG) for enterprise knowledge.

Model‑Agnostic & Multimodal: Adapter layer connects to various LLMs and multimodal models.

Modular Multi‑Agent Orchestration: Separate perception, reasoning, memory, and execution modules; supports pipelines such as "controller + executor + reviewer".

Engineering‑Ready: Integrates with logging, monitoring, alerting, and permission systems for production‑grade lifecycle management.
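
To make the "controllable execution" point concrete, here is a minimal sketch of a ReAct-style loop with an interruption hook. It does not use AgentScope: `scripted_model`, `TOOLS`, and `should_stop` are hypothetical stand-ins for an LLM, a tool registry, and a human-in-the-loop check, and the "Action: tool[arg]" text format is an assumption made for this example.

```python
from typing import Callable

# Hypothetical tool registry: name -> callable taking a string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def scripted_model(history: list[str]) -> str:
    """Stand-in for an LLM: asks for a tool first, then answers."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        return "Action: add[2+3]"
    result = observations[-1].removeprefix("Observation: ")
    return f"Final: {result}"


def react_loop(
    question: str,
    model: Callable[[list[str]], str],
    should_stop: Callable[[int], bool] = lambda step: False,
    max_steps: int = 5,
) -> str:
    """Alternate reasoning and tool calls until a final answer or an interrupt."""
    history = [f"Question: {question}"]
    for step in range(max_steps):
        if should_stop(step):  # human-in-the-loop interruption point
            return "interrupted"
        reply = model(history)
        history.append(reply)
        if reply.startswith("Final: "):
            return reply.removeprefix("Final: ")
        # Parse "Action: tool[arg]" and feed the observation back in.
        name, _, arg = reply.removeprefix("Action: ").partition("[")
        history.append(f"Observation: {TOOLS[name](arg.rstrip(']'))}")
    return "step limit reached"


print(react_loop("What is 2 + 3?", scripted_model))
```

Passing `should_stop=lambda step: step == 1` ends the run before the second model call, which is the kind of control point an operator would use to pause or redirect an agent.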

AgentScope Runtime

The runtime component provides service‑oriented deployment, operation, and governance of agents. It can interoperate with open‑source agent frameworks and offers observability, security, and easy scaling on Kubernetes.

Alibaba Cloud Container Service Capabilities

Alibaba Cloud Container Service (ACK) and Container Compute Service (ACS) supply a cloud‑native platform optimized for AI agents.

High Availability: Multi‑AZ control‑plane redundancy and automatic pod migration.

Massive Elasticity: Predictive pre‑scheduling, image caching, and rapid pod startup for large‑scale concurrent workloads.

Strong Security Isolation: MicroVM‑based sandbox containers with network policies and storage isolation per agent task.

Cost‑Effective Pay‑As‑You‑Go: Fine‑grained CPU/GPU allocation (e.g., 0.5 vCPU + 1 GiB) and serverless container provisioning.
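
The "0.5 vCPU + 1 GiB" example corresponds to the Kubernetes resource quantities "500m" and "1Gi". The sketch below converts such strings to plain numbers, which is handy when comparing requests against limits. `parse_cpu` and `parse_memory` are hypothetical helpers written for this article, and they handle only the common suffixes.

```python
# Binary and decimal memory suffixes used by Kubernetes quantities.
_MEMORY_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '500m' or '2' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '1Gi' or '512Mi' to bytes."""
    # Check two-letter suffixes first so they win over one-letter ones.
    for suffix in sorted(_MEMORY_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * _MEMORY_UNITS[suffix]
    return int(quantity)


print(parse_cpu("500m"), parse_memory("1Gi"))
```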

Step‑by‑Step Deployment

Step 1 – Prepare ACK Pro Cluster

Create a new ACK Pro cluster (or use an existing one) and download the KubeConfig file.

Create a Docker registry secret in the cluster for private image access.

kubectl create secret docker-registry demo-credential \
    -n default \
    --docker-server=your-registry \
    --docker-username=your-username \
    --docker-password=your-password
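
For context, `kubectl create secret docker-registry` stores the credentials in a secret of type `kubernetes.io/dockerconfigjson`, whose payload is a small JSON document. The sketch below builds that payload in Python so the structure is visible. The helper name `docker_config_json` is hypothetical, and the values are the same placeholders used in the command above.

```python
import base64
import json


def docker_config_json(server: str, username: str, password: str) -> str:
    """Build the JSON stored under the '.dockerconfigjson' key of the secret."""
    # The 'auth' field is base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    config = {
        "auths": {
            server: {"username": username, "password": password, "auth": auth}
        }
    }
    return json.dumps(config)


payload = docker_config_json("your-registry", "your-username", "your-password")
print(json.loads(payload)["auths"]["your-registry"]["auth"])
```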

Step 2 – Set Up Local Environment

Create a Python virtual environment and install the runtime package.

Export required environment variables for the registry, Kubernetes config, and DashScope API key.

python3 -m venv demo
source demo/bin/activate
pip install agentscope-runtime==1.0.1
export REGISTRY_URL="your-acr-registry"
export REGISTRY_NAMESPACE="your-namespace"
export REGISTRY_USERNAME="your-username"
export REGISTRY_PASSWORD="your-password"
export KUBECONFIG_PATH="/path/to/kubeconfig"
export DASHSCOPE_API_KEY="your-api-key"
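
Because the deployment script reads all of these values with `os.getenv`, a missing export only shows up later as a `None` value. A small pre-flight check, sketched below, catches that early. `missing_env_vars` is a hypothetical helper; the variable names are the ones exported above.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = [
    "REGISTRY_URL",
    "REGISTRY_NAMESPACE",
    "REGISTRY_USERNAME",
    "REGISTRY_PASSWORD",
    "KUBECONFIG_PATH",
    "DASHSCOPE_API_KEY",
]


def missing_env_vars(env=os.environ) -> list[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


print("Missing variables:", missing_env_vars() or "none")
```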

Step 3 – Build and Deploy the Agent Application

Define the agent using the AgentApp API, register tools, and configure a ReActAgent with the desired LLM (e.g., qwen-turbo).

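# Import paths below follow the agentscope-runtime 1.0.x package layout and are
# an assumption; verify them against the version you installed.
import os

from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.model import DashScopeChatModel
from agentscope.pipeline import stream_printing_messages
from agentscope.tool import Toolkit, execute_python_code
from agentscope_runtime.adapters.agentscope.memory import AgentScopeSessionHistoryMemory
from agentscope_runtime.engine import AgentApp
from agentscope_runtime.engine.schemas.agent_schemas import AgentRequest
from agentscope_runtime.engine.services.agent_state import InMemoryStateService
from agentscope_runtime.engine.services.session_history import InMemorySessionHistoryService

agent_app = AgentApp(app_name="Friday", app_description="A helpful assistant")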

@agent_app.init
async def init_func(self):
    self.state_service = InMemoryStateService()
    self.session_service = InMemorySessionHistoryService()
    await self.state_service.start()
    await self.session_service.start()

@agent_app.shutdown
async def shutdown_func(self):
    await self.state_service.stop()
    await self.session_service.stop()

@agent_app.query(framework="agentscope")
async def query_func(self, msgs, request: AgentRequest = None, **kwargs):
    session_id = request.session_id
    user_id = request.user_id
    # Load any agent state saved by an earlier turn of this session
    state = await self.state_service.export_state(session_id=session_id, user_id=user_id)
    # Register the tools the agent is allowed to call
    toolkit = Toolkit()
    toolkit.register_tool_function(execute_python_code)
    agent = ReActAgent(
        name="Friday",
        model=DashScopeChatModel(
            "qwen-turbo",
            api_key=os.getenv("DASHSCOPE_API_KEY"),
            enable_thinking=True,
            stream=True,
        ),
        sys_prompt="You're a helpful assistant named Friday.",
        toolkit=toolkit,
        memory=AgentScopeSessionHistoryMemory(
            service=self.session_service,
            session_id=session_id,
            user_id=user_id,
        ),
        formatter=DashScopeChatFormatter(),
    )
    # Restore the previous state, if there is one
    if state:
        agent.load_state_dict(state)
    # Stream messages back to the caller as the agent produces them
    async for msg, last in stream_printing_messages(agents=[agent], coroutine_task=agent(msgs)):
        yield msg, last
    # Persist the updated state for the next turn
    new_state = agent.state_dict()
    await self.state_service.save_state(user_id=user_id, session_id=session_id, state=new_state)
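
The query handler follows a load, run, save pattern: a fresh agent object is built for each request, and continuity comes from state keyed by user and session. The sketch below isolates that pattern with toy classes. `DictStateStore` and `CounterAgent` are hypothetical stand-ins and are not AgentScope classes.

```python
import asyncio


class DictStateStore:
    """Hypothetical in-memory store keyed by (user_id, session_id)."""

    def __init__(self):
        self._states = {}

    async def export_state(self, session_id, user_id):
        return self._states.get((user_id, session_id))

    async def save_state(self, user_id, session_id, state):
        self._states[(user_id, session_id)] = state


class CounterAgent:
    """Toy agent whose only state is the number of turns it has seen."""

    def __init__(self):
        self.turns = 0

    def load_state_dict(self, state):
        self.turns = state["turns"]

    def state_dict(self):
        return {"turns": self.turns}

    def reply(self, text):
        self.turns += 1
        return f"turn {self.turns}: {text}"


async def handle(store, user_id, session_id, text):
    """Restore state, run one turn, then persist the updated state."""
    agent = CounterAgent()  # a fresh agent object for every request
    state = await store.export_state(session_id=session_id, user_id=user_id)
    if state:
        agent.load_state_dict(state)
    answer = agent.reply(text)
    await store.save_state(
        user_id=user_id, session_id=session_id, state=agent.state_dict()
    )
    return answer


async def demo():
    store = DictStateStore()
    first = await handle(store, "u1", "123", "hello")
    second = await handle(store, "u1", "123", "again")
    other = await handle(store, "u1", "456", "new session")
    return first, second, other


print(asyncio.run(demo()))
```

The second call in session "123" continues the turn count, while session "456" starts from scratch, which is the isolation the session and user keys are meant to provide.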

Configure registry, Kubernetes connection, runtime limits, and deployment specifics, then trigger the deployment.

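# Deployer import path follows the agentscope-runtime 1.0.x layout
# (an assumption; verify it against the version you installed)
from agentscope_runtime.engine.deployers.kubernetes_deployer import (
    K8sConfig,
    KubernetesDeployManager,
    RegistryConfig,
)

# Registry configuration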
registry_config = RegistryConfig(
    registry_url=os.getenv("REGISTRY_URL"),
    namespace=os.getenv("REGISTRY_NAMESPACE"),
    username=os.getenv("REGISTRY_USERNAME"),
    password=os.getenv("REGISTRY_PASSWORD"),
)
# Kubernetes connection
k8s_config = K8sConfig(k8s_namespace="default", kubeconfig_path=os.getenv("KUBECONFIG_PATH"))
# Deploy manager
deployer = KubernetesDeployManager(kube_config=k8s_config, registry_config=registry_config, use_deployment=True)
# Runtime resource limits
runtime_config = {
    "resources": {
        "requests": {"cpu": "200m", "memory": "512Mi"},
        "limits": {"cpu": "1000m", "memory": "2Gi"},
    },
    "image_pull_policy": "IfNotPresent",
    "image_pull_secrets": ["demo-credential"],
}
# Deployment parameters
deployment_config = {
    "port": "8080",
    "replicas": 1,
    "image_name": "agent_app",
    "image_tag": "linux-amd64",
    "base_image": "python:3.10-slim-bookworm",
    "requirements": ["agentscope", "fastapi", "uvicorn"],
    "environment": {
        "PYTHONPATH": "/app",
        "LOG_LEVEL": "INFO",
        "DASHSCOPE_API_KEY": os.getenv("DASHSCOPE_API_KEY"),
    },
    "runtime_config": runtime_config,
    "deploy_timeout": 300,
    "health_check": True,
    "platform": "linux/amd64",
    "push_to_registry": True,
}
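# Execute deployment. `await` is only valid inside a coroutine, so wrap the
# call and drive it with asyncio.run().
import asyncio


async def main():
    result = await agent_app.deploy(deployer, **deployment_config)
    print(result)  # the returned result carries the service URL used in Step 4


asyncio.run(main())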
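
For orientation, settings like these line up with familiar fields on a Kubernetes Deployment. The sketch below builds such a manifest as a plain dictionary. It only illustrates the mapping and is not what `KubernetesDeployManager` generates internally; `to_deployment_manifest`, the `agent-app` name, and the image reference are hypothetical.

```python
def to_deployment_manifest(name: str, image: str, config: dict) -> dict:
    """Map deployment settings onto an apps/v1 Deployment (illustrative only)."""
    runtime = config.get("runtime_config", {})
    container = {
        "name": name,
        "image": image,
        "imagePullPolicy": runtime.get("image_pull_policy", "IfNotPresent"),
        "ports": [{"containerPort": int(config["port"])}],
        "env": [
            {"name": key, "value": str(value)}
            for key, value in config.get("environment", {}).items()
        ],
        "resources": runtime.get("resources", {}),
    }
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": config.get("replicas", 1),
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [container],
                    "imagePullSecrets": [
                        {"name": secret}
                        for secret in runtime.get("image_pull_secrets", [])
                    ],
                },
            },
        },
    }


example = to_deployment_manifest(
    "agent-app",
    "your-acr-registry/your-namespace/agent_app:linux-amd64",  # placeholder image
    {
        "port": "8080",
        "replicas": 1,
        "environment": {"LOG_LEVEL": "INFO"},
        "runtime_config": {
            "resources": {"requests": {"cpu": "200m", "memory": "512Mi"}},
            "image_pull_secrets": ["demo-credential"],
        },
    },
)
print(example["spec"]["template"]["spec"]["imagePullSecrets"])
```

Note that the image pull secret name must match the secret created in Step 1, otherwise pods cannot pull from the private registry.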

Step 4 – Verify and Test the Service

After deployment, use the returned service URL to call the endpoints. Example for the asynchronous endpoint:

curl -N -X POST "http://your-service-url:8080/async" \
    -H "Content-Type: application/json" \
    -d '{"input":[{"role":"user","content":[{"type":"text","text":"Please write a bubble sort algorithm in Python and run it"}]}],"session_id":"123"}'
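
The same request can be sent from Python using only the standard library. This is a sketch under one assumption the article does not confirm: that the endpoint streams Server-Sent Events, meaning lines prefixed with `data:`. `build_request`, `iter_sse_data`, and `stream` are hypothetical helper names.

```python
import json
import urllib.request


def build_request(base_url: str, text: str, session_id: str) -> urllib.request.Request:
    """Build the POST request that mirrors the curl example."""
    body = {
        "input": [{"role": "user", "content": [{"type": "text", "text": text}]}],
        "session_id": session_id,
    }
    return urllib.request.Request(
        f"{base_url}/async",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_sse_data(lines):
    """Yield the payload of each 'data:' line (assumed SSE framing)."""
    for raw in lines:
        line = raw.decode("utf-8").strip() if isinstance(raw, bytes) else raw.strip()
        if line.startswith("data:"):
            yield line[len("data:"):].strip()


def stream(base_url: str, text: str, session_id: str = "123") -> None:
    """Send the request and print streamed payloads as they arrive."""
    with urllib.request.urlopen(build_request(base_url, text, session_id)) as resp:
        for payload in iter_sse_data(resp):
            print(payload)


# Example (requires a reachable service):
# stream("http://your-service-url:8080", "Write a bubble sort in Python and run it")
```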

Sandbox Integration

AgentScope Runtime provides three sandbox types (Base, Browser, and FileSystem) that can be deployed as containers on ACK Pro. Put the sandbox parameters in an env file (custom.env in this example) and start the sandbox server with it.

# Example .env file (saved as custom.env)
HOST="0.0.0.0"
PORT=8000
WORKERS=4
DEFAULT_SANDBOX_TYPE=base
POOL_SIZE=10
AUTO_CLEANUP=True
CONTAINER_PREFIX_KEY=agent-runtime-container-
CONTAINER_DEPLOYMENT=k8s
K8S_NAMESPACE=demo
KUBECONFIG_PATH=/path/to/kubeconfig

# Start the sandbox server with this configuration
runtime-sandbox-server --config custom.env
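
Before starting the server, it can help to sanity-check the env file, for example to confirm that the port and pool size are integers. The parser below is a minimal sketch for simple `KEY=VALUE` files. `parse_env_file` is a hypothetical helper and is not how the sandbox server itself reads its configuration.

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


sample = '''
# Example .env file
HOST="0.0.0.0"
PORT=8000
POOL_SIZE=10
CONTAINER_DEPLOYMENT=k8s
'''
settings = parse_env_file(sample)
print(settings["HOST"], int(settings["PORT"]), int(settings["POOL_SIZE"]))
```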

Agents acquire sandbox instances via SandboxService for isolated execution.
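
The `POOL_SIZE` and `AUTO_CLEANUP` settings suggest a pre-warmed pool from which each task borrows a sandbox. The sketch below shows that idea in plain Python. `SandboxPool` is hypothetical, each "sandbox" is only an ID string, and none of this reflects the actual SandboxService API.

```python
import itertools
from contextlib import contextmanager


class SandboxPool:
    """Hypothetical pre-warmed pool; each 'sandbox' is just an ID string here."""

    def __init__(self, pool_size: int, prefix: str = "agent-runtime-container-"):
        self._ids = itertools.count()
        self._prefix = prefix
        self._idle = [self._create() for _ in range(pool_size)]
        self.in_use = set()

    def _create(self) -> str:
        return f"{self._prefix}{next(self._ids)}"

    @contextmanager
    def acquire(self):
        """Hand out an idle sandbox, creating one if the pool is empty."""
        sandbox = self._idle.pop() if self._idle else self._create()
        self.in_use.add(sandbox)
        try:
            yield sandbox
        finally:
            # Auto-cleanup: discard the used sandbox and refill with a fresh one.
            self.in_use.discard(sandbox)
            self._idle.append(self._create())


pool = SandboxPool(pool_size=2)
with pool.acquire() as box:
    print("running task in", box)
print("idle sandboxes:", len(pool._idle))
```

Discarding a sandbox after use, rather than returning it to the pool, keeps one task's files and processes from leaking into the next task.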

Conclusion

Industry forecasts predict that by 2028, 95% of new AI deployments will run on Kubernetes, which underlines the shift toward cloud‑native AI infrastructure. AgentScope's deep integration with Alibaba Cloud Container Service gives production AI agents high availability, elastic scaling, secure isolation, and cost‑effective serverless containers.

References

AgentScope website: https://agentscope.io/

Alibaba Cloud Container Service (ACK): https://www.aliyun.com/product/ack

Alibaba Cloud Container Compute Service (ACS): https://www.aliyun.com/product/acs
