Deploy Multi‑Agent AI Apps with AgentScope on Alibaba Cloud Kubernetes
This guide explains how to use Alibaba Cloud's AgentScope framework and Container Service to build, orchestrate, and deploy enterprise‑grade AI agents, covering background, core features, step‑by‑step deployment, sandbox integration, and best‑practice recommendations for cloud‑native AI workloads.
Background
Large language models (LLMs) have evolved from auxiliary components to the central decision‑making engine of intelligent systems. Modern AI agents use LLMs to understand goals, decompose tasks, and orchestrate tool execution, enabling autonomous operation in real‑world scenarios.
AgentScope Overview
AgentScope is a multi‑agent application framework and runtime that supports collaborative agents, tool integration, conversation state management, observability, and debugging. It is designed for cloud‑native deployment via containerization and is model‑agnostic, supporting both text and multimodal models.
Transparent and Observable: Full traceability of dialogue state, message passing, tool calls, and model interactions.
Controllable Execution: Implements ReAct‑style reasoning with support for interruption and human‑in‑the‑loop control.
Tool and Knowledge Augmentation: Unified tool management and native Retrieval‑Augmented Generation (RAG) for enterprise knowledge.
Model‑Agnostic & Multimodal: Adapter layer connects to various LLMs and multimodal models.
Modular Multi‑Agent Orchestration: Separate perception, reasoning, memory, and execution modules; supports pipelines such as "controller + executor + reviewer".
Engineering‑Ready: Integrates with logging, monitoring, alerting, and permission systems for production‑grade lifecycle management.
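To make the orchestration point concrete, here is a minimal, framework‑free sketch of a "controller + executor + reviewer" pipeline. The class and the string‑based task handling are illustrative stand‑ins for LLM calls and tool execution, not AgentScope APIs:

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str      # which module produced the message
    content: str


@dataclass
class Pipeline:
    history: list = field(default_factory=list)  # shared, inspectable memory

    def controller(self, goal: str) -> list:
        # Decompose the goal into sub-tasks (a real agent would call an LLM).
        return [f"{goal}: step {i}" for i in (1, 2)]

    def executor(self, task: str) -> Message:
        # Execute one sub-task, e.g. via a registered tool.
        return Message(role="executor", content=f"done({task})")

    def reviewer(self, result: Message) -> bool:
        # Accept or reject the executor's output before it is committed.
        return result.content.startswith("done(")

    def run(self, goal: str) -> list:
        accepted = []
        for task in self.controller(goal):
            result = self.executor(task)
            self.history.append(result)  # every message is traceable
            if self.reviewer(result):
                accepted.append(result.content)
        return accepted


results = Pipeline().run("sort a list")
```

Separating the three roles keeps each module small and, as in AgentScope, makes every intermediate message observable for debugging.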
AgentScope Runtime
The runtime component provides service‑oriented deployment, operation, and governance of agents. It can interoperate with open‑source agent frameworks and offers observability, security, and easy scaling on Kubernetes.
Alibaba Cloud Container Service Capabilities
Alibaba Cloud Container Service (ACK) and Container Compute Service (ACS) supply a cloud‑native platform optimized for AI agents.
High Availability: Multi‑AZ control‑plane redundancy and automatic pod migration.
Massive Elasticity: Predictive pre‑scheduling, image caching, and rapid pod startup for large‑scale concurrent workloads.
Strong Security Isolation: MicroVM‑based sandbox containers with network policies and storage isolation per agent task.
Cost‑Effective Pay‑As‑You‑Go: Fine‑grained CPU/GPU allocation (e.g., 0.5 vCPU + 1 GiB) and serverless container provisioning.
Step‑by‑Step Deployment
Step 1 – Prepare ACK Pro Cluster
Create a new ACK Pro cluster (or use an existing one) and download the KubeConfig file.
Create a Docker registry secret in the cluster for private image access.
kubectl create secret docker-registry demo-credential \
  -n default \
  --docker-server=your-registry \
  --docker-username=your-username \
  --docker-password=your-password

Step 2 – Set Up Local Environment
Create a Python virtual environment and install the runtime package.
Export required environment variables for the registry, Kubernetes config, and DashScope API key.
python3 -m venv demo
source demo/bin/activate
pip install agentscope-runtime==1.0.1

export REGISTRY_URL="your-acr-registry"
export REGISTRY_NAMESPACE="your-namespace"
export REGISTRY_USERNAME="your-username"
export REGISTRY_PASSWORD="your-password"
export KUBECONFIG_PATH="/path/to/kubeconfig"
export DASHSCOPE_API_KEY="your-api-key"

Step 3 – Build and Deploy the Agent Application
Define the agent using the AgentApp API, register tools, and configure a ReActAgent with the desired LLM (e.g., qwen-turbo).
import os

# Import paths are indicative; check them against your installed
# agentscope / agentscope-runtime versions.
from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.model import DashScopeChatModel
from agentscope.tool import Toolkit, execute_python_code
from agentscope_runtime.engine import AgentApp
from agentscope_runtime.engine.schemas.agent_schemas import AgentRequest

agent_app = AgentApp(app_name="Friday", app_description="A helpful assistant")

@agent_app.init
async def init_func(self):
    self.state_service = InMemoryStateService()
    self.session_service = InMemorySessionHistoryService()
    await self.state_service.start()
    await self.session_service.start()

@agent_app.shutdown
async def shutdown_func(self):
    await self.state_service.stop()
    await self.session_service.stop()

@agent_app.query(framework="agentscope")
async def query_func(self, msgs, request: AgentRequest = None, **kwargs):
    session_id = request.session_id
    user_id = request.user_id
    # Restore any previously saved agent state for this session.
    state = await self.state_service.export_state(session_id=session_id, user_id=user_id)

    toolkit = Toolkit()
    toolkit.register_tool_function(execute_python_code)

    agent = ReActAgent(
        name="Friday",
        model=DashScopeChatModel(
            "qwen-turbo",
            api_key=os.getenv("DASHSCOPE_API_KEY"),
            enable_thinking=True,
            stream=True,
        ),
        sys_prompt="You're a helpful assistant named Friday.",
        toolkit=toolkit,
        memory=AgentScopeSessionHistoryMemory(
            service=self.session_service,
            session_id=session_id,
            user_id=user_id,
        ),
        formatter=DashScopeChatFormatter(),
    )
    if state:
        agent.load_state_dict(state)

    # Stream intermediate messages back to the caller as they are produced.
    async for msg, last in stream_printing_messages(agents=[agent], coroutine_task=agent(msgs)):
        yield msg, last

    # Persist the updated agent state for the next turn.
    new_state = agent.state_dict()
    await self.state_service.save_state(user_id=user_id, session_id=session_id, state=new_state)

Configure the registry, Kubernetes connection, runtime limits, and deployment specifics, then trigger the deployment.
# RegistryConfig, K8sConfig, and KubernetesDeployManager are provided by
# agentscope-runtime's Kubernetes deployer module.

# Registry configuration
registry_config = RegistryConfig(
    registry_url=os.getenv("REGISTRY_URL"),
    namespace=os.getenv("REGISTRY_NAMESPACE"),
    username=os.getenv("REGISTRY_USERNAME"),
    password=os.getenv("REGISTRY_PASSWORD"),
)

# Kubernetes connection
k8s_config = K8sConfig(
    k8s_namespace="default",
    kubeconfig_path=os.getenv("KUBECONFIG_PATH"),
)

# Deploy manager
deployer = KubernetesDeployManager(
    kube_config=k8s_config,
    registry_config=registry_config,
    use_deployment=True,
)

# Runtime resource limits
runtime_config = {
    "resources": {
        "requests": {"cpu": "200m", "memory": "512Mi"},
        "limits": {"cpu": "1000m", "memory": "2Gi"},
    },
    "image_pull_policy": "IfNotPresent",
    "image_pull_secrets": ["demo-credential"],
}

# Deployment parameters
deployment_config = {
    "port": "8080",
    "replicas": 1,
    "image_name": "agent_app",
    "image_tag": "linux-amd64",
    "base_image": "python:3.10-slim-bookworm",
    "requirements": ["agentscope", "fastapi", "uvicorn"],
    "environment": {
        "PYTHONPATH": "/app",
        "LOG_LEVEL": "INFO",
        "DASHSCOPE_API_KEY": os.getenv("DASHSCOPE_API_KEY"),
    },
    "runtime_config": runtime_config,
    "deploy_timeout": 300,
    "health_check": True,
    "platform": "linux/amd64",
    "push_to_registry": True,
}

# Execute deployment
result = await agent_app.deploy(deployer, **deployment_config)

Step 4 – Verify and Test the Service
After deployment, use the returned service URL to call the endpoints. Example for the asynchronous endpoint:
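The same request can also be issued from Python. The sketch below builds the JSON body used by the curl example that follows; the service URL is a placeholder you must replace with the one returned by your deployment:

```python
import json
from urllib import request


def build_agent_request(text: str, session_id: str) -> dict:
    """Build the JSON body expected by the /async endpoint."""
    return {
        "input": [
            {"role": "user", "content": [{"type": "text", "text": text}]}
        ],
        "session_id": session_id,
    }


def post_query(url: str, text: str, session_id: str) -> bytes:
    """POST a query to the deployed agent service (performs a network call)."""
    body = json.dumps(build_agent_request(text, session_id)).encode("utf-8")
    req = request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return resp.read()


# Example (requires a reachable deployment):
# post_query("http://your-service-url:8080/async",
#            "Write a bubble sort algorithm in Python and execute it", "123")
```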
curl -N -X POST "http://your-service-url:8080/async" \
  -H "Content-Type: application/json" \
  -d '{"input":[{"role":"user","content":[{"type":"text","text":"Please write a bubble sort algorithm in Python and execute it"}]}],"session_id":"123"}'

Sandbox Integration
AgentScope‑Runtime provides three sandbox types (Base, Browser, FileSystem) that can be deployed as containers on ACK Pro. Configure a .env file with sandbox parameters and start the sandbox server.
# Example .env file
HOST="0.0.0.0"
PORT=8000
WORKERS=4
DEFAULT_SANDBOX_TYPE=base
POOL_SIZE=10
AUTO_CLEANUP=True
CONTAINER_PREFIX_KEY=agent-runtime-container-
CONTAINER_DEPLOYMENT=k8s
K8S_NAMESPACE=demo
KUBECONFIG_PATH=/path/to/kubeconfig

Start the sandbox server with the configuration file:

runtime-sandbox-server --config custom.env

Agents acquire sandbox instances via SandboxService for isolated execution.
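Conceptually, POOL_SIZE pre-warms a fixed number of sandbox containers and AUTO_CLEANUP reclaims them after use, so agents get fast, isolated execution without per-request container startup. The toy sketch below illustrates that pattern only; the class and method names are invented for illustration and are not the runtime's actual API:

```python
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    sandbox_id: str
    in_use: bool = False


@dataclass
class SandboxPool:
    """Toy model of a pre-warmed sandbox pool with automatic cleanup."""
    pool_size: int = 10
    auto_cleanup: bool = True
    _pool: list = field(default_factory=list)

    def __post_init__(self):
        # Pre-warm: create pool_size idle sandboxes up front (cf. POOL_SIZE).
        self._pool = [
            Sandbox(f"agent-runtime-container-{i}") for i in range(self.pool_size)
        ]

    def acquire(self) -> Sandbox:
        # Hand out the first idle sandbox instead of starting a new container.
        for sb in self._pool:
            if not sb.in_use:
                sb.in_use = True
                return sb
        raise RuntimeError("sandbox pool exhausted")

    def release(self, sb: Sandbox) -> None:
        # With auto-cleanup enabled, the container is reclaimed for reuse.
        if self.auto_cleanup:
            sb.in_use = False


pool = SandboxPool(pool_size=2)
first = pool.acquire()
pool.release(first)
second = pool.acquire()  # the reclaimed sandbox is handed out again
```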
Conclusion
Industry forecasts predict that by 2028, 95% of new AI deployments will run on Kubernetes, highlighting the shift toward cloud‑native AI infrastructure. AgentScope's deep integration with Alibaba Cloud Container Service delivers high availability, elastic scaling, secure isolation, and cost‑effective serverless containers for production AI agents.
References
AgentScope website: https://agentscope.io/
Alibaba Cloud Container Service (ACK): https://www.aliyun.com/product/ack
Alibaba Cloud Container Compute Service (ACS): https://www.aliyun.com/product/acs
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.