Multi-Region Serverless Compute Scheduling with Alibaba Cloud ACK One Registered Cluster
This guide explains how Alibaba Cloud's ACK One registered cluster provides multi‑region serverless GPU compute scheduling, addressing AI workload elasticity by using region‑specific labels, ResourcePolicy, and the ack‑co‑scheduler to automatically balance resources across regions.
As enterprises deepen their digital transformation, infrastructure flexibility and scalability become critical. Traditional IDC data centers lack elasticity, which prompts enterprises to adopt Alibaba Cloud ACK One registered clusters. These clusters can be connected within minutes and offer full Kubernetes compatibility and serverless elasticity.
In the AI era, models with massive parameter counts drive up compute demand and expose the limits of single-region GPU supply: the available GPU types differ between regions and inventory fluctuates, both of which hinder high-concurrency inference workloads.
Alibaba Cloud introduces a multi‑region serverless compute scheduling solution for ACK One, aiming to provide unlimited compute supply across regions, enabling large‑scale, low‑latency AI inference deployments.
Users can create an ACK One registered cluster, enable the virtual node component, and label workloads with alibabacloud.com/serverless-region-id to target a specific region. Example deployment YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-gpu-specified-region
  name: nginx-gpu-deployment-specified-region
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-gpu-specified-region
  template:
    metadata:
      labels:
        alibabacloud.com/acs: "true"
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/compute-qos: default
        alibabacloud.com/gpu-model-series: example-model # replace with actual model, e.g., T4
        alibabacloud.com/serverless-region-id: <region-id> # specify region
        app: nginx-gpu-specified-region
    spec:
      containers:
      - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
        imagePullPolicy: IfNotPresent
        name: nginx
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"

To achieve dynamic, multi-region scheduling, the ack-co-scheduler’s ResourcePolicy can be used. The policy selects pods by label and defines an ordered list of units: the scheduler first tries the preferred region and falls back to the next ones when resources are insufficient. Example policy YAML:

apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: multi-vk-gpu-resourcepolicy
  namespace: default
spec:
  selector:
    app: nginx-gpu-resourcepolicy
  units:
  - resource: acs
    nodeSelector:
      topology.kubernetes.io/region: <region-id-1> # preferred region
      type: virtual-kubelet
    podLabels:
      alibabacloud.com/serverless-region-id: <region-id-1>
      alibabacloud.com/compute-class: gpu
      alibabacloud.com/compute-qos: default
      alibabacloud.com/gpu-model-series: example-model
  - resource: acs
    nodeSelector:
      topology.kubernetes.io/region: <region-id-2> # fallback region
      type: virtual-kubelet
    podLabels:
      alibabacloud.com/serverless-region-id: <region-id-2>
      alibabacloud.com/compute-class: gpu
      alibabacloud.com/compute-qos: default
      alibabacloud.com/gpu-model-series: example-model

The business workload can then be deployed with the custom scheduler:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-gpu-resourcepolicy
  name: nginx-gpu-deployment-resourcepolicy
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-gpu-resourcepolicy
  template:
    metadata:
      labels:
        app: nginx-gpu-resourcepolicy
    spec:
      schedulerName: ack-co-scheduler
      containers:
      - image: 'mirrors-ssl.aliyuncs.com/nginx:stable-alpine'
        imagePullPolicy: IfNotPresent
        name: nginx
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          limits:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 1
            memory: 1Gi
            nvidia.com/gpu: "1"

These configurations enable automatic fallback to alternative regions when a region’s GPU capacity is exhausted, simplifying workload management while ensuring high availability for AI inference services.
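When more than two regions are involved, writing one unit per region by hand gets repetitive. The following is a minimal Python sketch, not part of any Alibaba Cloud tooling, that builds a ResourcePolicy manifest with the same fields as the example policy from an ordered list of regions. The function names build_resource_policy and pick_region, the region IDs region-a and region-b, and the free-GPU numbers are all hypothetical and exist only for illustration. pick_region is a toy model of the "try the first unit, then fall back" ordering described above; it does not reproduce the scheduler's actual logic.

```python
import json

# Label keys and values copied from the example manifests above.
GPU_POD_LABELS = {
    "alibabacloud.com/compute-class": "gpu",
    "alibabacloud.com/compute-qos": "default",
    "alibabacloud.com/gpu-model-series": "example-model",  # placeholder, as in the article
}


def build_resource_policy(name, namespace, app_label, regions):
    """Return a ResourcePolicy manifest (as a dict) with one unit per region.

    `regions` is ordered: the first entry is the preferred region and the
    rest are fallbacks, mirroring the order of `units` in the example policy.
    """
    if not regions:
        raise ValueError("at least one region is required")
    units = []
    for region in regions:
        pod_labels = {"alibabacloud.com/serverless-region-id": region}
        pod_labels.update(GPU_POD_LABELS)
        units.append({
            "resource": "acs",
            "nodeSelector": {
                "topology.kubernetes.io/region": region,
                "type": "virtual-kubelet",
            },
            "podLabels": pod_labels,
        })
    return {
        "apiVersion": "scheduling.alibabacloud.com/v1alpha1",
        "kind": "ResourcePolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"selector": {"app": app_label}, "units": units},
    }


def pick_region(regions, free_gpus, requested=1):
    """Toy model of ordered fallback: first region with enough free GPUs.

    Illustrates the ordering idea only; this is not the scheduler's code.
    Returns None when no region can satisfy the request.
    """
    for region in regions:
        if free_gpus.get(region, 0) >= requested:
            return region
    return None


if __name__ == "__main__":
    # Hypothetical region IDs, used purely for illustration.
    regions = ["region-a", "region-b"]
    policy = build_resource_policy(
        "multi-vk-gpu-resourcepolicy", "default", "nginx-gpu-resourcepolicy", regions
    )
    print(json.dumps(policy, indent=2))
    # region-a has no free GPU in this made-up inventory, so region-b is chosen.
    print(pick_region(regions, {"region-a": 0, "region-b": 4}))
```

The script prints the manifest as JSON, which kubectl accepts in place of YAML, so the output can be saved to a file and applied after the hypothetical region IDs are replaced with real ones.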
For more details, refer to the linked documentation. To request ACS GPU resources, submit a ticket.
Alibaba Cloud Infrastructure
For uninterrupted computing services