Unlock Second-Scale Elastic Scheduling with ACK Virtual Nodes
This article explains how to use Alibaba Cloud Container Service (ACK) virtual nodes and Elastic Container Instances (ECI) to achieve second‑scale elasticity, covering installation, ResourcePolicy configuration, zone‑aware scheduling, high‑availability setups, and performance results with concrete YAML examples.
Background
Alibaba Cloud Container Service for Kubernetes (ACK) extends the standard K8s scheduler with elastic scheduling capabilities. By integrating Elastic Container Instances (ECI), ACK can provision resources within seconds, enabling rapid response to traffic spikes and large‑scale data processing demands.
Installing Virtual Nodes
To use ECI in an ACK cluster, install the virtual‑node component:
In an ACK Pro cluster, deploy ack-virtual-node via the component management page; the component is managed and does not consume worker node resources.
In an ACK Dedicated cluster, install the component from the marketplace, which creates an ack-virtual-node-controller deployment in the kube-system namespace that runs on your worker nodes.
After installation, kubectl get nodes will list the virtual nodes, confirming successful deployment.
Configuring Elastic Scheduling with ResourcePolicy
Define a ResourcePolicy to specify the order in which resources are used. The following example prefers ECS resources and falls back to ECI when ECS is exhausted:
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: test
spec:
  strategy: prefer
  units:
  - resource: ecs
  - resource: eci
Apply this policy to the default namespace; all pods will follow the defined scheduling rule.
Note: This configuration disables preemption on ECS nodes. To retain preemption, set preemptPolicy=BeforeNextUnit. Use selector to limit the policy’s scope.
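The two fields named in the note could be combined as in the following sketch. It assumes that preemptPolicy and selector sit directly under spec and that selector takes plain label key-value pairs; the app: nginx label is a hypothetical example rather than something the source specifies. Verify the field placement against the ResourcePolicy reference for your cluster version before relying on it.

```yaml
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: test
spec:
  # Assumed placement: retain preemption on ECS before moving on to the next unit.
  preemptPolicy: BeforeNextUnit
  # Assumed form: only pods carrying this (hypothetical) label follow the policy.
  selector:
    app: nginx
  strategy: prefer
  units:
  - resource: ecs
  - resource: eci
```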
Performance Demonstration
Deploy a Deployment with eight replicas; only seven pods are scheduled initially. After applying the ResourcePolicy and scaling the replica count to ten, the new pods are placed on ECI, and pod creation takes about 13 seconds, far faster than the minute‑level scaling of traditional node pools.
[Figures omitted: pod distribution before and after scaling]
Zone‑Aware Scheduling for Big Data
When a cluster has virtual nodes in multiple zones, ECI pods may be scheduled across zones, increasing network latency for big‑data workloads. To keep pods in the same zone, use node‑affinity or pod‑affinity constraints.
Manual zone specification (node affinity):
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx-deployment-basic
spec:
  replicas: 9
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - cn-hongkong-c
      containers:
      - image: 'nginx:1.7.9'
        imagePullPolicy: IfNotPresent
        name: nginx
        resources:
          limits:
            cpu: 1500m
          requests:
            cpu: 1500m
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
With the above, all pods run in zone C; because the cluster’s ECS nodes are in zone D, the pods are placed on ECI in zone C.
Pod‑affinity for optimal zone selection:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx-deployment-basic
spec:
  replicas: 9
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx
            topologyKey: topology.kubernetes.io/zone
      containers:
      - image: 'nginx:1.7.9'
        imagePullPolicy: IfNotPresent
        name: nginx
        resources:
          limits:
            cpu: 1500m
          requests:
            cpu: 1500m
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
When combined with a policy that prefers ECI, the scheduler selects the zone with the most available ECI capacity, reducing inter‑pod communication latency.
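The source does not show a policy that prefers ECI. One plausible form, assuming the order of units sets the priority just as in the ECS-first ResourcePolicy earlier in this article, lists eci first. The policy name prefer-eci is hypothetical.

```yaml
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: prefer-eci   # hypothetical name
spec:
  strategy: prefer
  units:
  # Assumption: units are tried in listed order, so ECI is used first
  # and ECS serves as the fallback.
  - resource: eci
  - resource: ecs
```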
Note: Pod‑affinity can cause subsequent ECI pods to follow the zone of the first (often an ECS pod). Use preferredDuringSchedulingIgnoredDuringExecution to avoid hard binding.
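The softer variant mentioned in the note might look like the fragment below, which would replace the affinity block in the pod template of the Deployment. The weight of 100 is an assumed value (Kubernetes accepts 1 to 100); with a preferred rule the scheduler favors co-locating pods in one zone but can still place them elsewhere when that zone has no capacity.

```yaml
affinity:
  podAffinity:
    # Soft rule: prefer the zone that already runs app=nginx pods,
    # but do not block scheduling if that zone is full.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100   # assumed value; valid range is 1-100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - nginx
        topologyKey: topology.kubernetes.io/zone
```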
High Availability Across Zones
To ensure business continuity, distribute pods evenly across zones while falling back to ECI when ECS capacity is insufficient. The following combined manifest achieves this:
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: test
spec:
  strategy: prefer
  units:
  - resource: ecs
  - resource: eci
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx-deployment-basic
spec:
  replicas: 9
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: nginx
        maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
      containers:
      - image: 'nginx:1.7.9'
        imagePullPolicy: IfNotPresent
        name: nginx
        resources:
          limits:
            cpu: 1500m
          requests:
            cpu: 1500m
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
After applying, pods are distributed with a maximum skew of 1 (e.g., 5 pods in zone D, 4 pods in zone C), satisfying the high‑availability constraint.
Future Directions
ACK’s elastic scheduling, built on the standard K8s framework, continues to evolve. Upcoming articles will explore managing and scheduling AI workloads on ACK, helping enterprises accelerate AI task deployment in the cloud.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.
