Mastering Elastic Scheduling in Alibaba Cloud ACK for Cost‑Effective Resource Management
This article explains how Alibaba Cloud Container Service (ACK) extends Kubernetes scheduling with custom elastic resource priority, reverse‑order scaling, and resource caps, providing step‑by‑step examples and YAML policies to help enterprises optimize cloud resource allocation and reduce costs.
Background and Challenges
In the cloud era, enterprises can obtain massive compute resources on demand, but the variety of instance types (ECS, ECI) and billing models (monthly subscription, pay‑as‑you‑go, spot instances) increase the complexity of resource management. ACK’s node‑pool feature automates scaling across zones and instance specs, yet customers still face two main challenges:
Differentiated control of business resource usage: ensuring that high‑priority workloads run on stable, subscription‑based ECS instances while limiting other workloads to spot or elastic instances.
Pods not released during scale‑down: the default scale‑down behavior does not prioritize removing pods from newly added elastic nodes, so those nodes cannot be reclaimed, causing unnecessary charges and requiring manual migration.
Elastic scheduling aims to address these issues by providing multi‑level priority ordering for both scheduling and scale‑down.
Custom Elastic Resource Priority Scheduling
ACK builds on the standard Kubernetes scheduler and adds a “Custom Elastic Resource Priority Scheduling” feature. It offers the following capabilities:
Custom priority policies: define the order in which pods are placed on different node types during deployment or scale‑out.
Reverse‑order scale‑down: when scaling down via HPA, pods are terminated in the inverse of the priority order, ensuring elastic resources are released first.
Dynamic policy updates: a modified policy takes effect for subsequent scheduling and scale‑down decisions without requiring changes to the Deployment.
Resource usage caps: limit the amount of resources a workload can consume on each instance type.
Multiple usage‑statistics strategies: options such as ignoring terminating pods, or ignoring pods scheduled before a policy change, when counting a unit's usage.
Optimized rolling updates: pods created during a Deployment rollout are treated as a separate group, eliminating the need to manually adjust the ResourcePolicy mid‑update.
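The usage‑statistics strategies above are configured as boolean fields on the ResourcePolicy itself. A minimal sketch follows; the field names (`ignoreTerminatingPod`, `ignorePreviousPod`) and the placeholder names are assumptions to verify against the ResourcePolicy CRD shipped with your ACK version:

```yaml
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: example-with-statistics   # hypothetical name
  namespace: default
spec:
  selector:
    app: example                  # placeholder label; match your workload's pods
  strategy: prefer
  # Usage-statistics strategies (assumed field names; check your CRD):
  ignoreTerminatingPod: true      # do not count pods that are terminating
  ignorePreviousPod: true         # do not count pods scheduled before this policy existed
  units:
  - nodeSelector:
      alibabacloud.com/nodepool-id: np-example-ecs   # hypothetical node pool ID
    resource: ecs
```

With both flags set, a unit's capacity check reflects only live pods placed after the policy took effect, which avoids double‑counting during rollouts and policy changes.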
Scenario 1 – Persistent ECS + Auto‑Scaling Spot Instances with Reverse Scale‑Down
Spot instances (formerly bid instances) provide up to 90 % cost savings compared to on‑demand ECS but can be reclaimed at any time. A typical pattern is to run steady workloads on subscription ECS and burst traffic on spot instances.
When traffic peaks, the Horizontal Pod Autoscaler (HPA) creates additional pods, which trigger the node‑pool autoscaler to provision spot instances. After traffic subsides, the reverse‑order scale‑down policy ensures pods on spot nodes are terminated first, allowing the autoscaler to reclaim those nodes and avoid extra charges.
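The scale‑out half of this pattern is driven by a standard Kubernetes HorizontalPodAutoscaler; the ResourcePolicy only decides where the new pods land. A minimal sketch, with illustrative names and thresholds:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa          # hypothetical name
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx            # hypothetical Deployment managed by the ResourcePolicy's selector
  minReplicas: 2           # steady-state pods, kept on subscription ECS
  maxReplicas: 10          # burst pods spill over to spot-instance nodes
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out when average CPU exceeds 70%
```

When CPU load drops and the HPA reduces replicas, the reverse‑order scale‑down policy removes the burst pods on spot nodes first, letting the node‑pool autoscaler reclaim those nodes.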
The following ResourcePolicy makes pods prefer nodes in the ECS node pool (the first unit) and fall back to the spot‑instance node pool (the second unit) only when the ECS nodes are saturated:
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: $example-name
  namespace: $example-namespace
spec:
  selector:
    $example-label-key: $example-label-value
  strategy: prefer
  units:
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-ecs-nodepool-id
    resource: ecs
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-spot-instance-nodepool-id
    resource: ecs

After applying the policy, pods are first scheduled to nodes in the ECS node pool (the first unit) and move to the spot node pool (the second unit) only when the former is full. During scale‑down, pods on spot nodes are removed before those on ECS nodes, achieving the desired reverse‑order behavior.
Scenario 2 – Using the Max Option to Limit Resource Usage
Rolling updates often use a "create‑then‑delete" strategy, temporarily increasing resource consumption. To prevent a workload from exhausting a high‑priority resource pool, a max field can be added to a ResourcePolicy unit, capping the number of pods allowed on that unit.
apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: $example-name
  namespace: $example-namespace
spec:
  selector:
    $example-label-key: $example-label-value
  strategy: prefer
  units:
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-ecs-nodepool-id
    resource: ecs
    max: $example-max
  - nodeSelector:
      alibabacloud.com/nodepool-id: $example-spot-instance-nodepool-id
    resource: ecs

When the pod count on the first unit reaches max, subsequent pods are scheduled to the next unit. If every unit is at its cap, the remaining pods stay pending, effectively throttling the workload.
In this example, with max set to 1 on the first unit, the policy limits the nginx application to a single pod in the primary ECS node pool; excess pods are placed on the second unit, demonstrating fine‑grained control over resource distribution.
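The workload side of this scenario is an ordinary Deployment whose pod labels match the ResourcePolicy's selector. A hedged sketch, assuming the label app: nginx is what the policy selects and max is set to 1 on the first unit:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx              # hypothetical workload name
  namespace: default
spec:
  replicas: 3              # with max: 1 on the first unit, 1 pod lands on ECS, 2 spill to the second unit
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx         # must match the ResourcePolicy's spec.selector
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        resources:
          requests:
            cpu: 500m      # requests matter: unit capacity is judged against them
```

Scaling this Deployment up never adds more than one pod to the capped unit; scaling it down removes the spill‑over pods first under the reverse‑order policy.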
What’s Next
The elastic scheduling feature also supports advanced statistics strategies, label‑based intelligent grouping, and other extensions. By configuring custom priority policies, reverse‑order scaling, and resource caps, enterprises can achieve more efficient cloud resource allocation, lower operational costs, and better handle growth‑driven workload fluctuations.
Alibaba Cloud Native