
Achieving Low‑Cost, High‑Elastic Kubernetes Deployments with ACK, ECI, and OpenKruise

This article explains how to combine Kubernetes-native autoscaling components (HPA, VPA, and Cluster Autoscaler) with Alibaba Cloud extensions such as Virtual Node, Elastic Container Instance, and Elastic Workload, plus the open-source OpenKruise, to build a cost-effective, highly elastic architecture on ACK clusters.

Alibaba Cloud Native

May 5, 2022

Background and Problem

When containerized applications are migrated to Kubernetes, insufficient Node resources in a cluster can cause Pods to remain pending, while over‑provisioning Nodes leads to idle capacity and wasted cost. The goal is to leverage Kubernetes orchestration and cloud elasticity to achieve high availability at low cost.

Two‑Layer Elasticity Model

The Kubernetes ecosystem provides three autoscaling components: Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), and Cluster Autoscaler (CA). HPA and VPA operate at the scheduling layer, adjusting Pod replica counts or resource requests, whereas CA works at the resource layer, adding Nodes when cluster capacity is insufficient and removing them when it is idle.

Scheduling‑Layer Autoscaling

HPA: Built-in controller that scales the replica count of a Deployment or StatefulSet based on metrics such as CPU or memory utilization, adding or removing Pods to keep the observed value near the target.
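The target-tracking behavior follows the replica formula given in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below covers only that arithmetic. The function name is illustrative, the 10% tolerance is meant to mirror the controller's documented default, and the real controller also applies min/max bounds and stabilization windows that are not modeled here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the HPA formula (simplified sketch)."""
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

For example, four replicas averaging 90% CPU against a 60% target would be scaled to six, while the same four replicas at 63% would be left alone.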

VPA: Community component that adjusts the resource requests and limits of Pods vertically. It can also run in a recommendation-only mode that suggests request values for better utilization without applying them.

Resource‑Layer Autoscaling

Cluster Autoscaler (CA): Open-source controller that watches for pending Pods, simulates scheduling them onto template Nodes of the configured node groups, and expands a node group when doing so would let the Pods run. It also drains and removes under-utilized Nodes.
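To make the "simulate, then expand" idea concrete, here is a deliberately simplified sketch. It is not the Cluster Autoscaler algorithm: it considers CPU requests only and uses first-fit-decreasing packing, whereas the real controller runs full scheduler predicates and configurable expander strategies. The function name and the millicore inputs are assumptions made for illustration.

```python
from __future__ import annotations


def nodes_needed(pending_cpu_millis: list[int], node_cpu_millis: int) -> int:
    """Estimate how many template Nodes the pending Pods need (CPU only)."""
    free: list[int] = []  # remaining CPU on each simulated new Node
    for request in sorted(pending_cpu_millis, reverse=True):
        if request > node_cpu_millis:
            raise ValueError("Pod does not fit on an empty Node of this group")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] = remaining - request
                break
        else:
            # No simulated Node has room, so "add" another one.
            free.append(node_cpu_millis - request)
    return len(free)
```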

Virtual Node and Elastic Container Instance (ECI)

Virtual Node (based on Virtual Kubelet) registers in the cluster as a Node but delegates the running of its Pods to an external provider instead of a real machine. Alibaba Cloud implements this as ACK-Virtual-Node, which creates an Elastic Container Instance (ECI) on demand for each Pod scheduled to it, so Pods can run without pre-provisioned ECS Nodes.

Scheduling Pods to ECI

Add the label alibabacloud.com/eci=true to individual Pods.

Add the same label to a Namespace to affect all Pods within it.

Use the annotation alibabacloud.com/burst-resource with values eci (fallback to ECI when ECS is insufficient) or eci_only (use ECI exclusively).

All three methods require editing the labels or annotations of existing workloads or Namespaces, but nothing else about the cluster has to be redesigned.
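The Pod-level options above can be sketched as a small helper that patches a manifest held as a Python dictionary. The label and annotation keys and values come from the list above; the function name and the `mode` parameter are hypothetical, introduced only for this example.

```python
import copy

ECI_LABEL = "alibabacloud.com/eci"
BURST_ANNOTATION = "alibabacloud.com/burst-resource"


def schedule_to_eci(pod: dict, mode: str = "label") -> dict:
    """Return a copy of a Pod manifest marked for ECI scheduling.

    mode "label"    -> add the ECI label
    mode "eci"      -> annotation: fall back to ECI when ECS is insufficient
    mode "eci_only" -> annotation: use ECI exclusively
    """
    patched = copy.deepcopy(pod)  # leave the caller's manifest untouched
    metadata = patched.setdefault("metadata", {})
    if mode == "label":
        metadata.setdefault("labels", {})[ECI_LABEL] = "true"
    elif mode in ("eci", "eci_only"):
        metadata.setdefault("annotations", {})[BURST_ANNOTATION] = mode
    else:
        raise ValueError(f"unknown mode: {mode}")
    return patched
```

For a Deployment or other controller, the same keys would go on the Pod template's metadata (`spec.template.metadata`) rather than on the controller object itself, since that is what ends up on the Pods.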

Elastic Workload (Alibaba) and OpenKruise

Elastic Workload is a proprietary Alibaba Cloud component that adds a new resource type, ElasticWorkload. Like HPA, it is attached to an existing workload from the outside, so the original workload does not need to be changed. It clones the original workload into “elastic units” and distributes replicas between the original and the units, enabling fine-grained scaling and priority-based deletion.

OpenKruise is an open‑source suite that extends Kubernetes with enhanced Workloads (CloneSet, Advanced StatefulSet, etc.) and a WorkloadSpread resource. WorkloadSpread defines multiple subsets (e.g., ECS and ECI) and controls how Pods are distributed across them, supporting horizontal dispersion, weighted placement, and priority‑based rules.
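A WorkloadSpread with an ECS subset and an ECI subset might look roughly like the manifest built below, expressed as a Python dictionary that can be dumped to JSON or YAML. This is a sketch, not a tested production config: the node label `type=virtual-kubelet` and the toleration key `virtual-kubelet.io/provider` are assumptions about how the virtual Node is labeled and tainted, so check them against your cluster. The function name and arguments are hypothetical.

```python
import json


def workload_spread(name: str, clone_set: str, ecs_max: int) -> dict:
    """Build a WorkloadSpread that fills ECS first and overflows to ECI."""
    virtual_node = {"key": "type", "values": ["virtual-kubelet"]}  # assumed label
    return {
        "apiVersion": "apps.kruise.io/v1alpha1",
        "kind": "WorkloadSpread",
        "metadata": {"name": name},
        "spec": {
            "targetRef": {
                "apiVersion": "apps.kruise.io/v1alpha1",
                "kind": "CloneSet",
                "name": clone_set,
            },
            # Subsets are listed in priority order.
            "subsets": [
                {
                    "name": "ecs",
                    "maxReplicas": ecs_max,
                    "requiredNodeSelectorTerm": {
                        "matchExpressions": [{**virtual_node, "operator": "NotIn"}]
                    },
                },
                {
                    # No maxReplicas: this subset absorbs the overflow.
                    "name": "eci",
                    "requiredNodeSelectorTerm": {
                        "matchExpressions": [{**virtual_node, "operator": "In"}]
                    },
                    "tolerations": [
                        {"key": "virtual-kubelet.io/provider", "operator": "Exists"}
                    ],
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(workload_spread("demo-spread", "demo", 10), indent=2))
```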

Choosing Between Elastic Workload and WorkloadSpread

If using Deployment on a Kubernetes version < 1.21, Elastic Workload is the only option.

If using Deployment on version ≥ 1.21, prefer WorkloadSpread.

If using CloneSet (OpenKruise), WorkloadSpread is applicable on any version ≥ 1.16.
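These three rules can be written down as a small decision helper. The function names are hypothetical, and combinations the rules above do not cover (for example CloneSet on a version below 1.16) raise an error rather than guess.

```python
from __future__ import annotations


def parse_version(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.22.3' or '1.20.11'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def choose_component(workload_kind: str, k8s_version: str) -> str:
    """Pick between Elastic Workload and WorkloadSpread per the rules above."""
    version = parse_version(k8s_version)
    if workload_kind == "Deployment":
        return "WorkloadSpread" if version >= (1, 21) else "Elastic Workload"
    if workload_kind == "CloneSet" and version >= (1, 16):
        return "WorkloadSpread"
    raise ValueError(f"no rule for {workload_kind} on Kubernetes {k8s_version}")
```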

Low‑Cost, High‑Elasticity Practice at Bixin

Bixin combines the above components to handle several typical scenarios:

Job tasks (e.g., Flink, Jenkins) run exclusively on ECI via the ECI label.

Deployments without HPA use ECI elastic scheduling; when ECS capacity is sufficient, Pods run on ECS, otherwise they fall back to ECI.

Deployments with HPA are wrapped by Elastic Workload so that HPA‑driven Pods are placed on ECI, and scaling down preferentially removes ECI instances.

CloneSet workloads use WorkloadSpread to define ECS and ECI subsets, ensuring normal Pods prefer ECS while overflow Pods go to ECI; manual scaling prefers deleting ECI first.
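The placement policy that runs through these scenarios (fill ECS first, overflow to ECI, remove ECI Pods first when scaling in) can be illustrated with a toy model. It is only a sketch of the intended outcome, not the logic of Elastic Workload or WorkloadSpread themselves; the function names and the fixed `ecs_capacity` argument are assumptions.

```python
def place(replicas: int, ecs_capacity: int) -> dict:
    """Split replicas so ECS is filled before any Pod goes to ECI."""
    ecs = min(replicas, ecs_capacity)
    return {"ecs": ecs, "eci": replicas - ecs}


def scale_in(placement: dict, remove: int) -> dict:
    """Remove Pods, taking them from ECI before touching ECS."""
    from_eci = min(remove, placement["eci"])
    from_ecs = min(remove - from_eci, placement["ecs"])
    return {
        "ecs": placement["ecs"] - from_ecs,
        "eci": placement["eci"] - from_eci,
    }
```

With an ECS capacity of 10, scaling to 12 replicas puts two Pods on ECI; scaling in by three then removes both ECI Pods and one ECS Pod.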

Monitoring and Cost Control

ECI is billed per second, which suits short bursts, but a Pod that stays on ECI for a long time can become expensive. Bixin therefore monitors ECI usage and triggers alerts for instances running beyond a threshold (e.g., three days) so that owners can restart the Pods and let them be scheduled back to ECS.
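A check of this kind could be as simple as the sketch below. The source does not describe Bixin's actual tooling, so everything here is illustrative: Pods are plain dictionaries with hypothetical `name`, `node`, and `started` keys, and ECI Pods are assumed to be recognizable by a node name starting with `virtual-kubelet`.

```python
from datetime import datetime, timedelta, timezone


def long_running_eci_pods(pods, now, threshold=timedelta(days=3),
                          eci_node_prefix="virtual-kubelet"):
    """Return names of Pods that have run on ECI longer than the threshold."""
    flagged = []
    for pod in pods:
        on_eci = pod["node"].startswith(eci_node_prefix)  # assumed naming
        if on_eci and now - pod["started"] > threshold:
            flagged.append(pod["name"])
    return flagged


if __name__ == "__main__":
    now = datetime(2022, 5, 5, tzinfo=timezone.utc)
    pods = [
        {"name": "job-a", "node": "virtual-kubelet-cn-hangzhou-h",
         "started": now - timedelta(days=5)},
        {"name": "web-b", "node": "cn-hangzhou.10.0.0.12",
         "started": now - timedelta(days=30)},
    ]
    print(long_running_eci_pods(pods, now))
```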

Using VPA for Request Recommendations

Even though VPA’s automatic vertical scaling is experimental, Bixin deploys VPA in Off mode, in which it only computes recommended request values and never changes running Pods. These recommendations are read from the status of each VerticalPodAutoscaler object and aggregated per application to guide manual right-sizing.
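As a rough illustration, the sketch below builds a VerticalPodAutoscaler manifest with `updateMode: "Off"` and reads the per-container `target` values back from the object's status. The field names follow the `autoscaling.k8s.io/v1` API; the helper names are hypothetical, and how Bixin aggregates values across applications is not described in the source, so that step is left out.

```python
def vpa_off(name: str, target_kind: str, target_name: str) -> dict:
    """Build a VPA that only recommends and never evicts or resizes Pods."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": target_kind,
                "name": target_name,
            },
            "updatePolicy": {"updateMode": "Off"},
        },
    }


def recommended_requests(vpa: dict) -> dict:
    """Map container name -> recommended target requests from VPA status."""
    recs = (
        vpa.get("status", {})
        .get("recommendation", {})
        .get("containerRecommendations", [])
    )
    return {rec["containerName"]: rec["target"] for rec in recs}
```

A freshly created VPA has no status yet, in which case `recommended_requests` returns an empty dictionary.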

Conclusion

The article outlines how HPA, VPA, CA, Virtual Node, ECI, Elastic Workload, and OpenKruise can be combined to build a Kubernetes deployment that minimizes cost while maintaining elasticity. Ongoing monitoring and selective use of proprietary versus open‑source components allow teams to adapt the solution to their cloud provider and Kubernetes version.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
