How Alibaba Cloud’s Kubernetes Service Enables Seamless Monitoring and Autoscaling
Alibaba Cloud's Kubernetes service integrates four native monitoring services (SLS, ARMS, AHAS, and Cloud Monitor) and enhances open-source monitoring components. It also provides autoscaling mechanisms such as HPA, VPA, cronHPA, Resizer, Cluster-Autoscaler, and virtual-kubelet-autoscaler, giving cloud-native applications robust observability and elastic scaling.
Background
At KubeCon + CloudNativeCon + Open Source Summit, Alibaba Cloud presented its large-scale cloud-native practices, launched China's first open Cloud Native Application Hub, and announced the open-source project OpenKruise. The company also introduced edge containers (ACK@Edge) and a comprehensive cloud-native application management and delivery system.
Monitoring Overview
Alibaba Cloud Container Service for Kubernetes (ACK) integrates four native cloud‑monitoring services:
SLS (Log Service) – collects logs from API server components, service‑mesh/ingress layers, and application standard logs, and provides built‑in audit, observability, and analysis capabilities.
ARMS (Application Real‑time Monitoring Service) – gathers performance metrics for Java and PHP applications, including JVM GC counts, slow SQL, and call stacks.
AHAS (Application High Availability Service) – provides architecture-aware monitoring, visualizing service topology and network flow to aid rapid diagnosis of micro-service issues.
Cloud Monitor – the general cloud‑monitoring platform.
These services are integrated with ACK by default and can be enabled when a cluster is created.
Open‑Source Integration
ACK enhances and integrates open‑source monitoring solutions in two areas:
Kubernetes Built‑in Monitoring Enhancements
The community's heapster/metrics-server and related components are extended for version compatibility. Node diagnostics are improved with extensions to NPD (node-problem-detector) that add file-handle monitoring, NTP sync checks, and network validation, and with an open-source eventer that forwards cluster events to SLS, Kafka, or DingTalk for ChatOps.
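To make the event-forwarding idea concrete, here is a minimal sketch of how an eventer-style component might filter Kubernetes events and shape them for a chat notification. The `Event` fields, the `min_level` parameter, and the function names are illustrative assumptions rather than the eventer's actual implementation. The payload follows the plain text-message shape accepted by DingTalk custom robot webhooks, and nothing is sent over the network.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical, simplified view of a Kubernetes event."""
    type: str       # "Normal" or "Warning"
    reason: str     # e.g. "NTPProblem"
    namespace: str
    name: str       # name of the involved object
    message: str


# Ordering used to compare event levels.
LEVELS = {"Normal": 0, "Warning": 1}


def should_forward(event, min_level="Warning"):
    """Forward only events at or above the configured level."""
    return LEVELS.get(event.type, 0) >= LEVELS[min_level]


def to_dingtalk_payload(event):
    """Build a text-message body for a chat robot webhook."""
    content = (
        f"[{event.type}] {event.reason}\n"
        f"Object: {event.namespace}/{event.name}\n"
        f"Message: {event.message}"
    )
    return {"msgtype": "text", "text": {"content": content}}


def forward(events, min_level="Warning"):
    """Return the payloads that would be posted to the chat sink."""
    return [to_dingtalk_payload(e) for e in events if should_forward(e, min_level)]
```

Filtering before forwarding keeps routine `Normal` events out of the chat channel, so only actionable warnings reach the on-call group.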
Prometheus Ecosystem Enhancements
Storage & Performance – supports product‑grade TSDB/InfluxDB for durable, high‑performance metric storage.
Metric Collection – fixes accuracy issues and adds exporters for GPU (single‑card, multi‑card, shared‑slice) metrics.
Higher‑Level Observability – provides CRD metric sets for Argo, Spark, TensorFlow, and multi‑tenant scenarios.
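As an illustration of what a GPU exporter hands to Prometheus, the sketch below renders gauge samples in the Prometheus text exposition format. The metric name `gpu_memory_used_bytes` and the `node`, `gpu`, and `pod` labels are hypothetical, chosen only to show how a shared-slice scenario (several pods on one card) could be labeled. They are not the names used by ACK's exporters.

```python
def escape_label(value):
    """Escape backslashes, quotes, and newlines in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render a gauge in Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label(str(val))}"'
            for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two pods sharing GPU 0 on the same node (hypothetical values).
    print(render_gauge(
        "gpu_memory_used_bytes",
        "GPU memory in use, per pod.",
        [
            ({"node": "node-a", "gpu": "0", "pod": "train-1"}, 4294967296),
            ({"node": "node-a", "gpu": "0", "pod": "infer-1"}, 2147483648),
        ],
    ))
```

Labeling each sample with both the card and the pod lets a query aggregate per card for capacity planning or per pod for tenant-level accounting.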
Autoscaling Overview
ACK offers two categories of autoscaling components: scheduling‑level and resource‑level.
Scheduling‑Level Autoscaling Components
HPA – horizontal pod autoscaler, extended with an external-metrics-adapter to use cloud service metrics (e.g., Ingress QPS/RT, ARMS GC count, slow‑SQL count).
VPA – vertical pod autoscaler for stateful service scaling and upgrades.
cronHPA – time-based scaler that adjusts replica counts on a schedule for workloads with predictable periodic load.
Resizer – controller that scales core cluster components (e.g., CoreDNS) based on CPU cores or node count.
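The scaling decision behind HPA can be sketched with the replica formula documented for the upstream Kubernetes Horizontal Pod Autoscaler: the desired count is the current count scaled by the ratio of observed to target metric value, rounded up, with no change when the ratio is within a tolerance (0.1 by default upstream). The function name and the `min_replicas`/`max_replicas` defaults below are illustrative, and the Ingress QPS figures in the usage note are made up.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Compute the replica count an HPA-style controller would request.

    current_value is the observed average metric per pod (for example,
    Ingress QPS per pod); target_value is the configured target.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    # Within tolerance of the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For example, with 4 replicas each serving 150 QPS against a target of 100 QPS per pod, `desired_replicas(4, 150, 100)` asks for 6 replicas. At 105 QPS per pod the ratio falls inside the tolerance band and the count stays at 4.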
Resource‑Level Autoscaling Components
Cluster‑Autoscaler – adds nodes when pods cannot be scheduled due to insufficient resources.
virtual‑kubelet‑autoscaler – open‑source component that creates virtual nodes and launches pods on Elastic Container Instances (ECI) when physical nodes are exhausted.
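The interplay of the two resource-level components can be pictured as a simple placement decision. The sketch below is a toy model under stated assumptions, not the actual scheduler or autoscaler logic: the `Node` and `PodRequest` types, the first-fit check, the `node_count_limit` parameter, and the ordering (try existing nodes, then add a node, then fall back to ECI) are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_m: int    # free CPU in millicores
    free_mem_mi: int   # free memory in MiB


@dataclass
class PodRequest:
    name: str
    cpu_m: int
    mem_mi: int


def fits(pod, node):
    """True if the node has enough free CPU and memory for the pod."""
    return pod.cpu_m <= node.free_cpu_m and pod.mem_mi <= node.free_mem_mi


def place(pod, nodes, node_count_limit):
    """Decide where a pod goes in this toy model.

    Returns "node:<name>" if an existing node fits (and reserves the
    resources), "scale-out" if no node fits but the node pool may still
    grow, or "eci" if the pool is already at its limit.
    """
    for node in nodes:
        if fits(pod, node):
            node.free_cpu_m -= pod.cpu_m
            node.free_mem_mi -= pod.mem_mi
            return f"node:{node.name}"
    if len(nodes) < node_count_limit:
        return "scale-out"
    return "eci"
```

In this model, "scale-out" stands for the Cluster-Autoscaler path (add a node, then schedule), while "eci" stands for the virtual-kubelet-autoscaler path, which runs the pod on a virtual node without waiting for a new machine.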
Demo Showcase
The demo application consists of an apiservice that calls a sub-apiservice, which in turn accesses a database, with traffic managed by an Ingress. Load is generated with PTS (Performance Testing Service), logs are collected by SLS, performance metrics are gathered by ARMS, and external metrics are exposed through alibaba-cloud-metrics-adapter to trigger HPA scaling. When the existing nodes have no capacity left for new pods, the virtual-kubelet-autoscaler creates an ECI instance to handle the excess load.
Conclusion
Using ACK’s monitoring and autoscaling capabilities requires only a one‑click installation of the relevant Helm charts. The integrated multi‑dimensional observability and elastic scaling enable cloud‑native applications to achieve high stability and resilience at minimal cost.
Alibaba Cloud Native
