
Can AHPA Predict Kubernetes Scaling Before Load Spikes?

This article introduces the Advanced Horizontal Pod Autoscaler (AHPA), explains its three‑stage architecture of data collection, prediction, and scaling, details the RobustScaler forecasting algorithm and CRD‑based deployment, and evaluates its ability to proactively and reactively adjust pod counts with high robustness.

Alibaba Cloud Native

Background

Kubernetes workloads are typically scaled in one of three ways: a fixed replica count, the Horizontal Pod Autoscaler (HPA), or CronHPA. A fixed count wastes resources whenever load fluctuates. HPA reacts only after load has already risen, so new pods arrive late (elasticity lag). CronHPA scales on preset schedules, which are complex to configure and can still waste resources. To address these issues, the Advanced Horizontal Pod Autoscaler (AHPA) predicts future load from historical time‑series data, so it can scale proactively and reduce that lag.

AHPA Architecture

The AHPA system consists of three major components (see Figure 2):

Data Collection: Gathers metrics from sources such as Prometheus, Metrics Server, Log Service, or custom monitors, normalizes them, and forwards them to the Prediction module. Supported metrics include CPU, memory, GPU, QPS (queries per second), RT (response time), and user‑defined indicators.

Prediction: Runs a two‑stage pipeline: Preprocessing (filtering out pods that are not in the Running phase and handling missing data), followed by the RobustScaler algorithm (see Section “RobustScaler Algorithm”). A Revise sub‑module then adjusts the predicted pod count: it takes the larger of the proactive and reactive estimates and keeps the result within the user‑defined bounds.

Scaling: Executes pod scaling. Two modes are available: auto (replicas are adjusted automatically to the predicted count) and observer (a dry‑run mode for observing AHPA behavior without changing replica numbers).
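The Preprocessing stage described above can be sketched as follows. This is a minimal illustration, not AHPA's implementation: the `Sample` shape, the choice of CPU as the metric, and the use of linear interpolation for missing minutes are all assumptions, since the article does not say how gaps are filled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """One per-pod metric reading (hypothetical shape, not AHPA's schema)."""
    minute: int            # minutes since the start of the window
    pod: str
    phase: str             # Kubernetes pod phase, e.g. "Running" or "Pending"
    cpu: Optional[float]   # CPU cores used; None when the reading is missing


def fill_gaps(values):
    """Interpolate interior gaps linearly; copy the nearest value at the edges."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no usable samples in the window")
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(v)
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(values[right])
        elif right is None:
            filled.append(values[left])
        else:
            w = (i - left) / (right - left)
            filled.append(values[left] * (1 - w) + values[right] * w)
    return filled


def preprocess(samples, minutes):
    """Return one total-CPU value per minute, using Running pods only."""
    totals = [None] * minutes
    for s in samples:
        # Pods that are not Running, and missing readings, contribute nothing.
        if s.phase != "Running" or s.cpu is None:
            continue
        totals[s.minute] = (totals[s.minute] or 0.0) + s.cpu
    return fill_gaps(totals)
```

A minute with no usable reading stays `None` until `fill_gaps` replaces it, so a failed scrape is not mistaken for zero load.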

Deployment

AHPA is deployed in Kubernetes as two Deployments: the AHPA Algorithm (handling the Prediction logic) and the AHPA Controller (handling Data Collection and Scaling). A CustomResourceDefinition (CRD) named AdvancedHorizontalPodAutoscaler lets users configure scaling policies per application. An example resource of this kind is shown below:

apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: AdvancedHorizontalPodAutoscaler
metadata:
  name: ahpa-demo
spec:
  scaleStrategy: observer
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 40
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  maxReplicas: 100
  minReplicas: 2
  prediction:
    quantile: 95
    scaleUpForward: 180
  instanceBounds:
  - startTime: "2021-12-16 00:00:00"
    endTime: "2022-12-16 24:00:00"
    bounds:
    - cron: "* 0-8 ? * MON-FRI"
      maxReplicas: 15
      minReplicas: 4
    - cron: "* 9-15 ? * MON-FRI"
      maxReplicas: 15
      minReplicas: 10
    - cron: "* 16-23 ? * MON-FRI"
      maxReplicas: 20
      minReplicas: 15

The CRD allows users to set per‑application scaling limits, choose between auto and observer strategies, and define time‑based instance bounds for fine‑grained control.
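To make the time‑based bounds concrete, the sketch below looks up which `minReplicas`/`maxReplicas` pair from the example would apply at a given moment. It rests on an assumption the article does not confirm: that the five cron fields are minute, hour, day of month, month, and day of week. It also handles only the `a-b` range form used in the example, and ignores the `startTime`/`endTime` window. The helper names are hypothetical.

```python
from datetime import datetime
from typing import Optional

# Weekday names in the order used by datetime.weekday() (Monday == 0).
DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]

# The three entries from the example manifest.
BOUNDS = [
    {"cron": "* 0-8 ? * MON-FRI", "minReplicas": 4, "maxReplicas": 15},
    {"cron": "* 9-15 ? * MON-FRI", "minReplicas": 10, "maxReplicas": 15},
    {"cron": "* 16-23 ? * MON-FRI", "minReplicas": 15, "maxReplicas": 20},
]


def parse_range(field: str, names: Optional[list] = None) -> range:
    """Parse "a-b" (or a single value) into an inclusive range of ints."""
    def to_int(token: str) -> int:
        return names.index(token) if names else int(token)
    low, _, high = field.partition("-")
    return range(to_int(low), to_int(high or low) + 1)


def bounds_at(now: datetime, bounds=BOUNDS, default=(2, 100)):
    """Return (minReplicas, maxReplicas) for `now`, else the spec-level default."""
    for entry in bounds:
        # Assumed field order: minute, hour, day of month, month, day of week.
        _minute, hour, _day, _month, weekday = entry["cron"].split()
        in_hours = now.hour in parse_range(hour)
        in_days = now.weekday() in parse_range(weekday, DAYS)
        if in_hours and in_days:
            return entry["minReplicas"], entry["maxReplicas"]
    return default


print(bounds_at(datetime(2022, 3, 2, 10, 30)))  # a Wednesday morning
```

Under these assumptions, a weekday at 10:30 falls in the second entry, while a weekend falls back to the spec‑level `minReplicas: 2` and `maxReplicas: 100`.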

RobustScaler Algorithm

AHPA’s core forecasting capability relies on the RobustScaler algorithm, which combines two sub‑algorithms:

RobustPeriod: Detects multiple periodicities in a time series using the maximal overlap discrete wavelet transform (MODWT), which separates the cycles so that each can be identified without interference from the others.

RobustSTL (for periodic data) or RobustTrend (for non‑periodic data): Decomposes the series into trend, seasonal, and residual components. RobustSTL iteratively extracts these components until convergence; RobustTrend extracts trend and residual only.

After decomposition, the Forecasting module predicts future metric values:

Proactive Planning: Shifts detected seasonal components forward, forecasts the trend with exponential smoothing, and predicts residual upper bounds via quantile regression forests.

Reactive Planning: Uses the trend and residual from RobustTrend to estimate the next metric value based on recent minutes of data.
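A rough sketch of the proactive path follows, assuming the trend, seasonal, and residual components are already available as equal‑length lists. It uses simple exponential smoothing for the trend, as the article describes, but replaces the quantile regression forest with a plain empirical quantile of past residuals. The function names, the smoothing factor, and the default quantile are illustrative assumptions.

```python
import math


def exp_smooth(values, alpha=0.3):
    """Simple exponential smoothing; returns the final level as a flat forecast."""
    level = values[0]
    for v in values[1:]:
        level = alpha * v + (1 - alpha) * level
    return level


def upper_quantile(values, q):
    """Nearest-rank empirical quantile (stand-in for a quantile regression forest)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def proactive_forecast(trend, seasonal, residual, period, horizon, q=0.95):
    """Forecast `horizon` future steps as trend level + seasonal value + margin."""
    n = len(trend)
    level = exp_smooth(trend)
    margin = upper_quantile(residual, q)
    pattern = seasonal[:period]  # one full cycle, starting at phase 0
    # Shift the seasonal pattern forward: step n + h has phase (n + h) % period.
    return [level + pattern[(n + h) % period] + margin for h in range(horizon)]
```

Adding an upper residual quantile rather than the mean residual biases the forecast upward, which suits capacity planning, where under‑provisioning usually costs more than a few spare pods.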

Resource Model

The Resource Model translates predicted metrics into an estimated pod count. For a single metric, a linear model is used; for multiple metrics, a nonlinear model (e.g., queueing theory‑based) is applied.
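For the single‑metric case, one plausible linear model divides the predicted total load by the load each pod should carry at its target utilization. The sketch below assumes CPU measured in millicores and a utilization target like the `averageUtilization: 40` in the example manifest; the function and parameter names are hypothetical, and the article does not state that AHPA uses exactly this formula.

```python
import math


def pods_for_cpu(predicted_millicores, request_millicores, target_percent):
    """Pods needed so that average CPU utilization stays at or below the target."""
    # Load one pod should carry: its CPU request scaled by the target utilization.
    per_pod = request_millicores * target_percent / 100
    return max(1, math.ceil(predicted_millicores / per_pod))


# 3200m of predicted load, 500m requested per pod, 40% target utilization:
# each pod should carry 200m, so 16 pods are needed.
print(pods_for_cpu(3200, 500, 40))
```

The result is rounded up because a fractional pod cannot be scheduled, and rounding down would push utilization above the target.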

Model Training & Prediction Workflow

The end‑to‑end process consists of:

1. Collect the most recent n days of metric data.

2. Decompose the data with the Forecasting module to obtain periodic, trend, and residual components, and forecast future metric values from them.

3. Feed the forecasted metric values into the Resource Model to compute the expected pod count.

4. For reactive (short‑term) predictions, repeat steps 1 to 3 with minute‑level data.

5. Take the maximum of the proactive and reactive pod estimates as the final scaling decision.
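The final decision can be sketched in a few lines: take the larger of the two estimates, keep it within the configured replica bounds, and apply it only when the scale strategy is auto. The function names are hypothetical; the strategy strings mirror the `scaleStrategy` values described earlier.

```python
def final_replicas(proactive, reactive, min_replicas, max_replicas):
    """Larger of the two estimates, clamped to [min_replicas, max_replicas]."""
    desired = max(proactive, reactive)
    return max(min_replicas, min(max_replicas, desired))


def replicas_to_apply(strategy, current, desired):
    """Return the replica count that should actually be set on the workload."""
    if strategy == "observer":
        return current  # dry run: the decision is recorded but not applied
    if strategy == "auto":
        return desired
    raise ValueError(f"unknown scale strategy: {strategy!r}")


desired = final_replicas(proactive=12, reactive=18, min_replicas=10, max_replicas=15)
print(desired)                                    # 15
print(replicas_to_apply("observer", 5, desired))  # 5
```

Taking the maximum means a sudden surge picked up by the reactive estimate can still raise the replica count when the proactive forecast, built from history, did not anticipate it.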

Algorithm Evaluation

Experiments show that AHPA can correctly identify periodicity, remains robust to missing data, spikes, and workload changes, and provides early warnings of trend shifts. When data lack clear cycles, AHPA still offers accurate short‑term forecasts, reducing unnecessary scaling actions compared with vanilla HPA.

Conclusion

AHPA enhances cloud‑native elasticity by combining proactive, data‑driven scaling with reactive adjustments, delivering higher resource efficiency and reduced latency. Its RobustScaler foundation, CRD‑based configurability, and high‑availability deployment make it suitable for production workloads on Kubernetes.

