Understanding Horizontal Pod Autoscaler (HPA) and KEDA for Elastic Scaling in Kubernetes
This article explains pod‑level elasticity in Kubernetes by detailing the principles, metric types, and limitations of the Horizontal Pod Autoscaler (HPA) and then introduces KEDA as an event‑driven extension that adds true scale‑to‑zero capabilities, complete with configuration examples and code snippets.
Introduction
Elasticity is fundamentally about matching provisioned capacity to actual load. In cloud-native Kubernetes, scaling happens at both the node level and the pod level. This article focuses on pod-level scaling, introducing the Horizontal Pod Autoscaler (HPA) and KEDA.
1. HPA Implementation Principles
1.1 What is HPA
HPA (Horizontal Pod Autoscaler) works with scalable workloads such as Deployments and StatefulSets, but not with objects that cannot be scaled, such as DaemonSets. The API has evolved through autoscaling/v1, v2beta1, and v2beta2 (now GA as autoscaling/v2) and supports four metric types: Resource, Pods, Object, and External.
Check the supported API versions with:
kubectl api-versions | grep autoscal
Metric type description:
Resource: CPU/Memory utilization or average value.
# Resource metric example
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50
Object: a metric describing a single Kubernetes object (e.g., an Ingress), served through a custom metrics adapter; supports Value and AverageValue targets.
# Object metric example
- type: Object
  object:
    metric:
      name: requests-per-second
    describedObject:
      apiVersion: networking.k8s.io/v1beta1
      kind: Ingress
      name: main-route
    target:
      type: Value
      value: 10k
Pods: metrics averaged across the target's pods; only AverageValue is supported.
# Pods metric example
- type: Pods
  pods:
    metric:
      name: packets-per-second
    target:
      type: AverageValue
      averageValue: 1k
External: metrics from sources outside the cluster (e.g., a message queue); supports Value and AverageValue targets.
# External metric example
- type: External
  external:
    metric:
      name: queue_messages_ready
      selector:
        matchLabels:
          env: "stage"
          app: "myapp"
    target:
      type: AverageValue
      averageValue: 30
1.2 HPA Working Principle
Prerequisites: the target pods must define resource requests, and metrics-server must be installed. Workflow: create the HPA, collect per-pod CPU usage, compute the average, compare it with the target, calculate the desired replica count, clamp it to the min/max limits, and repeat on every sync period (15 s by default). The scaling formula is desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)].
Algorithm explanation: if the current metric is double the target, the replica count doubles; if it is half the target, the replica count halves.
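The formula can be sketched in a few lines of Go. This is a simplified illustration of the published formula only; the real controller additionally applies a tolerance band and readiness checks that are omitted here.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas implements the HPA scaling formula:
// desired = ceil(currentReplicas * currentMetric / targetMetric)
func desiredReplicas(current int32, currentMetric, targetMetric float64) int32 {
	ratio := currentMetric / targetMetric
	return int32(math.Ceil(float64(current) * ratio))
}

func main() {
	// Current metric is double the target: replicas double (2 -> 4).
	fmt.Println(desiredReplicas(2, 100, 50))
	// Current metric is half the target: replicas halve (4 -> 2).
	fmt.Println(desiredReplicas(4, 25, 50))
}
```

Because of the ceil, the result is rounded up, so the controller prefers slight over-provisioning to under-provisioning.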
The controller's source provides helpers for extracting pod resource requests and computing the utilization ratio:
// calculatePodRequests extracts the resource requests of each pod
func calculatePodRequests(pods []*v1.Pod, resource v1.ResourceName) (map[string]int64, error) {
	// implementation...
}

// GetResourceUtilizationRatio computes the ratio of actual usage to requests
func GetResourceUtilizationRatio(metrics PodMetricsInfo, requests map[string]int64, targetUtilization int32) (float64, int32, int64, error) {
	// implementation...
}
1.3 Simple Scaling Example
Deploy an nginx service and an HPA that scales when CPU usage exceeds 30 %.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-hpa
  template:
    metadata:
      labels:
        app: nginx-hpa
    spec:
      containers:
      - name: nginx-hpa
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-hpa
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx-hpa
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 30
Load testing can be performed with the ab (Apache Bench) command.
yum install httpd -y
for i in {1..600}
do
  ab -c 1000 -n 100000000 http://ServiceIP/
  sleep 1
done
2. Limitations of HPA
HPA cannot scale a workload down to zero replicas, computes utilization against resource requests rather than actual limits, handles multi-container pods awkwardly, and its single-threaded controller loop can become a performance bottleneck in large clusters.
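The request-based utilization point is worth a concrete illustration. A minimal sketch (the function name is mine, not from the HPA source): because utilization is usage divided by the request, the same absolute CPU usage produces very different utilization figures depending on how the request was set, which is why poorly tuned requests lead to premature or delayed scaling.

```go
package main

import "fmt"

// utilizationPercent mirrors how HPA measures utilization: actual usage
// divided by the container's resource *request*, not its limit.
func utilizationPercent(usageMilli, requestMilli int64) int64 {
	return usageMilli * 100 / requestMilli
}

func main() {
	// A pod using 100m CPU against a 200m request sits at 50% utilization.
	fmt.Println(utilizationPercent(100, 200))
	// The same 100m usage against a 100m request reads as 100%,
	// so the HPA would scale this workload up much sooner.
	fmt.Println(utilizationPercent(100, 100))
}
```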
3. Introduction to KEDA
3.1 What is KEDA and its relation to HPA
KEDA (Kubernetes Event‑Driven Autoscaling) adds event‑driven scaling and true scale‑to‑zero capability to Kubernetes. It works together with HPA: KEDA provides external metrics and can trigger HPA, while handling zero‑scale scenarios.
3.2 KEDA Architecture
Core components: Metrics Adapter, HPA Controller, Scalers. Scalers fetch external metrics (e.g., Prometheus, RabbitMQ) and expose them to HPA.
Source code shows how KEDA scales from zero and back.
// Scale from zero when any scaler reports activity
if currentScale.Spec.Replicas == 0 && isActive {
	e.scaleFromZero(...)
} else if !isActive && currentScale.Spec.Replicas > 0 && (scaledObject.Spec.MinReplicaCount == nil || *scaledObject.Spec.MinReplicaCount == 0) {
	// No scaler is active and zero replicas are allowed: scale to zero
	e.scaleToZero(...)
}
// ... other cases omitted
3.3 KEDA Configuration Example
Deploy KEDA operator and define a ScaledObject that references the nginx deployment.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scaledobject
  namespace: hpa-tmp
spec:
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          policies:
          - periodSeconds: 30
            type: Pods
            value: 1
          stabilizationWindowSeconds: 30
        scaleUp:
          policies:
          - periodSeconds: 10
            type: Pods
            value: 1
          stabilizationWindowSeconds: 0
  cooldownPeriod: 30
  maxReplicaCount: 3
  minReplicaCount: 1
  pollingInterval: 15
  scaleTargetRef:
    name: nginx-hpa
  triggers:
  - type: cpu
    metadata:
      type: Utilization
      value: "30"
KEDA also ships triggers for many other sources, including Prometheus, metrics-server, and cron schedules.
4. Use Cases and Further Exploration
Typical scenarios include batch data extraction with scheduled scaling, event‑driven workloads that can scale to zero, parallel processing jobs with controlled concurrency, fine‑grained scaling speed control, and cost‑optimized traffic shaping.
References
Links to official Kubernetes HPA documentation and KEDA repositories.
About ZCY Technology
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.