Kubernetes Automatic Scaling with Custom Metrics Using Prometheus and HPA v2
This article explains how to configure Kubernetes Horizontal Pod Autoscaler (HPA) to scale workloads based on custom business metrics collected by Prometheus, covering installation of Metrics Server, deployment of a demo app, setup of the Prometheus adapter, and practical load‑testing steps.
Prometheus is commonly used to monitor Kubernetes clusters, but scaling decisions often rely on business metrics rather than just CPU or memory usage.
Kubernetes provides two autoscaling mechanisms: Cluster Autoscaler for node scaling and Horizontal Pod Autoscaler (HPA) for pod replica scaling. HPA v2 can consume custom metrics via the Custom Metrics API and the aggregation layer.
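The aggregation layer serves custom metrics under a well-known API group, so it is easy to check whether a cluster can feed them to the HPA. A quick discovery request (the path is the standard Custom Metrics API group/version; `jq` is optional and only used for pretty-printing):

```shell
# Returns the list of custom metrics the adapter currently exposes.
# An empty or error response means the adapter is not registered yet.
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
```

If this call fails, the HPA controller will not be able to retrieve custom metrics either, so it is a useful first diagnostic.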
Setup steps include installing the Metrics Server, deploying a small Go‑based demo application (podinfo), installing Prometheus v2 and the k8s‑prometheus‑adapter, registering the adapter in the API aggregation layer, and creating a dedicated monitoring namespace.
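The setup steps above might look roughly like the following. These commands are illustrative only: the release URLs, chart names, and kustomize path are assumptions based on the upstream projects' published install instructions and should be checked against their current documentation.

```shell
# Dedicated namespace for the monitoring stack
kubectl create namespace monitoring

# Metrics Server: supplies the resource metrics (CPU/memory) used by the HPA
# (URL per the kubernetes-sigs/metrics-server release page; verify before use)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Demo app: podinfo (kustomize path per the podinfo README; verify before use)
kubectl apply -k github.com/stefanprodan/podinfo/kustomize

# Prometheus adapter, registered as the Custom Metrics API backend
# (chart name per the prometheus-community Helm repo; verify before use)
helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring
```

The adapter's Helm chart registers an `APIService` for `custom.metrics.k8s.io`, which is what plugs it into the aggregation layer.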
An HPA is defined to keep a minimum of two replicas and to scale up to ten when average CPU utilization exceeds 80% or memory exceeds 200Mi, and later to scale on the custom request-rate metric that the adapter derives from the http_requests_total counter exposed by the podinfo app.
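Such an HPA might be expressed as the manifest below. This is a sketch matching the thresholds above, not the article's exact manifest: the metric name `http_requests` and its target value of 10 requests/second per pod are assumptions, and older clusters use `autoscaling/v2beta2` instead of `autoscaling/v2`.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
    # Resource metrics come from the Metrics Server
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: 200Mi
    # Custom metric served by the Prometheus adapter; name and
    # target value here are illustrative assumptions
    - type: Pods
      pods:
        metric:
          name: http_requests
        target:
          type: AverageValue
          averageValue: "10"
```

With multiple metrics listed, the HPA computes a desired replica count for each metric and uses the largest, so any one threshold being exceeded can trigger a scale-up.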
Load testing is performed with the hey tool to generate request traffic; the HPA controller periodically queries the Metrics Server and the custom metrics API, increasing replica count when request rate surpasses the defined threshold.
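A load test of this shape could be run as follows. The `hey` flags and the service URL are assumptions: the endpoint depends on how podinfo is exposed (port-forward, NodePort, or ingress), and 9898 is podinfo's default HTTP port.

```shell
# Expose podinfo locally for the test (assumes the default port 9898)
kubectl port-forward deploy/podinfo 9898:9898 &

# 10,000 requests, 2 concurrent workers, rate-limited to 5 req/s per worker
hey -n 10000 -c 2 -q 5 http://localhost:9898/

# In a second terminal, watch the HPA react as the rate crosses the threshold
kubectl get hpa podinfo --watch
```

The `--watch` output shows current metric values against targets and the replica count climbing, which makes the scale-up (and the later scale-down) directly observable.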
After the load test, the HPA scales the deployment back down, demonstrating how custom business metrics can drive more responsive and SLA‑aware scaling.
Conclusion: Relying solely on CPU/memory is insufficient for many web and mobile back‑ends; using Prometheus‑provided custom metrics enables fine‑grained autoscaling and higher availability.
High Availability Architecture
Official account for High Availability Architecture.