How to Auto‑Scale Nginx on Kubernetes Using Prometheus Adapter and Custom Metrics

This guide walks through deploying an Nginx sample app on Kubernetes, exposing Prometheus‑collected custom metrics, configuring a Prometheus adapter, and creating a Horizontal Pod Autoscaler that scales the deployment based on request‑per‑second metrics.

MaGe Linux Operations

One of the main advantages of using Kubernetes for container orchestration is how easily it scales applications horizontally. The Horizontal Pod Autoscaler (HPA) can scale on CPU and memory out of the box, but more complex scenarios require custom metrics.

Prometheus is the standard tool for monitoring workloads and the Kubernetes cluster itself. The Prometheus adapter exposes collected metrics to the Kubernetes API so that HPA can use them.

Deployment Architecture

We will configure the Prometheus adapter through a ConfigMap so that it exposes custom metrics from a Prometheus installation, and then let the HPA consume them.

Prerequisites:

Basic knowledge of Horizontal Pod Autoscaling.

Prometheus deployed in the cluster or reachable via an endpoint.

We will use a Prometheus‑Thanos high‑availability deployment.

Deploy Sample Application

First, deploy a sample Nginx application that exposes metrics for Prometheus through the VTS (virtual host traffic status) module:

apiVersion: v1
kind: Namespace
metadata:
  name: nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: nginx
  name: nginx-deployment
spec:
  replicas: 1
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: nginx-server
  template:
    metadata:
      annotations:
        prometheus.io/path: "/status/format/prometheus"
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
      labels:
        app: nginx-server
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - nginx-server
              topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx-demo
        image: vaibhavthakur/nginx-vts:1.0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 2500m
          requests:
            cpu: 2000m
        ports:
        - containerPort: 80
          name: http
---
apiVersion: v1
kind: Service
metadata:
  namespace: nginx
  name: nginx-service
spec:
  ports:
  - port: 80
    targetPort: 80
    name: http
  selector:
    app: nginx-server
  type: LoadBalancer

This creates a namespace called nginx and deploys an Nginx pod exposing metrics at /status/format/prometheus. A DNS entry maps nginx.gotham.com to the service’s external IP.
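The prometheus.io/* annotations are a widely used convention rather than a built-in Kubernetes feature: they take effect only if the Prometheus scrape configuration contains relabeling rules that read them. Assuming such rules are in place, the scrape target is assembled from the pod IP plus the port and path annotations, roughly as sketched below. The helper name and the pod IP 10.0.0.12 are hypothetical.

```shell
# Sketch: build the URL Prometheus would scrape for an annotated pod.
# scrape_url is a hypothetical helper, not part of any tool.
scrape_url() {
  # $1 = pod IP, $2 = prometheus.io/port, $3 = prometheus.io/path
  printf 'http://%s:%s%s\n' "$1" "$2" "$3"
}

# 10.0.0.12 is a made-up pod IP used only for illustration.
scrape_url 10.0.0.12 80 /status/format/prometheus
```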

root$ kubectl get deploy -n nginx
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1/1     1            1           43d

root$ kubectl get pods -n nginx
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-65d8df7488-c578v   1/1     Running   0          9h

root$ curl nginx.gotham.com/status/format/prometheus
# HELP nginx_vts_info Nginx info
# TYPE nginx_vts_info gauge
nginx_vts_info{hostname="nginx-deployment-65d8df7488-c578v",version="1.13.12"} 1
# HELP nginx_vts_server_requests_total The requests counter
nginx_vts_server_requests_total{host="_",code="2xx"} 15574
…

We focus on the nginx_vts_server_requests_total metric to drive scaling decisions.
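Because nginx_vts_server_requests_total is a cumulative counter, its raw value only ever grows; what matters for scaling is how fast it grows. A minimal sketch of that idea is to take two samples of the counter and divide the difference by the elapsed time. The first sample is the value from the output above; the second sample (15648) and the 60-second interval are invented for illustration.

```shell
# Sketch: per-second rate from two counter samples.
# rate_per_second is a hypothetical helper; Prometheus' rate() is more
# careful (it handles counter resets and extrapolates to the window edges).
rate_per_second() {
  # $1 = earlier sample, $2 = later sample, $3 = seconds between them
  awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.3f\n", (b - a) / s }'
}

# 15574 comes from the curl output; 15648 and 60 are made-up values.
rate_per_second 15574 15648 60
```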

Create Prometheus Adapter ConfigMap

Define a ConfigMap that tells the adapter which metric to expose:

apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: 'nginx_vts_server_requests_total'
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: (sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))

The ConfigMap currently maps a single metric but can be extended.
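The name block is a regular-expression rewrite, and metricsQuery is a template that the adapter fills in per request. The sketch below imitates both steps by hand to show what ends up being exposed and queried. The helper names are hypothetical, and the label matcher and group-by values passed in are an approximation of what the adapter would substitute for pods in the nginx namespace, not output captured from it.

```shell
# Sketch: what the rule's name rewrite produces.
rename_metric() {
  # Same pattern as name.matches / name.as in the ConfigMap.
  printf '%s\n' "$1" | sed -E 's/^(.*)_total/\1_per_second/'
}

# Sketch: fill the metricsQuery template by hand.
# $1 = series, $2 = label matchers, $3 = group-by label
expand_query() {
  printf '(sum(rate(%s{%s}[1m])) by (%s))\n' "$1" "$2" "$3"
}

rename_metric nginx_vts_server_requests_total
expand_query nginx_vts_server_requests_total \
  'kubernetes_namespace="nginx"' kubernetes_pod_name
```

The first call prints the name under which the HPA later refers to the metric, nginx_vts_server_requests_per_second.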

Create Prometheus Adapter Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: custom-metrics-apiserver
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
    spec:
      serviceAccountName: monitoring
      containers:
      - name: custom-metrics-apiserver
        image: quay.io/coreos/k8s-prometheus-adapter-amd64:v0.4.1
        args:
        - /adapter
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/serving.crt
        - --tls-private-key-file=/var/run/serving-cert/serving.key
        - --logtostderr=true
        - --prometheus-url=http://thanos-querier.monitoring:9090/
        - --metrics-relist-interval=30s
        - --v=10
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: true
        - mountPath: /etc/adapter/
          name: config
          readOnly: true
      volumes:
      - name: volume-serving-cert
        secret:
          secretName: cm-adapter-serving-certs
      - name: config
        configMap:
          name: adapter-config

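This deployment runs the adapter container, which reads its rules from the ConfigMap and queries Prometheus through the Thanos querier given in --prometheus-url. It assumes that the monitoring service account and the cm-adapter-serving-certs secret referenced in the manifest already exist in the monitoring namespace.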

Create Prometheus Adapter API Service

apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  ports:
  - port: 443
    targetPort: 6443
  selector:
    app: custom-metrics-apiserver
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100

The APIService registers the adapter with the Kubernetes API aggregation layer, so the custom metrics are now reachable via the Kubernetes API.
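Requests to the custom metrics API follow a fixed path layout: API group and version, then namespace, resource type, object name (or * for all objects), and finally the metric name. A small sketch that assembles such a path is shown below; custom_metric_path is a hypothetical helper.

```shell
# Sketch: assemble a custom metrics API path for a namespaced pod metric.
custom_metric_path() {
  # $1 = namespace, $2 = pod name or * for all pods, $3 = metric name
  printf '/apis/custom.metrics.k8s.io/v1beta1/namespaces/%s/pods/%s/%s\n' "$1" "$2" "$3"
}

# Quote the * so the shell does not expand it as a glob.
custom_metric_path nginx '*' nginx_vts_server_requests_per_second
```

The resulting path can be passed to kubectl get --raw.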

Test the Setup

List available custom metrics:

root$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {"name": "pods/nginx_vts_server_requests_per_second", "namespaced": true, "kind": "MetricValueList", "verbs": ["get"]},
    {"name": "namespaces/nginx_vts_server_requests_per_second", "namespaced": false, "kind": "MetricValueList", "verbs": ["get"]}
  ]
}

Query the current value of the metric:

root$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/nginx/pods/*/nginx_vts_server_requests_per_second" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {
      "describedObject": {"kind": "Pod", "namespace": "nginx", "name": "nginx-deployment-65d8df7488-v575j"},
      "metricName": "nginx_vts_server_requests_per_second",
      "timestamp": "2019-11-19T18:38:21Z",
      "value": "1236m"
    }
  ]
}
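The value field uses Kubernetes quantity notation, where the m suffix means thousandths, so 1236m is 1.236 requests per second. A minimal sketch of the conversion follows. It handles only plain numbers and the m suffix (other quantity suffixes are deliberately ignored), and quantity_to_decimal is a hypothetical helper.

```shell
# Sketch: convert a milli-suffixed Kubernetes quantity to a decimal number.
quantity_to_decimal() {
  case "$1" in
    # Strip the trailing m and divide by 1000.
    *m) awk -v v="${1%m}" 'BEGIN { printf "%.3f\n", v / 1000 }' ;;
    # No suffix: the value is already in whole units.
    *)  awk -v v="$1" 'BEGIN { printf "%.3f\n", v }' ;;
  esac
}

quantity_to_decimal 1236m
```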

Create Horizontal Pod Autoscaler

# autoscaling/v2 is served by Kubernetes 1.23 and later
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-custom-hpa
  namespace: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: nginx_vts_server_requests_per_second
      target:
        type: AverageValue
        averageValue: 4000m

Apply the manifest and check the HPA status:

root$ kubectl describe hpa -n nginx
Name:               nginx-custom-hpa
Namespace:          nginx
Reference:          Deployment/nginx-deployment
Min replicas:       2
Max replicas:       10
Replicas:           3
…

Generate Load and Observe Scaling

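Run a load test with Vegeta. The command below sends 5 requests per second, and -duration=0 keeps the attack running until it is interrupted: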

echo "GET http://nginx.gotham.com/" | vegeta attack -rate=5 -duration=0 | vegeta report

While the load runs, watch the pods and HPA:

root$ kubectl get -w pods -n nginx
…
root$ kubectl get hpa -n nginx
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-custom-hpa   Deployment/nginx-deployment   5223m/4   2         10        3          5m5s

The HPA scales the deployment up to meet the request‑per‑second target and scales it back down when the load stops.
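The replica count the HPA aims for follows the formula documented for the autoscaler: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). No change is made when the ratio is within a tolerance (10% by default), and the result is clamped to the min/max bounds. The sketch below reproduces that arithmetic in milli-units for the numbers shown above. It is a simplification that ignores stabilization windows, pod readiness, and missing metrics, and desired_replicas is a hypothetical helper.

```shell
# Sketch of the HPA replica calculation, using integer milli-units.
desired_replicas() {
  # $1 = current replicas, $2 = current average (milli), $3 = target (milli)
  # $4 = minReplicas, $5 = maxReplicas
  hpa_diff=$(( $2 > $3 ? $2 - $3 : $3 - $2 ))
  if [ $(( hpa_diff * 10 )) -le "$3" ]; then
    # Within the default 10% tolerance: keep the current replica count.
    hpa_want=$1
  else
    # Integer ceiling of replicas * current / target.
    hpa_want=$(( ($1 * $2 + $3 - 1) / $3 ))
  fi
  # Clamp to the configured bounds.
  if [ "$hpa_want" -lt "$4" ]; then hpa_want=$4; fi
  if [ "$hpa_want" -gt "$5" ]; then hpa_want=$5; fi
  echo "$hpa_want"
}

# 3 replicas averaging 5223m against a 4000m target, bounds 2..10.
desired_replicas 3 5223 4000 2 10
```

For these values the sketch prints 4, meaning the autoscaler would add one more pod.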

Conclusion

This setup demonstrates how to use the Prometheus adapter to feed custom metrics into a Horizontal Pod Autoscaler for automatic scaling of a Kubernetes deployment. While the example uses a single metric, the ConfigMap can be expanded to expose additional metrics for more sophisticated scaling decisions.
