Full-Stack Monitoring with Prometheus & Grafana on Kubernetes (Part 3)

This guide walks through deploying Prometheus and Grafana in a Kubernetes cluster using binary installation, detailing the Prometheus scrape configurations for core components, the necessary Service and Endpoints manifests, and how to reload the configuration to enable full‑stack monitoring.

Linux Cloud-Native Ops Stack

In this article the author demonstrates how to set up full‑stack monitoring of Kubernetes components using Prometheus and Grafana with a binary (non‑Helm) deployment.

Prometheus scrape configuration

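The prometheus.yml file defines separate jobs for the Kubernetes API server, scheduler, controller-manager, kube-proxy and kubelet. All jobs except kube-proxy scrape over HTTPS and authenticate with the bearer token of Prometheus's own ServiceAccount (bearer_token_file); the kube-proxy job uses plain HTTP on port 10249 and sends no credentials. The HTTPS jobs set a tls_config whose ca_file points to the mounted ca.crt and enable insecure_skip_verify: true because the components present self-signed certificates. While insecure_skip_verify is true, the server certificate is not verified, so ca_file has no practical effect. Relabel rules keep only the desired endpoints, rewrite target addresses and metrics paths, and apply a labelmap for node labels.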

scrape_configs:
  - job_name: 'kubernetes-apiservers'
    metrics_path: /metrics
    scheme: https
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names: ["default"]
    # Use Prometheus's own ServiceAccount for authentication
    tls_config:
      # Cluster CA certificate, mounted automatically into the Pod together with the ServiceAccount token
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)

  - job_name: 'kubernetes-scheduler'
    scrape_interval: 15s
    scrape_timeout: 10s
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true   # The binary-installed component serves a self-signed certificate, so verification is skipped
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names: ["kube-system"]
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name]
        action: keep
        regex: kube-system;kube-scheduler
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)

  - job_name: 'kubernetes-controller-manager'
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names: ["kube-system"]
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name]
        action: keep
        regex: kube-system;kube-controller-manager
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)

  - job_name: 'kubernetes-kube-proxy-up'
    scrape_interval: 30s
    scrape_timeout: 5s
    metrics_path: /metrics
    scheme: http
    kubernetes_sd_configs:
      - role: node
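    relabel_configs:
      # Node discovery yields the kubelet address (<node-ip>:10250);
      # rewrite the port to kube-proxy's metrics port 10249.
      # kube-proxy binds its metrics endpoint to 127.0.0.1:10249 by default,
      # so metricsBindAddress may need to be changed for this job to work.
      - source_labels: [__address__]
        regex: '(.*):10250'
        target_label: __address__
        replacement: '${1}:10249'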

  - job_name: 'kubernetes-kubelet-up'
    scrape_interval: 30s
    scrape_timeout: 5s
    metrics_path: /metrics
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
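    relabel_configs:
      # Send every scrape to the API server instead of contacting the node directly
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      # Use the API server's node proxy path to reach each kubelet's /metrics
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics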
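
To see what the relabel rules in these jobs do to a target, the following Python sketch mimics the keep and replace actions on plain strings. It is an illustration only, not Prometheus code: the helper names keep and relabel_replace and the node name node-1 are made up for this example, and Python's re module stands in for Prometheus's RE2 engine, which should behave the same for these simple patterns.

```python
import re


def keep(values, regex):
    """Mimic a 'keep' step: source label values are joined with ';'
    and the target survives only if the regex matches the whole string."""
    return re.fullmatch(regex, ";".join(values)) is not None


def relabel_replace(value, regex, replacement):
    """Mimic a 'replace' step on one source value.

    Relabel regexes are anchored at both ends, hence fullmatch.
    Returns the new target-label value, or None when the regex does not
    match (in that case the target label is left untouched).
    """
    match = re.fullmatch(regex, value)
    if match is None:
        return None
    # Translate ${1}-style references into Python's \g<1> syntax.
    template = re.sub(r"\$\{(\d+)\}", lambda m: "\\g<" + m.group(1) + ">", replacement)
    return match.expand(template)


# API server job: only the default/kubernetes service on its https port is kept.
print(keep(["default", "kubernetes", "https"], "default;kubernetes;https"))  # True
print(keep(["kube-system", "kube-dns", "metrics"], "default;kubernetes;https"))  # False

# kube-proxy job: the kubelet port is swapped for the kube-proxy metrics port.
print(relabel_replace("192.168.90.3:10250", r"(.*):10250", "${1}:10249"))
# 192.168.90.3:10249

# kubelet job: the node name (hypothetical here) becomes part of the proxy path.
print(relabel_replace("node-1", r"(.+)", "/api/v1/nodes/${1}/proxy/metrics"))
# /api/v1/nodes/node-1/proxy/metrics
```

A target whose address does not end in :10250 would not match the kube-proxy rule, so its address would stay as discovered.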

Service and Endpoints manifests for automatic discovery

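To make the scheduler and controller-manager discoverable by the endpoints-based jobs, the article provides Service and Endpoints manifests that expose their HTTPS metrics ports (10259 for the scheduler, 10257 for the controller-manager). Each Service sets clusterIP: None to create a headless Service and has no selector, so Kubernetes does not manage its Endpoints; the manually created Endpoints object with the same name lists the address where the component listens (192.168.90.3 in this example). The port is named http-metrics even though both ports serve HTTPS.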

apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: http-metrics
      port: 10259   # New HTTPS port
      targetPort: 10259
      protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
subsets:
  - addresses:
      - ip: 192.168.90.3
    ports:
      - name: http-metrics
        port: 10259
        protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: kube-controller-manager
  namespace: kube-system
  labels:
    k8s-app: kube-controller-manager
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: http-metrics
      port: 10257   # New HTTPS port
      targetPort: 10257
      protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-controller-manager
  namespace: kube-system
  labels:
    k8s-app: kube-controller-manager
subsets:
  - addresses:
      - ip: 192.168.90.3
    ports:
      - name: http-metrics
        port: 10257
        protocol: TCP
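
Because the two Service/Endpoints pairs differ only in name, port and address, the pattern can be captured in a small generator. The Python sketch below is one possible way to do that, not something taken from the original article: the function name metrics_service_and_endpoints is hypothetical, and it emits JSON, which kubectl apply accepts as well as YAML.

```python
import json


def metrics_service_and_endpoints(name, port, ips, namespace="kube-system"):
    """Build a selector-less headless Service and a matching Endpoints object.

    Both objects share the same name and namespace, which is how Kubernetes
    associates a manually maintained Endpoints object with its Service.
    """
    port_name = "http-metrics"  # same port name as in the manifests above

    def metadata():
        return {"name": name, "namespace": namespace, "labels": {"k8s-app": name}}

    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": metadata(),
        "spec": {
            "type": "ClusterIP",
            "clusterIP": "None",  # headless: no virtual IP is allocated
            "ports": [
                {"name": port_name, "port": port, "targetPort": port, "protocol": "TCP"}
            ],
        },
    }
    endpoints = {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": metadata(),
        "subsets": [
            {
                "addresses": [{"ip": ip} for ip in ips],
                "ports": [{"name": port_name, "port": port, "protocol": "TCP"}],
            }
        ],
    }
    return service, endpoints


# Reproduce the kube-scheduler pair from the manifests above.
svc, eps = metrics_service_and_endpoints("kube-scheduler", 10259, ["192.168.90.3"])
print(json.dumps(svc, indent=2))
print(json.dumps(eps, indent=2))
```

Passing several addresses in ips would cover a cluster with more than one control-plane host; each address becomes a separate scrape target.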

Reloading the configuration

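After editing prometheus.yml and applying the Service and Endpoints resources, the configuration can be reloaded without restarting Prometheus: either send SIGHUP to the Prometheus process, or send an HTTP POST to /-/reload if Prometheus was started with --web.enable-lifecycle. The original article includes a screenshot of the Prometheus UI confirming that the new jobs appear and that metrics are being collected.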
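
One way to trigger such a reload is Prometheus's HTTP lifecycle endpoint. The Python sketch below is a minimal illustration under two assumptions that are not stated in the article: Prometheus was started with the --web.enable-lifecycle flag (the endpoint is disabled otherwise), and it is reachable at http://localhost:9090. The function names are hypothetical.

```python
import urllib.request


def reload_request(base_url):
    """Build the POST request for Prometheus's /-/reload lifecycle endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url, timeout=10):
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for error responses, for example
    when the lifecycle API is disabled or the new configuration is rejected.
    """
    with urllib.request.urlopen(reload_request(base_url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Assumed address; adjust to wherever Prometheus is reachable.
    print(reload_prometheus("http://localhost:9090"))
```

If the reload fails, Prometheus keeps running with the previous configuration, so checking the status code (or the Prometheus log) is worthwhile before looking for the new jobs on the Targets page.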

Finally, the author notes that the default template only includes the API server metrics; additional components can be added by extending the scrape jobs and corresponding Service definitions.
