
Full‑Stack Monitoring with Prometheus and Grafana on Kubernetes (Part 2)

This guide walks through deploying Prometheus (v2.51) and Grafana on a Kubernetes cluster, configuring hostPath storage, setting up node‑exporter, adding scrape jobs via Kubernetes service discovery, reloading configurations, and visualizing metrics through Grafana dashboards, with complete YAML examples and screenshots.

Linux Cloud-Native Ops Stack

The article explains how to build a full‑stack monitoring solution on Kubernetes using Prometheus and Grafana, highlighting the differences between containerized and physical deployments.

1. Deploy Prometheus (stable v2.51)

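Prometheus runs as a single-replica StatefulSet. Because resources are limited, data is stored on a hostPath volume, and the pod is pinned to one node with nodeName so that the data stays on the same host. An init container fixes ownership of the data directory for UID 1000, which the Prometheus container runs as. The Service is exposed via NodePort (port 30090). The manifest mounts two ConfigMaps: prometheus-config for prometheus.yml and prometheus-rules for alerting rules.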

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  serviceName: prometheus
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      nodeName: 192.168.90.6
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
      serviceAccountName: prometheus
      initContainers:
        - name: fix-permissions
          image: busybox:latest
          command: ["sh", "-c", "chown -R 1000:1000 /prometheus"]
          volumeMounts:
            - name: storage
              mountPath: /prometheus
          securityContext:
            runAsUser: 0
      containers:
        - name: prometheus
          image: m.daocloud.io/docker.io/prom/prometheus:v2.51.0
          args:
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/prometheus
            - --storage.tsdb.retention.time=15d
            - --web.enable-lifecycle # enable hot reload of the configuration via POST /-/reload
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: rules
              mountPath: /etc/prometheus/rules
            - name: storage
              mountPath: /prometheus
          ports:
            - name: web
              containerPort: 9090
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 2000m
              memory: 4Gi
      volumes:
        - name: config
          configMap:
            name: prometheus-config
        - name: rules
          configMap:
            name: prometheus-rules
        - name: storage
          hostPath:
            path: /app/prometheus
            type: DirectoryOrCreate
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  type: NodePort
  ports:
    - name: web
      port: 9090
      targetPort: web
      nodePort: 30090
  selector:
    app: prometheus
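
The StatefulSet above references a monitoring namespace and a prometheus ServiceAccount that the article does not show. A minimal sketch of what they might look like follows. The ClusterRole and ClusterRoleBinding names and the exact resource list are assumptions, chosen so that pod and node discovery through the Kubernetes API can work.

```yaml
# Assumed prerequisites for the manifests above (not shown in the article).
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus   # hypothetical name
rules:
  # Read-only access so kubernetes_sd_configs can discover targets
  - apiGroups: [""]
    resources: ["nodes", "nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  # Allow reading non-resource /metrics endpoints, if they are ever scraped
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: monitoring
```

Without permission to list and watch pods, the pod-based service discovery used later in this guide would likely return no targets.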

2. Deploy node‑exporter (cluster node collector)

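A DaemonSet runs node‑exporter on every node, including control‑plane nodes (via a toleration), using host networking and the host PID namespace. To reduce noise, the filesystem collector is told to skip mount points under /sys, /proc, /dev, /host, and /etc. A headless Service is also created, although the scrape job in section 3 discovers the pods directly.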

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true
      hostPID: true
      tolerations:
        # Also schedule onto control-plane nodes
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
        - name: node-exporter
          image: quay.io/prometheus/node-exporter:latest
          args:
            - --path.procfs=/host/proc
            - --path.sysfs=/host/sys
            - --path.rootfs=/host/root
            - --collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)
          volumeMounts:
            - name: proc
              mountPath: /host/proc
              readOnly: true
            - name: sys
              mountPath: /host/sys
              readOnly: true
            - name: root
              mountPath: /host/root
              mountPropagation: HostToContainer
              readOnly: true
          ports:
            - name: metrics
              containerPort: 9100
              hostPort: 9100
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys
        - name: root
          hostPath:
            path: /
---
apiVersion: v1
kind: Service
metadata:
  name: node-exporter
  namespace: monitoring
  labels:
    app: node-exporter
spec:
  type: ClusterIP
  clusterIP: None # Headless Service
  ports:
    - name: metrics
      port: 9100
  selector:
    app: node-exporter
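
The manifest above only narrows the filesystem collector. If whole collectors should be switched off as well, node‑exporter accepts --no-collector.<name> flags. The fragment below is a sketch of how the args list might be extended; the choice of wifi and hwmon is an assumption and depends on the hardware being monitored.

```yaml
# Fragment of the node-exporter container spec, not a complete manifest.
args:
  - --path.procfs=/host/proc
  - --path.sysfs=/host/sys
  - --path.rootfs=/host/root
  - --collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)
  # Assumed examples of collectors that are often unnecessary on servers
  - --no-collector.wifi
  - --no-collector.hwmon
```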

After applying these manifests, both Prometheus and node‑exporter are running in the cluster.
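
One way to let Kubernetes itself confirm that Prometheus is running is to add probes against its built‑in health endpoints. This fragment is a sketch for the prometheus container from section 1; the timing values are assumptions and may need tuning for a large data directory.

```yaml
# Fragment to add under the prometheus container, not a complete manifest.
readinessProbe:
  httpGet:
    path: /-/ready          # succeeds once Prometheus is ready to serve traffic
    port: web
  initialDelaySeconds: 10   # assumed value
  periodSeconds: 10         # assumed value
livenessProbe:
  httpGet:
    path: /-/healthy
    port: web
  initialDelaySeconds: 30   # assumed value
  periodSeconds: 15         # assumed value
```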

3. Add monitoring targets to Prometheus configuration

The scrape_configs section is extended with two jobs. The first scrapes Prometheus itself. The second discovers node‑exporter pods through kubernetes_sd_configs with role: pod; relabel rules keep only pods labeled app=node-exporter and rewrite each target address to <pod IP>:9100. Because node‑exporter uses host networking, the pod IP is the same as the node IP.

scrape_configs:
  # Collect Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

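  # node-exporter pods, discovered through the Kubernetes API
  - job_name: 'kubernetes-nodes'
    kubernetes_sd_configs:
      - role: pod
        # Needed for the __meta_kubernetes_node_label_* labels used below;
        # Prometheus must be allowed to read Node objects.
        attach_metadata:
          node: true
    relabel_configs:
      # Keep only pods labeled app=node-exporter
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: node-exporter
        action: keep
      # Scrape <pod IP>:9100 (the default regex (.*) captures the IP as ${1})
      - source_labels: [__meta_kubernetes_pod_ip]
        target_label: __address__
        replacement: ${1}:9100
      # Copy the node's labels onto the target
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)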
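
The article does not show the two ConfigMaps that the StatefulSet mounts. The sketch below illustrates how the scrape configuration might be wrapped in prometheus-config and how a rule file might be placed in prometheus-rules. The global intervals, the rule file name, and the NodeExporterDown alert are all assumptions added for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s        # assumed value
      evaluation_interval: 30s    # assumed value
    rule_files:
      # Matches the mount path of the prometheus-rules ConfigMap
      - /etc/prometheus/rules/*.yml
    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']
      # The kubernetes-nodes job shown above goes here.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: monitoring
data:
  # Hypothetical file name and alert, for illustration only
  node-alerts.yml: |
    groups:
      - name: node-availability
        rules:
          - alert: NodeExporterDown
            expr: up{job="kubernetes-nodes"} == 0
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "node-exporter on {{ $labels.instance }} is unreachable"
```

The up metric is generated by Prometheus for every scrape target, so this rule needs nothing beyond the job name used in the scrape configuration.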

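Apply the updated ConfigMap, give the kubelet time to sync the mounted file into the pod (this can take up to a minute or so), then trigger a hot reload through the lifecycle endpoint enabled by --web.enable-lifecycle: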

curl -X POST 172.22.0.3:30090/-/reload

4. Result verification

After reloading, the Prometheus UI shows that both the Prometheus server and the node‑exporter targets are up. Screenshots illustrate the successful collection of default metrics.

[Screenshot: Prometheus UI]
[Screenshot: Node metrics]

5. Deploy Grafana for dashboard visualization

Grafana is configured with two ConfigMaps: one provisions the Prometheus datasource and the other supplies grafana.ini. The Deployment is pinned to a specific node, stores its data on a hostPath volume, and sets the admin credentials through environment variables, which take precedence over the values in grafana.ini. The Service is exposed as a NodePort (port 30300).

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: monitoring
data:
  prometheus.yaml: |-
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus:9090
        isDefault: true
        editable: true
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-config
  namespace: monitoring
data:
  grafana.ini: |
    [paths]
    data = /var/lib/grafana
    logs = /var/log/grafana
    plugins = /var/lib/grafana/plugins
    [server]
    protocol = http
    http_port = 3000
    domain = localhost
    root_url = %(protocol)s://%(domain)s:%(http_port)s/
    [database]
    type = sqlite3
    path = /var/lib/grafana/grafana.db
    [security]
    admin_user = admin
    admin_password = admin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
  labels:
    app: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      nodeName: 192.168.90.5
      serviceAccountName: prometheus
      initContainers:
        - name: fix-permissions
          image: busybox:latest
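          # The official Grafana image runs as UID 472; give it ownership rather than opening the directory to everyone
          command: ["sh", "-c", "chown -R 472:0 /var/lib/grafana"]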
          volumeMounts:
            - name: storage
              mountPath: /var/lib/grafana
          securityContext:
            runAsUser: 0
      containers:
        - name: grafana
          image: m.daocloud.io/docker.io/grafana/grafana:10.4.0
          volumeMounts:
            - name: storage
              mountPath: /var/lib/grafana
            - name: datasources
              mountPath: /etc/grafana/provisioning/datasources
            - name: config
              mountPath: /etc/grafana/grafana.ini
              subPath: grafana.ini
          ports:
            - name: web
              containerPort: 3000
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          env:
            - name: GF_SECURITY_ADMIN_USER
              value: "admin"
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: "admin"
      volumes:
        - name: storage
          hostPath:
            path: /app/ggrafana2
            type: DirectoryOrCreate
        - name: datasources
          configMap:
            name: grafana-datasources
        - name: config
          configMap:
            name: grafana-config
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
  labels:
    app: grafana
spec:
  type: NodePort
  ports:
    - name: web
      port: 3000
      targetPort: web
      nodePort: 30300
  selector:
    app: grafana
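
The manifests above hardcode admin/admin in both grafana.ini and the container environment. One possible alternative is to keep the credentials in a Secret and reference it from the Deployment. The Secret name, key names, and placeholder password below are assumptions, not part of the original setup.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: grafana-admin          # hypothetical name
  namespace: monitoring
type: Opaque
stringData:
  admin-user: admin
  admin-password: change-me    # placeholder, replace before applying
---
# Fragment replacing the env section of the grafana container;
# this second document is not a standalone Kubernetes object.
env:
  - name: GF_SECURITY_ADMIN_USER
    valueFrom:
      secretKeyRef:
        name: grafana-admin
        key: admin-user
  - name: GF_SECURITY_ADMIN_PASSWORD
    valueFrom:
      secretKeyRef:
        name: grafana-admin
        key: admin-password
```

Grafana applies the admin password from its configuration when it first creates its database, so on an existing data directory the password would probably need to be changed through the UI or CLI instead.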

Apply the manifests and open Grafana at http://<node-IP>:30300. After logging in, the interface language can be switched to Chinese in the profile settings, and ready‑made dashboards can be imported by template ID (the article uses 8919 and 1860). Screenshots show the final dashboards.

[Screenshot: Grafana login]
[Screenshot: Grafana dashboard]

For offline environments, the same dashboards can be imported by downloading their official JSON files in advance and uploading them through Grafana's import dialog.
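
As an alternative to importing by hand, downloaded dashboard JSON files can be loaded through Grafana's file‑based dashboard provisioning. The sketch below assumes the JSON files are stored in a ConfigMap named grafana-dashboards and mounted at /etc/grafana/dashboards; both names, and the provider ConfigMap name, are hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-providers   # hypothetical name
  namespace: monitoring
data:
  providers.yaml: |-
    apiVersion: 1
    providers:
      - name: default
        orgId: 1
        folder: ''
        type: file
        disableDeletion: false
        updateIntervalSeconds: 30
        options:
          # Directory where the dashboard JSON files are mounted (assumed path)
          path: /etc/grafana/dashboards
---
# Fragment for the grafana container and pod spec;
# this second document is not a standalone Kubernetes object.
volumeMounts:
  - name: dashboard-providers
    mountPath: /etc/grafana/provisioning/dashboards
  - name: dashboards
    mountPath: /etc/grafana/dashboards
volumes:
  - name: dashboard-providers
    configMap:
      name: grafana-dashboard-providers
  - name: dashboards
    configMap:
      name: grafana-dashboards   # hypothetical ConfigMap holding the downloaded JSON files
```

Dashboards exported for sharing often refer to their datasource through an input placeholder such as ${DS_PROMETHEUS}; when loaded from files, that placeholder may need to be replaced with the provisioned datasource name (Prometheus in this setup).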
