Master Prometheus Monitoring for Big Data on Kubernetes: Design & Alerting
This article explains how to design and implement a Prometheus‑based monitoring system for big‑data components running on Kubernetes, covering metric exposure methods, scrape configurations, exporter deployment, and dynamic alert rule management with Alertmanager.
Design Overview
The monitoring system for big‑data platforms must reliably scrape exposed metrics, analyze them, and generate alerts. Key questions include what to monitor, how metrics are exposed, how Prometheus scrapes them, and how alert rules are dynamically configured.
Monitoring Targets
All big‑data components run as pods in a Kubernetes cluster.
Metric Exposure Methods
Directly expose Prometheus metrics (pull).
Push metrics to a pushgateway (push).
Use a custom exporter to convert other formats to Prometheus‑compatible metrics.
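Some components, such as Flink on YARN, run inside YARN containers, where Prometheus cannot readily discover them as scrape targets, and therefore require the Pushgateway approach. Short-lived components should also push their metrics, because they may exit before Prometheus gets a chance to scrape them.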
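Prometheus still pulls from the Pushgateway itself, so the push approach needs one ordinary scrape job. The fragment below is a minimal sketch under stated assumptions: the job name, the Service address, and the interval are hypothetical and do not come from the original setup.

```yaml
# Sketch of a scrape job for a Pushgateway (names and address are assumptions).
scrape_configs:
  - job_name: pushgateway          # hypothetical job name
    honor_labels: true             # keep the job/instance labels that clients pushed
    scrape_interval: 30s           # illustrative value
    static_configs:
      - targets:
          - pushgateway.monitoring.svc:9091   # assumed Service address; 9091 is the Pushgateway default port
```

Setting honor_labels to true matters here: without it, Prometheus would overwrite the job and instance labels pushed by each component with those of the Pushgateway target.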
Scrape Configuration
Prometheus always pulls metrics from targets. Common scrape jobs include:
Native job configuration (scrape_configs in the Prometheus configuration).
PodMonitor (via Prometheus Operator) for pod-level metrics.
ServiceMonitor (via Prometheus Operator) for service-level metrics.
When running on Kubernetes, PodMonitor is usually the simplest choice.
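A PodMonitor could look roughly like the following. This is a sketch, not the article's actual manifest: the resource name, the bigdata namespace, the app label, and the port name metrics are all hypothetical.

```yaml
# Hypothetical PodMonitor; every name below is an assumption for illustration.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: bigdata-pods
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames:
      - bigdata                  # assumed namespace of the big-data pods
  selector:
    matchLabels:
      app: hdfs-namenode         # assumed pod label
  podMetricsEndpoints:
    - port: metrics              # must match a named containerPort on the pod
      path: /metrics
      scheme: http
      interval: 30s              # illustrative value
```

The Prometheus custom resource only picks this object up if its podMonitorSelector (and the corresponding namespace selector) matches it.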
For annotation-based discovery, a pod can instead carry annotations such as:
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/scheme: "http"
  prometheus.io/path: "/metrics"
  prometheus.io/port: "19091"
The main fields of the Prometheus custom resource in prometheus-prometheus.yaml are serviceMonitorSelector, podMonitorSelector, ruleSelector, and alerting.alertmanagers. A kubernetes_sd_configs entry with relabeling can discover pods dynamically and rewrite labels before scraping.
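The discovery-plus-relabeling idea can be sketched as a native scrape job. The fragment assumes the custom bigData.metrics/* annotations from the pod metadata shown right after this example. Prometheus exposes pod annotations as __meta_kubernetes_pod_annotation_<name> labels, with unsupported characters such as dots and slashes replaced by underscores. The job name and the target label role are hypothetical.

```yaml
# Sketch only: job name and the "role" target label are assumptions.
scrape_configs:
  - job_name: bigdata-pods
    kubernetes_sd_configs:
      - role: pod                          # discover every pod in the cluster
    relabel_configs:
      # Keep only pods annotated with bigData.metrics/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_bigData_metrics_scrape]
        action: keep
        regex: "true"
      # Use the annotated scheme (http or https).
      - source_labels: [__meta_kubernetes_pod_annotation_bigData_metrics_scheme]
        action: replace
        regex: (https?)
        target_label: __scheme__
      # Use the annotated metrics path, for example /jmx.
      - source_labels: [__meta_kubernetes_pod_annotation_bigData_metrics_path]
        action: replace
        regex: (.+)
        target_label: __metrics_path__
      # Rewrite the scrape address to <pod IP>:<annotated port>.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_bigData_metrics_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      # Copy the role annotation onto the scraped series.
      - source_labels: [__meta_kubernetes_pod_annotation_bigData_metrics_role]
        action: replace
        target_label: role
```

Labels that start with a double underscore are dropped after relabeling, so only the rewritten address, scheme, and path influence the scrape, while role stays on the stored series.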
Pod metadata that uses custom labels and annotations for this kind of discovery:
labels:
  bigData.metrics.object: pod
annotations:
  bigData.metrics/scrape: "true"
  bigData.metrics/scheme: "https"
  bigData.metrics/path: "/jmx"
  bigData.metrics/port: "29871"
  bigData.metrics/role: "hdfs-nn,common"
Alert Design
Alert Flow
Service experiences an abnormal condition.
Prometheus generates an alert.
Alertmanager receives the alert.
Alertmanager processes the alert according to configured routing, grouping, and inhibition rules, then forwards it (e.g., via webhook, SMS, email).
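Inhibition is mentioned above but not shown in this article's configuration. As a hedged illustration only, a rule that mutes disk-usage alerts for a node that is already reported as down might look like this. The node-down group is hypothetical and is not defined anywhere in this article; the fragment uses the older source_match/target_match form to stay consistent with the match syntax in the Alertmanager example.

```yaml
# Hypothetical inhibition rule; "node-down" is an assumed alert group.
inhibit_rules:
  - source_match:
      groupId: node-down           # when an alert with this label is firing...
    target_match:
      groupId: node-disk-usage     # ...suppress notifications for alerts with this label
    equal: ['instance']            # but only if both alerts share the same instance label
```

Newer Alertmanager releases prefer source_matchers and target_matchers, which express the same idea with matcher strings.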
Dynamic Alert Configuration
Alerting consists of two parts:
alertmanager: the handling strategy (receivers, routing).
alertRule: the concrete alert expressions.
Alertmanager Example
global:
  resolve_timeout: 5m
receivers:
  - name: 'default'
  - name: 'test.web.hook'
    webhook_configs:
      - url: 'http://alert-url'
route:
  receiver: 'default'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h
  group_by: [groupId, instanceId]
  routes:
    - receiver: 'test.web.hook'
      continue: true
      match:
        groupId: node-disk-usage
    - receiver: 'test.web.hook'
      continue: true
      match:
        groupId: kafka-topic-highstore
AlertRule Example – Disk Usage
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-disk-usage
  namespace: monitoring
spec:
  groups:
    - name: node-disk-usage
      rules:
        - alert: node-disk-usage
          expr: 100*(1-node_filesystem_avail_bytes{mountpoint="${path}"}/node_filesystem_size_bytes{mountpoint="${path}"}) > ${thresholdValue}
          for: 1m
          labels:
            groupId: node-disk-usage
            userIds: super
            receivers: SMS
          annotations:
            title: "Disk warning: node {{$labels.instance}} ${path} usage {{$value}}%"
            content: "Disk warning: node {{$labels.instance}} ${path} usage {{$value}}%"
AlertRule Example – Kafka Lag
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kafka-topic-highstore-${uniqueName}
  namespace: monitoring
spec:
  groups:
    - name: kafka-topic-highstore
      rules:
        - alert: kafka-topic-highstore-${uniqueName}
          expr: sum(kafka_consumergroup_lag{exporterType="kafka",consumergroup="${consumergroup}"}) > ${thresholdValue}
          for: 1m
          labels:
            groupId: kafka-topic-highstore
            instanceId: ${uniqueName}
            userIds: super
            receivers: SMS
          annotations:
            title: "KAFKA warning: consumer group ${consumergroup} lag {{$value}}"
            content: "KAFKA warning: consumer group ${consumergroup} lag {{$value}}"
Alert Timing Example
Two nodes (node1, node2) are monitored for disk usage. The disk-usage rule sets groupId but no instanceId, so alerts from both nodes fall into the same Alertmanager group, and their notifications follow group_wait, group_interval, and repeat_interval semantics.
for: how long the alert expression must stay true before the alert fires.
group_wait: how long Alertmanager waits after a new group is created before sending its first notification.
group_interval: how long Alertmanager waits before notifying about changes to a group that has already been notified, such as a newly firing or resolved alert.
repeat_interval: how long Alertmanager waits before resending a notification for a group that has not changed.
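The interplay of these four settings is easier to follow on a timeline. The fragment below repeats the route values from the Alertmanager example and annotates a hypothetical sequence of events in comments. The event times are invented for illustration, and the resulting notification times are approximate because they ignore scrape and rule-evaluation intervals.

```yaml
# Route values from the Alertmanager example; the disk rule uses "for: 1m".
route:
  receiver: 'default'
  group_by: [groupId, instanceId]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h
# Hypothetical timeline (approximate):
# t = 0m00s   node1 crosses the disk threshold; the alert is pending.
# t = 1m00s   "for: 1m" has elapsed; the alert fires and Alertmanager creates a new group.
# t = 1m30s   group_wait (30s) expires; first notification, containing node1.
# t = 3m00s   node2's alert fires and joins the same group.
# t = 6m30s   next group_interval flush (5m after the first); notification with node1 and node2.
# afterwards  if the group does not change, the notification is resent only once
#             repeat_interval (2h) has passed since the last one.
```

Shortening group_interval makes changes to a group visible sooner, while repeat_interval only controls how often an unchanged group is re-announced.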
Exporter Deployment
Exporters can run as sidecars (1:1 with the target pod) or as independent services (1:1 or 1:many). Sidecars bind the exporter lifecycle to the target, while independent deployments reduce coupling and are more flexible for multi‑node services such as Kafka.
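The sidecar pattern can be sketched as a single pod template with two containers. Everything named here is hypothetical: the Deployment name, the bigdata namespace, and both image references are placeholders, and port 19091 is simply borrowed from the annotation example earlier in this article.

```yaml
# Hypothetical sidecar layout; images and names are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hdfs-namenode
  namespace: bigdata
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hdfs-namenode
  template:
    metadata:
      labels:
        app: hdfs-namenode
    spec:
      containers:
        - name: namenode                       # the monitored component
          image: example.org/bigdata/hdfs-namenode:latest
        - name: metrics-exporter               # exporter sidecar, shares the pod lifecycle
          image: example.org/bigdata/exporter:latest
          ports:
            - name: metrics                    # named port a PodMonitor can reference
              containerPort: 19091
```

An independent exporter would instead get its own Deployment and point at the target service over the network, which is why it suits multi-node services such as Kafka.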
Additional Tools
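Use promtool to validate metric formats: promtool check metrics reads exposition text from standard input and reports parse errors and lint problems, for example metric or label names that contain dots. kubectl port-forward can expose Prometheus, Grafana, and Alertmanager for access from outside the cluster: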
# Prometheus UI
nohup kubectl port-forward --address 0.0.0.0 service/prometheus-k8s 19090:9090 -n monitoring &
# Grafana UI
nohup kubectl port-forward --address 0.0.0.0 service/grafana 13000:3000 -n monitoring &
# Alertmanager UI
nohup kubectl port-forward --address 0.0.0.0 service/alertmanager-main 9093:9093 -n monitoring &
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.