How to Build a Flink Monitoring System with Prometheus, Pushgateway, and Grafana
This guide walks you through configuring Flink metrics, installing and linking Pushgateway, node_exporter, Prometheus, and Grafana, and finally visualizing and alerting on Flink metrics, giving you a complete end‑to‑end monitoring solution for Flink clusters.
The previous conceptual and source‑code articles introduced the idea of Flink Metrics and showed how to add, delete, and pull them. This part demonstrates, step by step, how to build a monitoring system for Flink using Prometheus + Pushgateway + Grafana.
Because the Pushgateway receives metrics pushed by Flink, it avoids the need for Prometheus to scrape targets that may be on different subnets or behind firewalls. Flink uses PrometheusPushGatewayReporter to push metrics to the Pushgateway, and Prometheus then scrapes the Pushgateway for a unified view.
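To make the push model concrete, the sketch below pushes one hand-written sample to a Pushgateway over HTTP, which is roughly what the reporter does on Flink's behalf. The host, port 9091 (Pushgateway's default), the job name demo_job, and the metric name demo_metric are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: push one sample to a Pushgateway and read it back.
# Host, port, job name, and metric name are illustrative assumptions.

# Build the push URL for a job; /metrics/job/<job> is the Pushgateway grouping path.
pushgateway_url() {
  local host="$1" port="$2" job="$3"
  printf 'http://%s:%s/metrics/job/%s\n' "$host" "$port" "$job"
}

# Push a single sample in the Prometheus text format.
# Requires curl and a running Pushgateway.
push_sample() {
  local url="$1" name="$2" value="$3"
  printf '%s %s\n' "$name" "$value" | curl -fsS --data-binary @- "$url"
}

# Usage (not executed here):
#   url="$(pushgateway_url localhost 9091 demo_job)"
#   push_sample "$url" demo_metric 42
#   curl -s http://localhost:9091/metrics | grep demo_metric
```

Prometheus never contacts the pushing process in this flow; it only scrapes the Pushgateway's own /metrics endpoint.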
1. Configure Flink
Edit conf/flink-conf.yaml and add:
metrics.reporters: promgateway
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
metrics.reporter.promgateway.host: datanode01
metrics.reporter.promgateway.port: 9091
metrics.reporter.promgateway.jobName: flink-metrics
2. Install Pushgateway
Download the appropriate version from https://prometheus.io/download/, extract it, and start it:
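tar zxvf pushgateway-1.4.1.linux-amd64.tar.gz
cd pushgateway-1.4.1.linux-amd64
./pushgateway &
Pushgateway listens on port 9091 by default.
3. Install node_exporter
Download node_exporter from the same page, extract it, and start the service:
./node_exporter &
Access http://localhost:9100/metrics to see host metrics.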
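Before wiring up Prometheus, it can help to confirm that both services answer on their ports. The following is a minimal sketch that assumes curl is installed and that the default ports are in use (9091 for Pushgateway, 9100 for node_exporter); the helper name check_endpoint is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report whether an HTTP endpoint answers successfully.
# Ports and paths in the usage lines are the defaults assumed in this guide.

# Print "UP <url>" if the URL returns an HTTP success status, else "DOWN <url>".
check_endpoint() {
  local url="$1"
  if curl -fsS -o /dev/null --max-time 3 "$url" 2>/dev/null; then
    printf 'UP %s\n' "$url"
  else
    printf 'DOWN %s\n' "$url"
  fi
}

# Usage (not executed here):
#   check_endpoint http://localhost:9091/-/healthy   # Pushgateway health endpoint
#   check_endpoint http://localhost:9100/metrics     # node_exporter metrics page
```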
4. Install Prometheus
Download, extract, and edit prometheus.yml to add the following scrape jobs under scrape_configs:
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: 'node_exporter'
  - job_name: 'pushgateway'
    static_configs:
      - targets: ['localhost:9091']
        labels:
          instance: 'pushgateway'
Start Prometheus:
./prometheus --config.file=prometheus.yml
Visit http://localhost:9090/ to verify the service.
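Besides opening the web UI, the scrape jobs can be checked from the command line through the Prometheus HTTP API. The sketch below counts how many targets Prometheus reports as healthy; it assumes Prometheus runs on localhost:9090 and relies on a plain-text match rather than a JSON parser, so treat it as a quick check only. The helper name count_up_targets is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count healthy scrape targets from the /api/v1/targets response.
# Assumes the compact JSON that Prometheus returns; not a general JSON parser.

# Read the API response on stdin and print the number of targets with health "up".
count_up_targets() {
  grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Usage (not executed here):
#   ./promtool check config prometheus.yml                       # validate the config file
#   curl -s http://localhost:9090/api/v1/targets | count_up_targets
```

With the two jobs above both reachable, the count should be at least 2 (plus any other jobs already present in prometheus.yml).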
5. Install Grafana
Download, extract, and start Grafana:
tar -zxvf grafana-8.0.3.linux-amd64.tar.gz
./grafana-server
Open http://localhost:3000 to access the Grafana UI.
6. Metrics Visualization
Start a Flink cluster:
./bin/start-cluster.sh
Start the Flink SQL client (./bin/sql-client.sh) and create a data‑generation table:
CREATE TABLE prometheusdatagen (
    f_sequence INT,
    f_random INT,
    f_random_str STRING,
    ts AS localtimestamp,
    WATERMARK FOR ts AS ts
) WITH (
    'connector' = 'datagen',
    'rows-per-second' = '5',
    'fields.f_sequence.kind' = 'sequence',
    'fields.f_sequence.start' = '1',
    'fields.f_sequence.end' = '100000',
    'fields.f_random.min' = '1',
    'fields.f_random.max' = '1000',
    'fields.f_random_str.length' = '10'
);
Query the table and observe the job in the Flink UI. Flink pushes its metrics to the Pushgateway, which Prometheus scrapes.
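To confirm that Flink's metrics actually arrive, the Pushgateway's own /metrics page can be inspected. The sketch below lists the distinct metric names that begin with flink_, assuming the Pushgateway runs on localhost:9091 and that the reporter prefixes metric names with flink_; the helper name list_flink_metrics is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list distinct Flink metric names exposed by the Pushgateway.
# Assumes metric names start with "flink_"; verify against your own output.

# Read Prometheus text exposition on stdin and print unique flink_* metric names.
# Comment lines (# HELP, # TYPE) are skipped because the match is anchored at line start.
list_flink_metrics() {
  grep -o '^flink_[A-Za-z0-9_]*' | sort -u
}

# Usage (not executed here):
#   curl -s http://localhost:9091/metrics | list_flink_metrics
```

An empty result usually points at the reporter settings in flink-conf.yaml (reporter name, host, or port) rather than at Prometheus.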
7. Add Prometheus Data Source in Grafana
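In Grafana, open Configuration > Data Sources, add a data source of type Prometheus, and set its URL to the Prometheus address and port (http://localhost:9090 in this setup).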
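The same data source can also be registered without the UI through Grafana's HTTP API. This is a minimal sketch, assuming Grafana on localhost:3000, Prometheus on localhost:9090, and the default admin:admin credentials of a fresh install; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: register a Prometheus data source through Grafana's HTTP API.
# URLs and credentials are assumptions; adjust them to your installation.

# Build the JSON body for the "create data source" request.
datasource_payload() {
  local name="$1" url="$2"
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$name" "$url"
}

# POST the payload to Grafana. Requires curl and a running Grafana.
create_datasource() {
  local grafana="$1" creds="$2" payload="$3"
  curl -fsS -u "$creds" -H 'Content-Type: application/json' \
    -X POST -d "$payload" "$grafana/api/datasources"
}

# Usage (not executed here):
#   create_datasource http://localhost:3000 admin:admin \
#     "$(datasource_payload Prometheus http://localhost:9090)"
```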
8. Visualize node_exporter Metrics
Import the dashboard template ID 12884 to display node_exporter metrics.
9. Flink Metrics Dashboard
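Create a new dashboard, add a panel that uses the Prometheus data source, choose the Flink metrics to plot in the panel's Metrics field, and save the dashboard.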
10. Alert Configuration
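One possible way to alert on Flink metrics is a Prometheus alerting rule; this is a swapped-in illustration, not necessarily the approach the original setup used, and Grafana's own alerting is an alternative. The sketch below writes an example rule file. The metric name flink_jobmanager_numRunningJobs, the 5m duration, and the file name are assumptions to check against the metrics your cluster actually exposes, and delivering notifications additionally requires an Alertmanager.

```shell
#!/usr/bin/env bash
# Sketch: generate an example Prometheus alerting rule file for Flink.
# Metric name, threshold, and duration are assumptions, not values from the source.

# Write the rule file to the given path.
write_flink_alert_rules() {
  local path="$1"
  cat > "$path" <<'EOF'
groups:
  - name: flink
    rules:
      - alert: FlinkNoRunningJobs
        expr: flink_jobmanager_numRunningJobs == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "No Flink jobs are running"
EOF
}

# Usage (not executed here):
#   write_flink_alert_rules flink_alerts.yml
#   ./promtool check rules flink_alerts.yml   # validate before loading
#   # then list the file under rule_files: in prometheus.yml and restart Prometheus
```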
At this point the entire monitoring pipeline (data collection, storage, visualization, and alerting) is set up. For custom metrics, define them in the Flink job and follow the same steps to collect, store, display, and alert on them.