How to Build a Flink Monitoring System with Prometheus, Pushgateway, and Grafana
This guide walks you through configuring Flink metrics, installing and linking Pushgateway, node_exporter, Prometheus, and Grafana, and finally visualizing and alerting on Flink metrics, giving you a complete end-to-end monitoring solution for Flink clusters.
The previous conceptual and source-code articles introduced the idea of Flink Metrics and showed how to add, delete, and pull them. This part demonstrates, step by step, how to build a monitoring system for Flink using Prometheus + Pushgateway + Grafana.
Because Flink pushes its metrics to the Pushgateway, Prometheus does not need to scrape Flink processes directly, which may sit on different subnets or behind firewalls. Flink uses PrometheusPushGatewayReporter to push metrics to the Pushgateway, and Prometheus then scrapes the Pushgateway for a unified view.
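To make the push model concrete, here is a minimal Python sketch of what a push to the Pushgateway looks like on the wire: an HTTP PUT to `/metrics/job/<job name>` with a body in the Prometheus text exposition format. This is an illustration under stated assumptions, not how Flink's reporter is implemented; the helper names (`push_url`, `exposition_body`, `push`) and the sample metric are hypothetical.

```python
from typing import Dict
from urllib import request
from urllib.parse import quote


def push_url(host: str, port: int, job: str) -> str:
    """Build the Pushgateway endpoint for one job (the grouping key is the job name).

    Job names containing '/' need the Pushgateway's base64 path form,
    which this sketch does not handle.
    """
    return f"http://{host}:{port}/metrics/job/{quote(job, safe='')}"


def exposition_body(samples: Dict[str, float]) -> bytes:
    """Render name -> value pairs in the Prometheus text exposition format."""
    lines = [f"{name} {value}" for name, value in sorted(samples.items())]
    # The body must end with a newline.
    return ("\n".join(lines) + "\n").encode("utf-8")


def push(host: str, port: int, job: str, samples: Dict[str, float]) -> int:
    """PUT the samples, replacing whatever was previously pushed for this job."""
    req = request.Request(
        push_url(host, port, job),
        data=exposition_body(samples),
        method="PUT",
    )
    with request.urlopen(req, timeout=5) as resp:
        return resp.status
```

With a Pushgateway running, `push("localhost", 9091, "demo", {"demo_metric": 1})` should make `demo_metric` appear on the Pushgateway's `/metrics` page, where Prometheus picks it up on its next scrape.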
1. Configure Flink
Edit conf/flink-conf.yaml and add:
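<code># The name listed here must match the name used in the metrics.reporter.<name>.* keys
metrics.reporters: promgateway
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
# Host and port of the Pushgateway (9091 is its default port; 9100 belongs to node_exporter)
metrics.reporter.promgateway.host: datanode01
metrics.reporter.promgateway.port: 9091
metrics.reporter.promgateway.jobName: flink-metrics</code>
2. Install Pushgateway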
Download the appropriate version from https://prometheus.io/download/, extract it, and start it:
<code>tar zxvf pushgateway-1.4.1.linux-amd64.tar.gz
cd pushgateway-1.4.1.linux-amd64
# Listens on port 9091 by default
./pushgateway &</code>
3. Install node_exporter
Download and extract node_exporter from the same page, then start the service:
<code>./node_exporter &</code>
Access http://localhost:9100/metrics to see host metrics.
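The `/metrics` page is plain text in the Prometheus exposition format, so it is easy to inspect programmatically. The following Python sketch parses that text into a dictionary; it assumes samples carry no trailing timestamps (as is the case for node_exporter output), and the function names are hypothetical.

```python
from typing import Dict
from urllib import request


def parse_exposition(text: str) -> Dict[str, float]:
    """Map each series (name plus label set, as written) to its sample value."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Label values may contain spaces, so split the value off from the right.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_metrics(url: str = "http://localhost:9100/metrics") -> Dict[str, float]:
    """Download and parse a metrics page (requires the exporter to be running)."""
    with request.urlopen(url, timeout=5) as resp:
        return parse_exposition(resp.read().decode("utf-8"))
```

For example, `fetch_metrics()["node_load1"]` returns the one-minute load average reported by a running node_exporter on Linux.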
4. Install Prometheus
Download, extract, and edit prometheus.yml to add the following scrape jobs under scrape_configs:
<code>- job_name: 'node_exporter'
  static_configs:
    - targets: ['localhost:9100']
      labels:
        instance: 'node_exporter'
- job_name: 'pushgateway'
  static_configs:
    - targets: ['localhost:9091']
      labels:
        instance: 'pushgateway'</code>
Start Prometheus:
<code>./prometheus --config.file=prometheus.yml</code>
Visit http://localhost:9090/ to verify the service.
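Besides the web UI, both scrape jobs can be checked through the Prometheus HTTP API by running the instant query `up`, which is 1 for every target that was scraped successfully. The Python sketch below builds the query URL and lists the jobs that are down; the helper names are hypothetical, and the base URL assumes the local setup used in this guide.

```python
import json
from typing import Any, Dict, List
from urllib import request
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the /api/v1/query URL for a PromQL instant query."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def down_jobs(response: Dict[str, Any]) -> List[str]:
    """Return the job labels whose `up` sample is 0 in an instant-query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response}")
    down = []
    for series in response["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string
        if float(value) == 0.0:
            down.append(series["metric"].get("job", "<unknown>"))
    return sorted(down)


def query(base_url: str, promql: str) -> Dict[str, Any]:
    """Run the query against a live Prometheus server."""
    with request.urlopen(instant_query_url(base_url, promql), timeout=5) as resp:
        return json.load(resp)
```

With Prometheus running, `down_jobs(query("http://localhost:9090", "up"))` should return an empty list once both node_exporter and the Pushgateway are reachable.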
5. Install Grafana
Download, extract, and start Grafana:
<code>tar -zxvf grafana-8.0.3.linux-amd64.tar.gz
cd grafana-8.0.3
./bin/grafana-server</code>
Open http://localhost:3000 to access the Grafana UI.
6. Metrics Visualization
Start a Flink cluster:
<code>./bin/start-cluster.sh</code>
Run a Flink SQL client and create a data-generation table:
<code>CREATE TABLE prometheusdatagen (
  f_sequence INT,
  f_random INT,
  f_random_str STRING,
  ts AS localtimestamp,
  WATERMARK FOR ts AS ts
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5',
  'fields.f_sequence.kind' = 'sequence',
  'fields.f_sequence.start' = '1',
  'fields.f_sequence.end' = '100000',
  'fields.f_random.min' = '1',
  'fields.f_random.max' = '1000',
  'fields.f_random_str.length' = '10'
);</code>
Query the table and observe the job in the Flink UI. Flink pushes its metrics to the Pushgateway, which Prometheus scrapes.
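One way to confirm that the push path works is to look at the Pushgateway's own `/metrics` page and check for series whose names start with `flink_`, the prefix Flink's Prometheus reporters put on metric names. The Python sketch below collects the `job` labels found on those series. The function name and the sample series are illustrative assumptions. By default the reporter appends a random suffix to `jobName` (option `randomJobNameSuffix`), so expect labels that start with `flink-metrics` rather than match it exactly.

```python
import re
from typing import Set

# Matches the `job` label only, not labels such as job_name or exported_job.
_JOB_LABEL = re.compile(r'\bjob="([^"]*)"')


def flink_push_jobs(exposition_text: str) -> Set[str]:
    """Collect the `job` label values on series whose name starts with flink_."""
    jobs: Set[str] = set()
    for line in exposition_text.splitlines():
        if not line.startswith("flink_"):
            continue  # comments, Pushgateway-internal and non-Flink series
        match = _JOB_LABEL.search(line)
        if match:
            jobs.add(match.group(1))
    return jobs
```

Feed it the text returned by `http://localhost:9091/metrics`; an empty set means nothing from Flink has arrived yet, which usually points back to the reporter settings in step 1.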
7. Add Prometheus Data Source in Grafana
In Grafana, add a new data source, select Prometheus as its type, and set the URL to the address and port of the Prometheus server (http://localhost:9090 in this setup).
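The same data source can also be created through Grafana's HTTP API (`POST /api/datasources`) instead of the UI, which helps when the setup has to be repeated. The Python sketch below is one possible way to do it, assuming basic-auth credentials for a Grafana admin user; the helper names are hypothetical, and the payload fields shown are a common minimal set rather than a complete list.

```python
import base64
import json
from typing import Any, Dict
from urllib import request


def datasource_payload(name: str, prometheus_url: str) -> Dict[str, Any]:
    """Describe a Prometheus data source for Grafana's data source API."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend performs the queries
        "isDefault": True,
    }


def create_datasource(grafana_url: str, user: str, password: str,
                      payload: Dict[str, Any]) -> int:
    """POST the payload to Grafana using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with request.urlopen(req, timeout=5) as resp:
        return resp.status
```

A call might look like `create_datasource("http://localhost:3000", user, password, datasource_payload("Prometheus", "http://localhost:9090"))`, with `user` and `password` supplied by you.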
8. Visualize node_exporter Metrics
Import the dashboard template ID 12884 to display node_exporter metrics.
9. Flink Metrics Dashboard
Choose the Metrics panel in Grafana and save the dashboard.
10. Alert Configuration
The entire monitoring pipeline—data collection, storage, visualization, and alerting—has been set up. For custom metrics, define them in the Flink job and follow the same steps to collect, store, display, and alert on them.
37 Mobile Game Tech Team