
Master Blackbox Exporter: Install, Configure, and Alert with Prometheus

This guide walks through the concepts of white‑box vs black‑box monitoring, explains Prometheus Blackbox Exporter capabilities, shows step‑by‑step installation, Kubernetes configuration, probe definitions for HTTP, TCP, ICMP and SSL, and provides ready‑to‑use alert rules and Grafana dashboard integration.


Overview

In monitoring systems we usually distinguish white‑box monitoring, which focuses on a system's internal metrics, from black‑box monitoring, which observes the system from the outside through externally visible symptoms, such as a non‑responsive endpoint.

Black‑box monitoring concentrates on observable phenomena (e.g., an alarm or a non‑responsive business interface) from the user’s perspective, aiming to alert on ongoing failures.

White‑box monitoring concentrates on internal indicators (e.g., Redis info showing a slave down) to diagnose root causes of the failures observed by black‑box probes.

Blackbox Exporter

Blackbox Exporter is the official Prometheus solution for black‑box monitoring, allowing probes via HTTP, HTTPS, DNS, TCP and ICMP.

1. HTTP probe

Define request headers

Validate HTTP status, response headers and body

2. TCP probe

Check whether a port is listening

Probe application‑layer protocols on top of TCP (send a payload, validate the response)

3. ICMP probe

Host reachability (ping)

4. POST probe

Check endpoint connectivity via HTTP POST

5. SSL certificate expiration

Blackbox Exporter can also retrieve SSL certificate expiry information.
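As a minimal sketch, an HTTPS module that refuses non‑TLS targets could look like this (the module name https_2xx is illustrative; fail_if_not_ssl is a standard http prober option):

<code>modules:
  https_2xx:
    prober: http
    timeout: 10s
    http:
      method: GET
      preferred_ip_protocol: "ip4"
      fail_if_not_ssl: true  # fail the probe if the connection does not use TLS</code>

Successful TLS probes then expose the probe_ssl_earliest_cert_expiry metric used for expiry alerts.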

Install Blackbox Exporter

(1) Create a YAML manifest (blackbox-deployment.yaml):

<code>apiVersion: v1
kind: Service
metadata:
  name: blackbox
  namespace: monitoring
  labels:
    app: blackbox
spec:
  selector:
    app: blackbox
  ports:
  - port: 9115
    targetPort: 9115
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: blackbox-config
  namespace: monitoring
data:
  blackbox.yaml: |-
    modules:
      http_2xx:
        prober: http
        timeout: 10s
        http:
          valid_http_versions: ["HTTP/1.1","HTTP/2"]
          valid_status_codes: [200]
          method: GET
          preferred_ip_protocol: "ip4"
      http_post_2xx:
        prober: http
        timeout: 10s
        http:
          valid_http_versions: ["HTTP/1.1","HTTP/2"]
          valid_status_codes: [200]
          method: POST
          preferred_ip_protocol: "ip4"
      tcp_connect:
        prober: tcp
        timeout: 10s
      ping:
        prober: icmp
        timeout: 5s
        icmp:
          preferred_ip_protocol: "ip4"
      dns:
        prober: dns
        dns:
          transport_protocol: "tcp"
          preferred_ip_protocol: "ip4"
          query_name: "kubernetes.default.svc.cluster.local"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blackbox
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: blackbox
  template:
    metadata:
      labels:
        app: blackbox
    spec:
      containers:
      - name: blackbox
        image: prom/blackbox-exporter:v0.18.0
        args:
        - "--config.file=/etc/blackbox_exporter/blackbox.yaml"
        - "--log.level=error"
        ports:
        - containerPort: 9115
        volumeMounts:
        - name: config
          mountPath: /etc/blackbox_exporter
      volumes:
      - name: config
        configMap:
          name: blackbox-config</code>

(2) Apply the manifest:

<code>kubectl apply -f blackbox-deployment.yaml</code>
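To confirm the exporter is up, you can port‑forward the service and fire a manual probe (the target URL here is just an example):

<code>kubectl -n monitoring port-forward svc/blackbox 9115:9115 &
curl "http://localhost:9115/probe?module=http_2xx&target=https://www.baidu.com"</code>

A successful probe returns a metrics page ending with probe_success 1.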

Configure Monitoring

Because the cluster uses Prometheus Operator, additional scrape configurations are added via a secret.

(1) Create prometheus-additional.yaml with jobs for HTTP, DNS, ICMP, etc.:

<code>- job_name: "ingress-endpoint-status"
  metrics_path: /probe
  params:
    module: [http_2xx]  # Expect HTTP 200
  static_configs:
  - targets:
    - http://172.17.100.134/healthz
    labels:
      group: nginx-ingress
  relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: instance
  - target_label: __address__
    replacement: blackbox.monitoring:9115
- job_name: "kubernetes-service-dns"
  metrics_path: /probe
  params:
    module: [dns]
  static_configs:
  - targets:
    - kube-dns.kube-system:53
  relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: instance
  - target_label: __address__
    replacement: blackbox.monitoring:9115
- job_name: "node-icmp-status"
  ... (other jobs for ICMP, TCP, etc.)</code>
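For reference, the elided node-icmp-status job follows the same relabeling pattern as the jobs above; a sketch using the ping module and hypothetical node IPs:

<code>- job_name: "node-icmp-status"
  metrics_path: /probe
  params:
    module: [ping]
  static_configs:
  - targets:
    - 172.17.100.134  # hypothetical node IPs
    - 172.17.100.135
  relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: instance
  - target_label: __address__
    replacement: blackbox.monitoring:9115</code>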

(2) Create the secret containing the additional configuration:

<code>kubectl -n monitoring create secret generic additional-config --from-file=prometheus-additional.yaml</code>

(3) Edit prometheus-prometheus.yaml to reference the secret:

<code>additionalScrapeConfigs:
  name: additional-config
  key: prometheus-additional.yaml</code>
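For context, additionalScrapeConfigs sits under spec in the Prometheus custom resource and is a secret key selector; a minimal sketch (the resource name k8s is illustrative):

<code>apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  additionalScrapeConfigs:
    name: additional-config
    key: prometheus-additional.yaml</code>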

(4) Re‑apply the Prometheus custom resource and trigger a reload (the /-/reload endpoint only works when Prometheus runs with --web.enable-lifecycle):

<code>kubectl apply -f prometheus-prometheus.yaml
curl -X POST "http://<PROMETHEUS_IP>:9090/-/reload"</code>

ICMP Monitoring

Ping targets are defined in a job named "node-icmp-status". After reloading, the targets appear in the Prometheus UI.

HTTP Monitoring

GET probes are defined for URLs such as https://www.coolops.cn and https://www.baidu.com. After reload, the status is visible in Prometheus.

TCP Monitoring

TCP probes check ports of middleware services (e.g., 172.17.100.135:80, 172.17.100.74:3306). Results are shown after reloading.
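Such a TCP job can be sketched with the tcp_connect module and the host:port targets above (the job name is illustrative):

<code>- job_name: "middleware-port-status"
  metrics_path: /probe
  params:
    module: [tcp_connect]
  static_configs:
  - targets:
    - 172.17.100.135:80
    - 172.17.100.74:3306
  relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: instance
  - target_label: __address__
    replacement: blackbox.monitoring:9115</code>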

Alert Rules

Business health: monitor probe_success (0 = failure, 1 = success).

SSL certificate expiration: probe_ssl_earliest_cert_expiry reports the Unix timestamp at which the earliest certificate in the chain expires; subtracting time() and dividing by 86400 gives the remaining days, so it can be used to alert when fewer than 30 days remain.

<code>groups:
- name: blackbox_network_stats
  rules:
  - alert: blackbox_network_stats
    expr: probe_success == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Interface/host/port connectivity failure"
      description: "Interface/host/port {{ $labels.instance }} connectivity abnormal"
- name: check_ssl_status
  rules:
  - alert: "SSL certificate expiration warning"
    expr: (probe_ssl_earliest_cert_expiry - time())/86400 < 30
    for: 1h
    labels:
      severity: warn
    annotations:
      description: "Domain {{ $labels.instance }} certificate expires in {{ printf \"%.1f\" $value }} days"
      summary: "SSL certificate expiration warning"</code>

Grafana Dashboard

Import dashboard ID 12559 to visualize the blackbox metrics.

Tags: monitoring, Kubernetes, Alerting, Prometheus, Blackbox Exporter
Written by Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
