How to Configure Alertmanager, Add WeChat Alerts, and Enable Automatic Service Discovery in Kubernetes

This guide walks through modifying Alertmanager to use a NodePort service, decoding and editing its secret to add custom receivers and a WeChat template, recreating the secret, and extending Prometheus Operator with additional scrape configs for automatic service discovery, including RBAC adjustments and verification steps.

Full-Stack DevOps & Kubernetes

Configure Alertmanager Service

Edit /root/kube-prometheus/manifests/alertmanager-service.yaml and set spec.type to NodePort so the Alertmanager UI is reachable from a browser outside the cluster. After applying the manifest, run kubectl get svc -n monitoring to find the assigned port; the UI is then available at http://<node-ip>:<node-port> (e.g., http://172.16.0.6:31568).
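Instead of reading the port off the kubectl get svc table, it can be extracted with a JSONPath query. This is a minimal sketch, assuming the Service is named alertmanager-main (the usual kube-prometheus name, not stated above) and that its first port is the web UI; alertmanager_nodeport is a hypothetical helper name.

```shell
# Hypothetical helper: print the NodePort assigned to the Alertmanager Service.
# Assumption: the Service is called alertmanager-main and its first port is the web UI.
alertmanager_nodeport() {
  kubectl get svc alertmanager-main -n monitoring \
    -o jsonpath='{.spec.ports[0].nodePort}'
}
```

Calling it as echo "http://<node-ip>:$(alertmanager_nodeport)" would print the URL to open, with <node-ip> replaced by the address of any cluster node.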

Inspect and Decode the Alertmanager Secret

The Alertmanager configuration is stored in a Secret named alertmanager-main, defined in /root/kube-prometheus/manifests/alertmanager-secret.yaml. Copy the base64-encoded value of the alertmanager.yaml key and decode it:

echo "Imdsb2JhbCI6IAogICJyZXNvbHZlX3RpbWVvdXQiOiAiNW0iCiJyZWNlaXZlcnMiOiAKLSAibmFtZSI6ICJudWxsIgoicm91dGUiOiAKICAiZ3JvdXBfYnkiOiAKICAtICJqb2IiCiAgZ3JvdXBf..." | base64 -d
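Copying a long base64 string by hand is error-prone, so the value can also be read from the live Secret and decoded in one pipeline. The sketch below assumes the Secret has already been applied to the monitoring namespace; dump_alertmanager_config is a hypothetical helper name, and the sample YAML in the round trip is illustrative only.

```shell
# Hypothetical helper: print the decoded alertmanager.yaml straight from the live Secret.
# The backslash escapes the dot in the key name so JSONPath does not split on it.
dump_alertmanager_config() {
  kubectl get secret alertmanager-main -n monitoring \
    -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d
}

# Local round trip (no cluster needed): a Secret value is plain base64 of the file content.
sample='global:
  resolve_timeout: 5m'
encoded=$(printf '%s' "$sample" | base64 | tr -d '\n')
decoded=$(printf '%s' "$encoded" | base64 -d)
printf '%s\n' "$decoded"
```

Redirecting the helper's output, for example dump_alertmanager_config > alertmanager.yaml, gives a file that can be edited in the next step.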

Customize Alertmanager Configuration

Edit the decoded alertmanager.yaml to define global settings, templates, routing, and receivers. Example snippet:

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qiye.aliyun.com:465'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'aRXjq9W1jto^7^Zb'
  smtp_require_tls: false  # port 465 uses implicit TLS; requiring STARTTLS on top of it typically fails

templates:
- "*.tmpl"

route:
  group_by: ['job','severity']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 5m
  receiver: 'wechat'
  routes:
  - receiver: 'wechat'
    group_wait: 10s
    match:
      alertname: EtcdClusterUnavailable

receivers:
- name: 'default'
  email_configs:
  - to: '[email protected]'
    send_resolved: true
- name: 'wechat'
  wechat_configs:
  - corp_id: 'wx02f71fb3dea46c16'
    to_party: '1'
    to_user: "renzhenxin"
    agent_id: '1'
    api_secret: 'r4OGerF_p4UrIN6QERCefJRxzpI0SquNG5gHCxGxcOM'
    send_resolved: true

inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname','dev','instance']
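Before packing the edited file back into the Secret, it may be worth validating it locally, since a broken configuration is only rejected once Alertmanager tries to load it. The sketch below assumes amtool (the CLI that ships with Alertmanager) is installed on the workstation and that alertmanager.yaml is in the current directory; both helper names are hypothetical.

```shell
# Hypothetical helpers around amtool, the CLI that ships with Alertmanager.
# Assumption: amtool is on PATH and alertmanager.yaml is in the current directory.

# Parse the config and report syntax errors or undefined receivers.
check_alertmanager_config() {
  amtool check-config "${1:-alertmanager.yaml}"
}

# Print which receiver a given label set would be routed to.
# Example: which_receiver alertname=EtcdClusterUnavailable severity=critical
which_receiver() {
  amtool config routes test --config.file=alertmanager.yaml "$@"
}
```

With the routing tree shown above, every label set should resolve to the wechat receiver, because both the top-level route and the child route name it. The default email receiver is defined but never referenced by a route.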

Create a WeChat Alert Template

Save the following as wechat.tmpl. The "*.tmpl" glob in the templates section above picks it up once the file sits in the same directory as alertmanager.yaml:

{{ define "wechat.default.message" }}
{{ range .Alerts }}
========start==========
Triggered at: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
Alert source: prometheus_alert
Severity: {{ .Labels.severity }}
Alert name: {{ .Labels.alertname }}
Affected host: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Details: {{ .Annotations.description }}
========end==========
{{ end }}
{{ end }}

Re‑create the Alertmanager Secret

Delete the existing secret and create a new one that includes both alertmanager.yaml and wechat.tmpl:

kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main \
  --from-file=alertmanager.yaml \
  --from-file=wechat.tmpl -n monitoring

Verify the change by exec-ing into the Alertmanager pod and checking that /etc/alertmanager/config/wechat.tmpl exists, then confirm on the Status page of the Alertmanager UI that the new configuration has been loaded.
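Deleting the Secret first leaves a short window in which Alertmanager has no configuration to mount. One common alternative is to render the Secret locally and pipe it into kubectl apply, which creates or updates it in a single step. The sketch below assumes both files are in the current directory and that the pod is named alertmanager-main-0 with a container called alertmanager (typical for a Prometheus Operator deployment, but not stated above); both helper names are hypothetical.

```shell
# Hypothetical helper: replace the Secret in one step instead of delete + create.
# --dry-run=client renders the manifest locally; kubectl apply then creates or updates it.
apply_alertmanager_secret() {
  kubectl create secret generic alertmanager-main \
    --from-file=alertmanager.yaml \
    --from-file=wechat.tmpl \
    -n monitoring --dry-run=client -o yaml |
    kubectl apply -f -
}

# Hypothetical helper: print the template as it is mounted inside the pod.
# Assumption: the pod is alertmanager-main-0 and the container is named alertmanager.
show_mounted_template() {
  kubectl exec -n monitoring alertmanager-main-0 -c alertmanager -- \
    cat /etc/alertmanager/config/wechat.tmpl
}
```

Running apply_alertmanager_secret followed, after a short wait, by show_mounted_template should print the template if the mount has been refreshed.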

Enable Automatic Service Discovery

Add a supplemental scrape configuration that keeps only endpoints whose Service carries the annotation prometheus.io/scrape: "true":

- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name

Save this configuration as prometheus-additional.yaml, then create a Secret from it:

kubectl create secret generic additional-configs \
  --from-file=prometheus-additional.yaml -n monitoring
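The __address__ rule in the scrape configuration is the least obvious one: Prometheus joins the two source labels with ';', matches the regex against the whole joined string, and writes host:port back only on a match. The sketch below imitates that behavior with grep and sed so the rewrite can be tried locally. Because sed -E has no (?:...) or \d, the pattern is translated into an equivalent one with an extra capture group; rewrite_address is a hypothetical helper and the IP addresses are illustrative.

```shell
# Hypothetical helper mimicking the __address__ relabel rule.
# $1 = discovered __address__, $2 = value of the prometheus.io/port annotation.
# Prometheus joins the source labels with ';' and anchors the regex at both ends.
# In this ERE translation the optional port is group 2 and the annotation port is group 3.
rewrite_address() {
  joined="$1;$2"
  pattern='^([^:]+)(:[0-9]+)?;([0-9]+)$'
  if printf '%s\n' "$joined" | grep -Eq "$pattern"; then
    printf '%s\n' "$joined" | sed -E "s/$pattern/\1:\3/"
  else
    # No match: a replace action leaves the target label untouched.
    printf '%s\n' "$1"
  fi
}

rewrite_address 10.244.0.5:53 9153   # -> 10.244.0.5:9153 (discovered port replaced)
rewrite_address 10.244.0.5 9153      # -> 10.244.0.5:9153 (port appended)
rewrite_address 10.244.0.5:53 ''     # -> 10.244.0.5:53   (no annotation, unchanged)
```

The third call shows why a Service without a port annotation is still scraped on its discovered port.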

Reference the Additional Config in the Prometheus CR

Edit the Prometheus custom resource (prometheus-prometheus.yaml) and add the following under spec:

additionalScrapeConfigs:
  name: additional-configs
  key: prometheus-additional.yaml

Apply the updated manifest:

kubectl apply -f prometheus-prometheus.yaml

After a short wait, check the Prometheus UI for the new job kubernetes-service-endpoints and verify that services such as kube-dns (port 9153) appear in the targets list.
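A Service only shows up under this job if it carries the annotations the relabel rules look for. The sketch below adds them with kubectl annotate; enable_scrape is a hypothetical helper, and the kube-system namespace in the usage note is an assumption about where kube-dns lives.

```shell
# Hypothetical helper: opt a Service into the kubernetes-service-endpoints job.
# Arguments: namespace, Service name, metrics port.
# --overwrite lets the command be re-run when the annotations already exist.
enable_scrape() {
  kubectl annotate service "$2" -n "$1" --overwrite \
    prometheus.io/scrape=true \
    prometheus.io/port="$3"
}
```

For example, enable_scrape kube-system kube-dns 9153 would mark the kube-dns Service for scraping on port 9153. The prometheus.io/path and prometheus.io/scheme annotations can be added the same way when a Service does not serve plain HTTP on /metrics.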

Fix RBAC Permissions

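If the Prometheus logs show errors of the form "xxx is forbidden", the prometheus-k8s ServiceAccount lacks permission to list or watch the resources that service discovery needs. Extend the ClusterRole bound to that ServiceAccount. Example prometheus-clusterRole.yaml snippet: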

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups: [""]
  resources: [nodes, services, endpoints, pods, nodes/proxy]
  verbs: [get, list, watch]
- apiGroups: [""]
  resources: [configmaps, nodes/metrics]
  verbs: [get]
- nonResourceURLs: [/metrics]
  verbs: [get]

Apply the updated role:

kubectl apply -f prometheus-clusterRole.yaml

Once the role is applied, Prometheus can list and watch the discovered Services, Endpoints, and Pods, and scrape them without permission errors.
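Rather than waiting for the next round of log messages, the permissions can be checked directly by impersonating the ServiceAccount. This is a sketch that assumes the ServiceAccount lives in the monitoring namespace and that the caller is allowed to impersonate it (cluster-admin usually is); prometheus_can is a hypothetical helper name.

```shell
# Hypothetical helper: ask the API server whether the Prometheus ServiceAccount
# may perform a verb on a resource. kubectl prints "yes" or "no".
# Arguments: verb, resource, optional namespace to check in (defaults to "default").
prometheus_can() {
  kubectl auth can-i "$1" "$2" \
    --as=system:serviceaccount:monitoring:prometheus-k8s \
    -n "${3:-default}"
}
```

A loop such as for r in services endpoints pods; do prometheus_can list "$r"; done should print yes three times once the ClusterRole above is in effect.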
