How to Configure Alertmanager, Add WeChat Alerts, and Enable Automatic Service Discovery in Kubernetes
This guide walks through modifying Alertmanager to use a NodePort service, decoding and editing its secret to add custom receivers and a WeChat template, recreating the secret, and extending Prometheus Operator with additional scrape configs for automatic service discovery, including RBAC adjustments and verification steps.
Configure Alertmanager Service
Change /root/kube-prometheus/manifests/alertmanager-service.yaml to set type: NodePort so the Alertmanager UI can be accessed from a browser. After applying the manifest, run kubectl get svc -n monitoring to see the NodePort address (e.g., http://172.16.0.6:31568).
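As an alternative to editing the manifest by hand, the same change can be sketched as a patch. This is only an illustration: it assumes the Service is named alertmanager-main (matching the secret name used later in this guide), and the helper alertmanager_url is hypothetical. The kubectl lines are left commented out because they need a live cluster.

```shell
# JSON merge-style patch body that changes only the Service type.
nodeport_patch='{"spec":{"type":"NodePort"}}'

# Hypothetical helper: print the browser URL for a node IP and NodePort.
alertmanager_url() {
  # $1 = node IP, $2 = NodePort
  printf 'http://%s:%s\n' "$1" "$2"
}

# Example using the address quoted in the text above.
alertmanager_url 172.16.0.6 31568

# Against a live cluster (not run here; the Service name is an assumption):
# kubectl patch svc alertmanager-main -n monitoring -p "$nodeport_patch"
# kubectl get svc alertmanager-main -n monitoring -o jsonpath='{.spec.ports[0].nodePort}'
```

Note that a patch applied this way is overwritten the next time the unmodified manifest is applied, so editing the YAML file remains the durable option.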
Inspect and Decode the Alertmanager Secret
The configuration is stored in the Secret alertmanager-main, which is defined in /root/kube-prometheus/manifests/alertmanager-secret.yaml. Extract the base64-encoded alertmanager.yaml value and decode it:
echo "Imdsb2JhbCI6IAogICJyZXNvbHZlX3RpbWVvdXQiOiAiNW0iCiJyZWNlaXZlcnMiOiAKLSAibmFtZSI6ICJudWxsIgoicm91dGUiOiAKICAiZ3JvdXBfYnkiOiAKICAtICJqb2IiCiAgZ3JvdXBf..." | base64 -d
Customize Alertmanager Configuration
Edit the decoded alertmanager.yaml to define global settings, templates, routing, and receivers. Example snippet:
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qiye.aliyun.com:465'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'aRXjq9W1jto^7^Zb'
  smtp_require_tls: true
templates:
- "*.tmpl"
route:
  group_by: ['job','severity']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 5m
  receiver: 'wechat'
  routes:
  - receiver: 'wechat'
    group_wait: 10s
    match:
      alertname: EtcdClusterUnavailable
receivers:
- name: 'default'
  email_configs:
  - to: '[email protected]'
    send_resolved: true
- name: 'wechat'
  wechat_configs:
  - corp_id: 'wx02f71fb3dea46c16'
    to_party: '1'
    to_user: "renzhenxin"
    agent_id: '1'
    api_secret: 'r4OGerF_p4UrIN6QERCefJRxzpI0SquNG5gHCxGxcOM'
    send_resolved: true
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname','dev','instance']
Create a WeChat Alert Template
Save the following as wechat.tmpl and reference it in the templates section above:
{{ define "wechat.default.message" }}
{{ range .Alerts }}
========start==========
Triggered at: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
Alert source: prometheus_alert
Severity: {{ .Labels.severity }}
Alert name: {{ .Labels.alertname }}
Affected host: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Details: {{ .Annotations.description }}
========end==========
{{ end }}
{{ end }}
Re-create the Alertmanager Secret
Delete the existing secret and create a new one that includes both alertmanager.yaml and wechat.tmpl:
kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main \
  --from-file=alertmanager.yaml \
  --from-file=wechat.tmpl -n monitoring
Verify the changes by exec-ing into the Alertmanager pod and inspecting /etc/alertmanager/config/wechat.tmpl, and by checking the Status page of the Alertmanager UI.
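The verification step can be sketched roughly as follows. The helpers has_template and decode_b64 are hypothetical names introduced for illustration, the pod name alertmanager-main-0 and container name alertmanager are assumptions about how the operator names things in this setup, and the kubectl lines are commented out because they need a live cluster.

```shell
# Hypothetical pre-flight check: succeed only if FILE defines template NAME.
has_template() {
  # $1 = template file, $2 = template name
  grep -q "define \"$2\"" "$1"
}

# Decode a base64 value read from stdin, as stored under .data in a Secret.
decode_b64() {
  base64 -d
}

# Local demonstration: the first 12 characters of the encoded value shown earlier.
printf '%s' 'Imdsb2JhbCI6' | decode_b64
echo

# Against a live cluster (not run here; pod and container names are assumptions):
# has_template wechat.tmpl wechat.default.message && echo "template defined"
# kubectl get secret alertmanager-main -n monitoring -o jsonpath='{.data.wechat\.tmpl}' | decode_b64
# kubectl exec -n monitoring alertmanager-main-0 -c alertmanager -- cat /etc/alertmanager/config/wechat.tmpl
```

Checking the template locally before re-creating the secret is a cheap guard, since deleting the secret first leaves Alertmanager without a configuration if the create step fails.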
Enable Automatic Service Discovery
Add a supplemental scrape configuration that selects Services with the annotation prometheus.io/scrape=true:
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
Store this file as prometheus-additional.yaml, then create a secret:
kubectl create secret generic additional-configs \
  --from-file=prometheus-additional.yaml -n monitoring
Reference the Additional Config in the Prometheus CR
Edit the Prometheus custom resource (prometheus-prometheus.yaml) and add the following under spec:
additionalScrapeConfigs:
  name: additional-configs
  key: prometheus-additional.yaml
Apply the updated manifest:
kubectl apply -f prometheus-prometheus.yaml
After a short wait, check the Prometheus UI for the new job kubernetes-service-endpoints and verify that services such as kube-dns (port 9153) appear in the targets list.
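To make the __address__ rule in the kubernetes-service-endpoints job more concrete, the sketch below imitates it with sed. It is an adaptation for demonstration rather than Prometheus's own logic: the RE2 pattern is rewritten as POSIX ERE, the helper name relabel_address and the sample IP are hypothetical, and the kube-system namespace in the commented kubectl line is an assumption.

```shell
# Hypothetical helper: emulate the __address__ relabel rule from the
# kubernetes-service-endpoints job. Prometheus joins the source labels
# with ';' and applies an anchored regex; the pattern is rewritten here
# from RE2 to POSIX ERE, so the port capture becomes group 3.
# Unlike Prometheus, which leaves __address__ untouched when the regex
# does not match, this sketch simply echoes the joined input.
relabel_address() {
  # $1 = current __address__, $2 = value of the prometheus.io/port annotation
  printf '%s;%s\n' "$1" "$2" | sed -E 's/^([^:]+)(:[0-9]+)?;([0-9]+)$/\1:\3/'
}

# An endpoint discovered on port 53 whose Service advertises metrics on 9153.
relabel_address 10.244.0.5:53 9153

# Against a live cluster (not run here; the namespace is an assumption),
# a Service could be opted in to discovery with:
# kubectl annotate svc kube-dns -n kube-system prometheus.io/scrape=true prometheus.io/port=9153
```

The example call rewrites the discovered port to the annotated one, which is why kube-dns shows up in the targets list on port 9153 rather than on its DNS port.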
Fix RBAC Permissions
If the Prometheus logs show errors of the form "xxx is forbidden", the ServiceAccount lacks permission to read the discovered resources. Adjust the ClusterRole bound to the prometheus-k8s ServiceAccount. Example prometheus-clusterRole.yaml snippet:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups: [""]
  resources: [nodes, services, endpoints, pods, nodes/proxy]
  verbs: [get, list, watch]
- apiGroups: [""]
  resources: [configmaps, nodes/metrics]
  verbs: [get]
- nonResourceURLs: [/metrics]
  verbs: [get]
Apply the updated role:
kubectl apply -f prometheus-clusterRole.yaml
Now the Prometheus pods can scrape the newly discovered services without permission errors.
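One way to confirm the new permissions without waiting for the next scrape is to impersonate the ServiceAccount with kubectl auth can-i. The sketch below is a rough illustration: the helper sa_subject is a hypothetical name, it assumes the ServiceAccount lives in the monitoring namespace used throughout this guide, and the kubectl lines are commented out because they need a live cluster and impersonation rights.

```shell
# Hypothetical helper: build the impersonation subject for a ServiceAccount.
sa_subject() {
  # $1 = namespace, $2 = ServiceAccount name
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

subject="$(sa_subject monitoring prometheus-k8s)"
printf '%s\n' "$subject"

# Against a live cluster (not run here), each check should answer "yes":
# kubectl auth can-i list endpoints --all-namespaces --as="$subject"
# kubectl auth can-i list services --all-namespaces --as="$subject"
# kubectl auth can-i list pods --all-namespaces --as="$subject"
```

A "no" for any of these resources points to the rule that still needs widening in the ClusterRole.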
Full-Stack DevOps & Kubernetes
Focused on sharing DevOps, Kubernetes, Linux, Docker, Istio, microservices, Spring Cloud, Python, Go, databases, Nginx, Tomcat, cloud computing, and related technologies.
