How to Monitor CoreDNS in Kubernetes with Prometheus: Key Metrics & Setup

Learn how to monitor Kubernetes CoreDNS using Prometheus by exposing its metrics endpoint, configuring scrape jobs, and tracking essential metrics such as build info, request latency, error rates, traffic volume, and cache performance to ensure DNS reliability and cluster health.

MaGe Linux Operations

Sep 30, 2023

CoreDNS is the DNS add‑on for Kubernetes. It typically runs as a Deployment in the kube-system namespace and is essential for cluster operation, because in-cluster service discovery depends on it.

What is Kubernetes CoreDNS?

Starting with Kubernetes 1.11, CoreDNS replaced kube‑dns as the default DNS service. It is written in Go and provides flexible DNS functionality for the cluster.

The older kube‑dns add‑on consisted of three containers:

kubedns: the SkyDNS implementation that resolves DNS queries inside the cluster.

dnsmasq: provides DNS caching for SkyDNS.

sidecar: exports metrics and performs health checks for the DNS service.

CoreDNS consolidates these functions into a single container, fixing several issues of kube‑dns. When the prometheus plugin is enabled in the Corefile, it exposes its metrics on port 9153 by default.

How to monitor CoreDNS in Kubernetes?

CoreDNS exposes a /metrics endpoint on each pod. You can retrieve metrics with a simple curl command.

Manual endpoint access

# curl http://192.169.203.46:9153/metrics

You can also reach the endpoint via the default CoreDNS service:

# kubectl get svc -n kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   129d

# curl http://kube-dns.kube-system.svc:9153/metrics
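
The text returned by the endpoint follows the Prometheus exposition format, so it can also be inspected programmatically. The following is a minimal sketch, assuming a small hand-written sample rather than real cluster output; the helper names parse_samples and total are hypothetical and only illustrate how the lines break down into name, labels, and value.

```python
import re

# Illustrative sample only; not captured from a real cluster.
SAMPLE = '''\
# TYPE coredns_dns_requests_total counter
coredns_dns_requests_total{family="1",proto="udp",server="dns://:53",type="A",zone="."} 120
coredns_dns_requests_total{family="1",proto="udp",server="dns://:53",type="AAAA",zone="."} 30
coredns_cache_hits_total{server="dns://:53",type="success"} 90
'''

# metric_name{optional labels} value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# key="value" pairs inside the braces
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples, skipping comments and blank lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ''))
        samples.append((name, labels, float(value)))
    return samples


def total(samples, name):
    """Sum the values of every series that shares the given metric name."""
    return sum(value for n, _, value in samples if n == name)


if __name__ == '__main__':
    parsed = parse_samples(SAMPLE)
    print(total(parsed, 'coredns_dns_requests_total'))
```

In practice the SAMPLE string would be replaced by the body fetched from the /metrics endpoint shown above.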

Configure Prometheus to scrape CoreDNS metrics

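Use the endpoints role in prometheus.yml to discover the CoreDNS service. The job below keeps only endpoints whose Service carries the annotation prometheus.io/scrape: "true" and takes the scrape port from the prometheus.io/port annotation, so the kube-dns Service must have both set (with the port set to 9153).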

scrape_configs:
  - honor_labels: true
    job_name: kubernetes-service-endpoints
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - action: keep
        regex: true
        source_labels:
          - __meta_kubernetes_service_annotation_prometheus_io_scrape
      - action: drop
        regex: true
        source_labels:
          - __meta_kubernetes_service_annotation_prometheus_io_scrape_slow
      - action: replace
        regex: (https?)
        source_labels:
          - __meta_kubernetes_service_annotation_prometheus_io_scheme
        target_label: __scheme__
      - action: replace
        regex: (.+)
        source_labels:
          - __meta_kubernetes_service_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: (.+?)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
          - __address__
          - __meta_kubernetes_service_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
        replacement: __param_$1
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - action: replace
        source_labels:
          - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        source_labels:
          - __meta_kubernetes_service_name
        target_label: service
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_node_name
        target_label: node
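
The least obvious rule in this job is the one that rewrites __address__. A minimal sketch of what it does is shown below, assuming hypothetical pod addresses; rewrite_address is an illustrative helper, not part of Prometheus. It relies on two documented relabeling behaviours: source label values are joined with ";" by default, and the regex is anchored at both ends.

```python
import re

# Same pattern as the __address__ relabel rule. Prometheus anchors relabel
# regexes at both ends, which re.fullmatch reproduces here.
ADDRESS_RE = re.compile(r'(.+?)(?::\d+)?;(\d+)')


def rewrite_address(address, port_annotation):
    """Mimic the replace action with replacement $1:$2.

    The two source labels are joined with ';'. If the regex does not match
    (for example, the port annotation is empty), the address is left as is.
    """
    joined = f"{address};{port_annotation}"
    match = ADDRESS_RE.fullmatch(joined)
    if match is None:
        return address
    return f"{match.group(1)}:{match.group(2)}"


if __name__ == '__main__':
    # Hypothetical endpoint discovered on the DNS port, annotation set to 9153.
    print(rewrite_address("10.244.0.5:53", "9153"))
    # No port annotation: the discovered address is kept.
    print(rewrite_address("10.244.0.5:53", ""))
```

The effect is that an endpoint discovered on the DNS port is scraped on the annotated metrics port instead.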

After redeploying the Prometheus pod, the CoreDNS metrics endpoint appears under Status → Targets in the Prometheus UI.

Which metrics should you check?

Note: Metrics may vary by Kubernetes and CoreDNS versions. The examples use Kubernetes 1.25 and CoreDNS 1.9.3.

CoreDNS replica count: monitor it with coredns_build_info:

count(coredns_build_info)

Four key signal categories are recommended:

Errors

Track DNS response codes, especially SERVFAIL and REFUSED, using coredns_dns_responses_total:

sum(rate(coredns_dns_responses_total{instance=~".*"}[2m])) by (rcode, instance)
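
Per-rcode rates are easier to alert on when expressed as a share of all responses. The sketch below approximates that arithmetic from two counter readings; the readings are hypothetical, and per_second_rate is a simplification that ignores the extrapolation Prometheus applies in rate().

```python
def per_second_rate(first, last, window_seconds):
    """Approximate a counter's per-second rate over a window.

    A drop in the counter is treated as a restart from zero.
    """
    delta = last - first if last >= first else last
    return delta / window_seconds


def rcode_share(rates_by_rcode, rcode):
    """Fraction of all responses that carried the given rcode."""
    total = sum(rates_by_rcode.values())
    if total == 0:
        return 0.0
    return rates_by_rcode.get(rcode, 0.0) / total


# Hypothetical counter readings taken 120 seconds apart.
before = {"NOERROR": 1000, "NXDOMAIN": 200, "SERVFAIL": 10}
after = {"NOERROR": 2140, "NXDOMAIN": 248, "SERVFAIL": 22}
rates = {rc: per_second_rate(before[rc], after[rc], 120) for rc in before}

if __name__ == '__main__':
    print(f"SERVFAIL share: {rcode_share(rates, 'SERVFAIL'):.2%}")
```

A sustained rise in the SERVFAIL or REFUSED share is usually a more useful signal than the raw rate, which also grows with ordinary traffic.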

Latency

Measure request duration with coredns_dns_request_duration_seconds_bucket and compute the 99th percentile:

histogram_quantile(0.99, sum(rate(coredns_dns_request_duration_seconds_bucket{instance=~".*"}[2m])) by (server,zone,le,instance))
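
histogram_quantile does not see individual request durations; it estimates the quantile by linear interpolation inside the bucket that contains the target rank. The following is a simplified sketch of that interpolation, not Prometheus's implementation, and the bucket values are made up for illustration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets: iterable of (upper_bound, cumulative_count), including +Inf.
    Only 0 < q <= 1 is supported in this sketch.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(buckets, key=lambda b: b[0])
    if len(ordered) < 2 or not math.isinf(ordered[-1][0]):
        return math.nan
    total = ordered[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for index, (upper, count) in enumerate(ordered):
        if count >= rank:
            break

    if math.isinf(upper):
        # The quantile lies beyond the last finite bound; report that bound.
        return ordered[-2][0]
    if index == 0:
        if upper <= 0:
            return upper
        lower, previous_count = 0.0, 0.0
    else:
        lower, previous_count = ordered[index - 1]
    # Assume observations are spread evenly inside the bucket.
    fraction = (rank - previous_count) / (count - previous_count)
    return lower + (upper - lower) * fraction


if __name__ == '__main__':
    # Hypothetical cumulative counts per upper bound (seconds).
    sample = [(0.001, 50), (0.002, 80), (0.004, 95), (0.008, 99), (math.inf, 100)]
    print(histogram_quantile(0.9, sample))
```

Because the result is interpolated, its precision is limited by the bucket boundaries: a reported p99 can only be as accurate as the width of the bucket it falls in.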

Traffic

Monitor total DNS request count with coredns_dns_requests_total, optionally separating A and AAAA queries:

(sum(rate(coredns_dns_requests_total{instance=~".*"}[2m])) by (type,instance))

Saturation

Observe resource usage (CPU, memory, network) of CoreDNS pods to detect saturation.

Other

CoreDNS implements caching through its cache plugin (the maximum TTL defaults to 3600 s, and the Corefile can set a lower cap). Monitor cache hits with coredns_cache_hits_total:

sum(rate(coredns_cache_hits_total{instance=~".*"}[2m])) by (type,instance)
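
A hit rate on its own says little without the miss rate next to it. The sketch below computes a hit ratio from the two rates, assuming your CoreDNS version also exposes coredns_cache_misses_total (check the /metrics output to confirm); the function name and input values are hypothetical.

```python
def cache_hit_ratio(hit_rate, miss_rate):
    """Return hits / (hits + misses), or None when there were no lookups."""
    lookups = hit_rate + miss_rate
    if lookups == 0:
        return None
    return hit_rate / lookups


if __name__ == '__main__':
    # Hypothetical per-second rates for hits and misses.
    print(cache_hit_ratio(90.0, 10.0))
```

A ratio that drops while traffic stays flat suggests the cache is being bypassed or evicted more often, which increases load on upstream resolvers.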

Conclusion

CoreDNS is the preferred DNS solution for Kubernetes clusters, offering flexibility and addressing many shortcomings of kube‑dns. Continuous monitoring of its metrics ensures DNS reliability, which is critical for the health of applications and the overall cluster.
