
Master DNS Operations: Deploy BIND & CoreDNS with Real‑World Troubleshooting

This guide walks you through DNS fundamentals, compares BIND, CoreDNS, PowerDNS and Unbound, provides step‑by‑step installation and configuration scripts for BIND 9 and CoreDNS on Linux and Kubernetes, explains caching, DNSSEC, security hardening, high‑availability designs, monitoring, backup and recovery, and shares best‑practice tips for production environments.

MaGe Linux Operations

Overview

The Domain Name System (DNS) is a critical Internet service that maps human‑readable domain names to IP addresses. A single DNS outage can make every service behind those names unreachable, so reliable deployment, security hardening, and observability are essential.

Technical comparison

BIND 9.20.x – Full‑stack authoritative and recursive server, native DNSSEC support, extensive documentation. Ideal for traditional data‑center environments.

CoreDNS 1.12.x – Cloud‑native DNS written in Go, uses a Caddy‑style Corefile for configuration, rich plugin ecosystem, default DNS for Kubernetes clusters.

PowerDNS 4.9.x – Authoritative server with database back‑ends (MySQL, PostgreSQL, LDAP). Suited for large‑scale zone management via API.

Unbound 1.22.x – Lightweight recursive resolver with a small memory footprint, perfect for caching or privacy‑focused DNS.

Deployment steps

BIND 9.20.x deployment and configuration

Installation (Ubuntu/Debian):

# Ubuntu / Debian
sudo apt update && sudo apt install -y bind9 bind9-utils bind9-dnsutils

# Verify version
named -v   # Prints the installed version; expect BIND 9.20.x only if your distribution packages that release

Installation (CentOS/RHEL):

# CentOS / RHEL
sudo dnf install -y bind bind-utils

Global options ( /etc/bind/named.conf.options ) – key parameters for a production recursive resolver:

options {
    directory "/var/cache/bind";
    listen-on { 192.168.1.10; 127.0.0.1; };
    listen-on-v6 { none; };
    recursion yes;
    allow-recursion { 10.0.0.0/8; 172.16.0.0/12; 192.168.0.0/16; 127.0.0.1; };
    forwarders { 223.5.5.5; 119.29.29.29; };
    forward only;
    dnssec-validation auto;
    rate-limit { responses-per-second 10; window 5; slip 2; errors-per-second 5; nxdomains-per-second 5; log-only no; };
    version "not disclosed";
    hostname none;
    max-cache-size 512m;
    max-cache-ttl 3600;
    max-ncache-ttl 300;
    prefetch 2 9;
    allow-transfer { none; };
    allow-query { any; };
};
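
The allow-recursion list above decides which clients may use the resolver; anything outside it is answered with REFUSED. As a rough illustration, the shell sketch below tests an IPv4 address against the same four networks. The function names (ip_to_int, in_cidr, allowed_recursion) are hypothetical helpers, not part of BIND, and the check covers IPv4 only.

```shell
# Hypothetical helpers (not part of BIND): test whether an IPv4 client
# address falls inside the allow-recursion networks from named.conf.options.

# Convert a dotted quad to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP NETWORK/PREFIX: succeed when IP lies inside the network.
in_cidr() {
    net=${2%/*}
    bits=${2#*/}
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Networks copied from the allow-recursion statement above.
allowed_recursion() {
    for cidr in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 127.0.0.1/32; do
        if in_cidr "$1" "$cidr"; then
            return 0
        fi
    done
    return 1
}

for client in 192.168.1.55 172.32.0.1; do
    if allowed_recursion "$client"; then
        echo "$client: recursion allowed"
    else
        echo "$client: would be REFUSED"
    fi
done
```

With these networks, 192.168.1.55 is accepted and 172.32.0.1 is not, because 172.16.0.0/12 ends at 172.31.255.255. A check like this can help when a REFUSED response needs to be traced back to the ACL.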

Zone definitions ( /etc/bind/named.conf.local ) for a forward zone and a reverse zone:

include "/etc/bind/transfer.key";

zone "example.com" {
    type master;
    file "/var/lib/bind/db.example.com";
    allow-transfer { key "transfer-key"; };
    also-notify { 192.168.1.11; };
    notify yes;
    dnssec-policy default;
    inline-signing yes;
    key-directory "/var/lib/bind/keys/";
};

zone "1.168.192.in-addr.arpa" {
    type master;
    file "/var/lib/bind/db.192.168.1";
    allow-transfer { key "transfer-key"; };
    also-notify { 192.168.1.11; };
};
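
The reverse zone name above is the network's octets in reverse order followed by in-addr.arpa. A minimal sketch of that mapping for a /24 network follows; reverse_names is a hypothetical helper introduced only for illustration.

```shell
# Hypothetical helper: derive the /24 reverse zone name and the PTR owner
# name for an IPv4 address by reversing its octets.
reverse_names() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four octets
    IFS=$old_ifs
    echo "zone $3.$2.$1.in-addr.arpa"
    echo "ptr $4.$3.$2.$1.in-addr.arpa."
}

reverse_names 192.168.1.10
```

For 192.168.1.10 the first line names the zone declared above (1.168.192.in-addr.arpa) and the second is the fully qualified owner name a PTR record for that host would use.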

Sample forward zone file ( /var/lib/bind/db.example.com ) :

$TTL 3600
@   IN  SOA ns1.example.com. admin.example.com. (
        2026022601 ; serial (YYYYMMDDNN)
        3600       ; refresh
        900        ; retry
        604800     ; expire
        300        ; negative cache TTL
    )
    IN  NS  ns1.example.com.
    IN  NS  ns2.example.com.
ns1 IN  A   192.168.1.10
ns2 IN  A   192.168.1.11
@   IN  A   192.168.1.100
www IN  CNAME @
mail IN  A   192.168.1.20
api IN  A   192.168.1.101
db-master IN A 192.168.1.30
db-slave  IN A 192.168.1.31
@   IN  MX 10 mail.example.com.
@   IN  TXT "v=spf1 mx ip4:192.168.1.0/24 ~all"
; The SRV target must have an A/AAAA record; sip.example.com is not defined in this sample
_sip._tcp IN SRV 10 60 5060 sip.example.com.

Configuration validation – syntax and zone checks:

# Syntax check
sudo named-checkconf
# Zone check
sudo named-checkzone example.com /var/lib/bind/db.example.com
sudo named-checkzone 1.168.192.in-addr.arpa /var/lib/bind/db.192.168.1

Enable and start the service :

sudo systemctl enable --now named
sudo systemctl status named

CoreDNS 1.12.x deployment and configuration

Binary installation (Linux AMD64):

# Download CoreDNS 1.12.x
COREDNS_VERSION="1.12.0"
wget https://github.com/coredns/coredns/releases/download/v${COREDNS_VERSION}/coredns_${COREDNS_VERSION}_linux_amd64.tgz
tar -xzf coredns_${COREDNS_VERSION}_linux_amd64.tgz
sudo mv coredns /usr/local/bin/
sudo chmod +x /usr/local/bin/coredns
coredns -version

Corefile ( /etc/coredns/Corefile ) – internal zone, caching, forwarding to public resolvers, and Prometheus metrics:

# Internal zone
example.com {
    file /etc/coredns/zones/db.example.com
    log
    errors
    prometheus :9153
    cache 300 {
        success 9984 300
        denial 9984 60
    }
}

# Forward all other queries to upstream DNS over TLS
. {
    forward . tls://223.5.5.5 tls://223.6.6.6 {
        tls_servername dns.alidns.com
        health_check 5s
        policy round_robin
    }
    cache 600
    health 0.0.0.0:8080
    ready 0.0.0.0:8181
    prometheus :9153
    log
    errors
}

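systemd service ( /etc/systemd/system/coredns.service ) – runs CoreDNS as an unprivileged coredns user and group, which must already exist; CAP_NET_BIND_SERVICE lets it bind port 53: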

[Unit]
Description=CoreDNS DNS Server
Documentation=https://coredns.io
After=network.target

[Service]
Type=simple
User=coredns
Group=coredns
ExecStart=/usr/local/bin/coredns -conf /etc/coredns/Corefile
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
AmbientCapabilities=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target

Kubernetes integration – ConfigMap that adds cluster DNS, internal corporate zone, and public DoT forwarding:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    # Cluster DNS
    cluster.local:53 {
        errors
        health {
            lameduck 5s
        }
        ready
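        # Note: this server block only receives cluster.local queries; PTR queries
        # under in-addr.arpa / ip6.arpa are routed to the .:53 block below.
        kubernetes cluster.local in-addr.arpa ip6.arpa {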
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        cache 30
        loop
        reload
        loadbalance
    }
    # Corporate internal zone
    example.com:53 {
        errors
        forward . 192.168.1.10 192.168.1.11 {
            policy round_robin
            health_check 5s
        }
        cache 120
        prometheus :9153
    }
    # Public DNS with DoT
    .:53 {
        errors
        forward . tls://223.5.5.5 tls://223.6.6.6 {
            tls_servername dns.alidns.com
            health_check 10s
            max_concurrent 2000
        }
        cache 600 {
            success 9984 600
            denial 9984 60
            serve_stale 1h
        }
        prometheus :9153
        loop
        reload
    }

DNSSEC configuration

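DNSSEC adds cryptographic signatures to DNS data, creating a chain of trust: root KSK → root ZSK → DS for .com → .com keys → DS for example.com → example.com KSK → example.com ZSK → RRSIGs on the zone's records. The chain is complete only once the zone's DS record is published in the parent zone. BIND 9.20.x can sign zones automatically with dnssec-policy and inline signing.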

# /etc/bind/named.conf.local – enable automatic signing
zone "example.com" {
    type master;
    file "/var/lib/bind/db.example.com";
    dnssec-policy default;
    inline-signing yes;
    key-directory "/var/lib/bind/keys/";
};

# Custom policy example (ECDSA‑P256 keys, NSEC3 without iterations)
# Reference it from a zone with: dnssec-policy "corp-policy";
dnssec-policy "corp-policy" {
    keys {
        ksk key-directory lifetime unlimited algorithm ecdsap256sha256;
        zsk key-directory lifetime 90d algorithm ecdsap256sha256;
    };
    nsec3param iterations 0 optout no salt-length 0;
};

Verification of signatures:

# Verify DNSKEY set
dig @192.168.1.10 example.com DNSKEY +dnssec
# Verify an A record with RRSIG
dig @192.168.1.10 www.example.com A +dnssec
# Check RRSIG on the SOA record
dig @192.168.1.10 example.com SOA +dnssec | grep RRSIG
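
Each RRSIG printed by the commands above carries an expiration timestamp (the fifth value of the record data, in YYYYMMDDHHMMSS UTC). The sketch below shows one way to flag signatures that have already expired. expired_rrsigs is a hypothetical helper and the sample record is made up for illustration, not taken from a real zone.

```shell
# Hypothetical helper: list RRSIGs in `dig +dnssec` output whose expiration
# timestamp (field 9 of the line, YYYYMMDDHHMMSS in UTC) is earlier than "now".
# $1: dig output, $2: optional reference time (defaults to the current UTC time).
expired_rrsigs() {
    now=${2:-$(date -u +%Y%m%d%H%M%S)}
    printf '%s\n' "$1" | awk -v now="$now" \
        '$4 == "RRSIG" && $9 + 0 < now + 0 { print $5 " signature expired at " $9 }'
}

# Made-up sample line in dig's presentation format (signature data shortened).
sample='example.com. 3600 IN RRSIG SOA 13 2 3600 20260312000000 20260226000000 12345 example.com. c2FtcGxl'

expired_rrsigs "$sample" 20260401000000
```

Against a live server the first argument could be the captured output of one of the dig commands above, with the second argument omitted so the current time is used. An empty result means no signature in the output has expired.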

Best practices and caveats

Separate recursive and authoritative servers in production to avoid cache pollution and to allow independent scaling.

Adopt a multi‑layer cache architecture: client → local cache (systemd‑resolved/dnsmasq) → zone cache (CoreDNS) → recursive resolver (BIND/Unbound) → authoritative server.

Restrict recursion to trusted networks using allow-recursion ACLs; enable rate-limit to mitigate DNS amplification attacks.

Maintain strictly increasing SOA serial numbers (format YYYYMMDDNN); secondaries only pull a zone transfer when the primary's serial is higher than their own.

Use sensible TTLs (300–3600 s); avoid values below 60 s unless required for rapid failover.

Enable DNSSEC validation (default in BIND 9.20) and keep system time synchronized via NTP.

For high availability, consider BIND master/slave with a Keepalived VIP, CoreDNS multi‑replica Service in Kubernetes, or Anycast routing with BGP.
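
The YYYYMMDDNN serial convention recommended above can be automated so that an edit never reuses or lowers a serial. The following is a minimal sketch; next_serial is a hypothetical helper, and it assumes the current serial already follows the YYYYMMDDNN format.

```shell
# Hypothetical helper: print the next YYYYMMDDNN serial.
# $1: current serial, $2: optional date as YYYYMMDD (defaults to today).
next_serial() {
    current=$1
    today=${2:-$(date +%Y%m%d)}
    if [ "$current" -lt "${today}00" ]; then
        echo "${today}01"          # first change of the day
    else
        echo $(( current + 1 ))    # later change: keep counting upward
    fi
}

next_serial 2026022601 20260226
next_serial 2026022601 20260227
```

The first call represents a second edit on the same day and yields 2026022602; the second call represents the first edit on the following day and yields 2026022701. More than 99 edits in one day would spill into the next day's range, but the serial would still increase.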

Fault diagnosis and monitoring

Log access

# Enable query logging in BIND
rndc querylog on
# Follow BIND logs
sudo journalctl -u named -f --no-pager
# CoreDNS logs (enable in Corefile with "log")
kubectl -n kube-system logs -l k8s-app=kube-dns -f --tail=100

Common issues

SERVFAIL – check upstream reachability, DNSSEC validation status, or zone file syntax.

REFUSED – client IP not covered by allow-recursion ACL.

Zone transfer failures – mismatched TSIG keys, missing allow-transfer permission, or firewall blocks on TCP 53.

DNSSEC failures – expired signatures, missing trust anchors, or unsynchronized clocks.

Cache staleness – flush BIND's cache with rndc flush, or tune max-cache-ttl and max-ncache-ttl in BIND and serve_stale in the CoreDNS cache plugin.
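
The response codes in the list above appear in the header line that dig prints. As a small illustration, the sketch below extracts the status and echoes the corresponding first check. dns_hint is a hypothetical helper, and the header passed to it is a sample string rather than live output.

```shell
# Hypothetical helper: read the status from dig's header line and print the
# matching hint from the list of common issues.
dns_hint() {
    status=$(printf '%s\n' "$1" | sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | sed -n 1p)
    case "$status" in
        SERVFAIL) echo "SERVFAIL: check upstream reachability, DNSSEC validation, zone file syntax" ;;
        REFUSED)  echo "REFUSED: client IP is probably outside the allow-recursion ACL" ;;
        NOERROR)  echo "NOERROR: the server reported no failure" ;;
        *)        echo "no hint for status: ${status:-unknown}" ;;
    esac
}

# Sample header line in the format dig prints.
dns_hint ';; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 4242'
```

In practice the argument would be the full captured output of a dig query; only the first status line found is used.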

Performance monitoring

Expose BIND statistics via statistics-channels { inet 127.0.0.1 port 8053; }; and scrape with bind_exporter (Prometheus).

CoreDNS exposes native Prometheus metrics through the prometheus plugin. Its default listen address is localhost:9153; the Corefiles above bind it to port 9153 on all interfaces.

Key alerts: QPS > 80 % of capacity, cache hit rate < 70 %, SERVFAIL ratio > 1 %, zone‑transfer failures, recursive timeout > 2 %.
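
Two of the thresholds above (cache hit rate below 70 % and SERVFAIL ratio above 1 %) reduce to simple ratios over counters. The sketch below evaluates them with integer arithmetic in basis points (1 % = 100) so that fractions of a percent are not lost. The function names and the counter values passed in are hypothetical; in a real setup the numbers would come from the Prometheus metrics.

```shell
# Hypothetical helpers: evaluate two alert thresholds from raw counters.

# ratio_bp PART TOTAL: ratio in basis points (10000 = 100 %), 0 if TOTAL is 0.
ratio_bp() {
    if [ "$2" -eq 0 ]; then
        echo 0
    else
        echo $(( $1 * 10000 / $2 ))
    fi
}

# check_dns_alerts CACHE_HITS CACHE_MISSES SERVFAIL_COUNT TOTAL_RESPONSES
check_dns_alerts() {
    hits=$1
    misses=$2
    servfail=$3
    total=$4
    lookups=$(( hits + misses ))
    hit_bp=$(ratio_bp "$hits" "$lookups")
    servfail_bp=$(ratio_bp "$servfail" "$total")
    if [ "$lookups" -gt 0 ] && [ "$hit_bp" -lt 7000 ]; then
        echo "ALERT cache hit rate below 70%"
    fi
    if [ "$servfail_bp" -gt 100 ]; then
        echo "ALERT SERVFAIL ratio above 1%"
    fi
}

# Made-up counters: 60 % hit rate and 1.5 % SERVFAIL, so both alerts fire.
check_dns_alerts 600 400 30 2000
```

In production these rules would more likely live in Prometheus alerting rules; the shell version only makes the arithmetic behind the thresholds explicit.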

Backup and recovery

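Backup script (BIND) – creates a timestamped archive of configuration, keys, and zone files, records current SOA serials, and retains 30 days of backups. The paths follow the RHEL layout (/etc/named.conf, /var/named); on Debian/Ubuntu adjust them to /etc/bind and /var/lib/bind: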

#!/bin/bash
set -euo pipefail

BACKUP_DIR="/data/backup/dns"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_PATH="${BACKUP_DIR}/${DATE}"
RETAIN_DAYS=30

mkdir -p "${BACKUP_PATH}"

# Config and keys
cp -a /etc/named.conf "${BACKUP_PATH}/"
cp -a /etc/named/ "${BACKUP_PATH}/named-etc/"
cp -a /var/named/zones/ "${BACKUP_PATH}/zones/"
cp -a /etc/named/keys/ "${BACKUP_PATH}/keys/"

# Freeze dynamic zones, copy, then thaw
rndc freeze
cp -a /var/named/dynamic/ "${BACKUP_PATH}/dynamic/" 2>/dev/null || true
rndc thaw

# Record SOA serials for verification
for zone_file in "${BACKUP_PATH}"/zones/*.zone; do
    [ -e "${zone_file}" ] || continue   # skip when the glob matched nothing
    zone_name=$(basename "${zone_file}" .zone)
    dig @127.0.0.1 "${zone_name}" SOA +short >> "${BACKUP_PATH}/soa-serials.txt"
done

# Create compressed archive
tar -czf "${BACKUP_DIR}/dns-backup-${DATE}.tar.gz" -C "${BACKUP_DIR}" "${DATE}"
rm -rf "${BACKUP_PATH}"
find "${BACKUP_DIR}" -name "dns-backup-*.tar.gz" -mtime +${RETAIN_DAYS} -delete

echo "Backup completed: ${BACKUP_DIR}/dns-backup-${DATE}.tar.gz"

Recovery procedure – restore the latest backup, verify syntax, fix permissions, and restart the service:

# Stop the failed node
sudo systemctl stop named

# Extract the latest backup (the archive holds one top-level timestamp directory)
BACKUP_FILE="/data/backup/dns/dns-backup-20250126_020000.tar.gz"
mkdir -p /tmp/dns-restore
tar -xzf "${BACKUP_FILE}" -C /tmp/dns-restore/ --strip-components=1

# Restore configuration and zones
sudo cp -a /tmp/dns-restore/named.conf /etc/named.conf
sudo cp -a /tmp/dns-restore/named-etc/* /etc/named/
sudo cp -a /tmp/dns-restore/zones/* /var/named/zones/
sudo cp -a /tmp/dns-restore/keys/* /etc/named/keys/

# Validate syntax
named-checkconf /etc/named.conf
for zone_file in /var/named/zones/*.zone; do
    zone_name=$(basename "${zone_file}" .zone)
    named-checkzone "${zone_name}" "${zone_file}"
done

# Fix permissions and start service
sudo chown -R named:named /var/named/zones/ /etc/named/keys/
sudo systemctl start named

# Verify
dig @127.0.0.1 example.com A +short
rndc status
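
The backup script stores the output of dig +short SOA in soa-serials.txt, so a restore can be checked by comparing a stored line with what the restarted server answers. The sketch below shows that comparison on two sample strings. soa_serial and serial_not_older are hypothetical helpers, and the plain numeric comparison ignores serial-number wraparound.

```shell
# Hypothetical helpers: compare SOA serials in `dig +short SOA` output.
# The serial is the third field of the SOA record data.
soa_serial() {
    printf '%s\n' "$1" | awk 'NF >= 7 { print $3; exit }'
}

# serial_not_older BACKUP_SOA LIVE_SOA: succeed when the live serial is not
# lower than the backed-up one.
serial_not_older() {
    [ "$(soa_serial "$2")" -ge "$(soa_serial "$1")" ]
}

# Sample SOA strings; real ones would come from soa-serials.txt and a live dig.
backup='ns1.example.com. admin.example.com. 2026022601 3600 900 604800 300'
live='ns1.example.com. admin.example.com. 2026022601 3600 900 604800 300'

if serial_not_older "$backup" "$live"; then
    echo "serial OK: $(soa_serial "$live")"
else
    echo "serial regressed: $(soa_serial "$live") < $(soa_serial "$backup")"
fi
```

A regressed serial after a restore is worth investigating before secondaries are expected to sync, since they will not transfer a zone whose serial is lower than the one they already hold.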

Summary

The article demonstrates a production‑grade DNS architecture that combines BIND for authoritative services and recursive resolution with CoreDNS for cloud‑native caching and Kubernetes integration. It covers installation, configuration, DNSSEC signing, security hardening, performance tuning, monitoring, backup, and disaster recovery, providing a complete reference for reliable DNS operations.
