
Why Your Kubernetes Pod Can't Reach the Server: DNS Search Domain Pitfalls and Fixes

An agent service running in a Kubernetes pod appeared healthy, but its server never received heartbeats. The cause was an unintended 'HOST' entry in the pod's DNS search list, which made the server's name resolve to the wrong IP. This article walks through the investigation, explains how Kubernetes DNS resolution works, and shows how lowering ndots or using fully qualified names resolves the issue.

Ops Development Stories

1. Fault Phenomenon

We deployed an agent service to a Kubernetes cluster; the pod status is Running, but the server never receives heartbeat signals. Logs show many "tcp timeout" messages when trying to connect to a specific IP address.

2. Fault Investigation Process

Log analysis revealed numerous I/O timeout errors when connecting to the IP. The service only contacts the server, whose domain name is set via an environment variable. The server is reachable from the host and the node, so the issue is not the server itself.

Testing from inside the pod confirms that the server cannot be reached, even though ping does resolve its domain name. The IP that ping resolves is not the server's real IP, which points to a DNS resolution problem. Running nslookup (after installing dnsutils or bind-utils) shows that the name in the answer is not the one we asked for: it carries a trailing HOST suffix.

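Inspecting /etc/resolv.conf reveals that the search list includes HOST. Because the server's domain name has fewer dots than the resolver's ndots threshold (explained in the next section), the resolver appends each search suffix, including HOST, before it tries the name as written.

Any name ending in HOST resolves to an unexpected IP because HOST is a real top-level domain and, as observed here, it answers such queries with a wildcard record.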
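The resolv.conf check above can be scripted so that a stray search domain is flagged automatically. The following is a minimal sketch, not the authors' tooling: the sample file content is a hypothetical reconstruction (the article does not show the faulty file), and `cluster.local` is assumed to be the cluster domain.

```python
# Hypothetical reconstruction of the faulty file; the original article does not show it.
SAMPLE = """\
nameserver 10.68.0.2
search devops.svc.cluster.local svc.cluster.local cluster.local HOST
options ndots:5
"""


def parse_resolv_conf(text):
    """Extract nameservers, the search list, and ndots from resolv.conf text."""
    config = {"nameservers": [], "search": [], "ndots": 1}  # resolver default ndots is 1
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#;":  # skip blank lines and comments
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameservers"].append(values[0])
        elif keyword == "search":
            # A later search line replaces an earlier one.
            config["search"] = [v.rstrip(".") for v in values]
        elif keyword == "options":
            for option in values:
                if option.startswith("ndots:"):
                    config["ndots"] = int(option.split(":", 1)[1])
    return config


def unexpected_search_domains(config, cluster_domain="cluster.local"):
    """Return search domains that do not belong to the assumed cluster domain."""
    return [
        domain
        for domain in config["search"]
        if domain != cluster_domain and not domain.endswith("." + cluster_domain)
    ]


print(unexpected_search_domains(parse_resolv_conf(SAMPLE)))  # ['HOST']
```

Inside a real pod, the same functions could be fed the content of /etc/resolv.conf instead of the sample string.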

3. Fault Cause Analysis

Understanding how Kubernetes resolves service names is essential. Inside a pod, DNS queries are sent to the cluster’s kube-dns (or coredns) service IP (e.g., 10.68.0.2) as defined in /etc/resolv.conf. The file typically contains:

nameserver 10.68.0.2
search devops.svc.cluster.local. svc.cluster.local. cluster.local.
options ndots:5

For intra-namespace service calls, a short name like b is expanded with the search list and resolves on the first attempt as b.devops.svc.cluster.local. External domains go through the same search list first unless they contain at least ndots dots.

When a name has fewer than five dots, the resolver appends each search suffix in turn and only then tries the name as written, generating several DNS queries. Packet captures with tcpdump for baidu.com and for a.b.c.d.com (four dots, still below the threshold) demonstrate this: each capture shows three search-list lookups before the original name is queried. A name with five or more dots is queried directly as an absolute name first.
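The lookup order described above can be modelled in a few lines. This is a simplified sketch of glibc-style resolver behaviour, not resolver source code; other resolver implementations may order or limit the attempts differently, and the function name is illustrative.

```python
SEARCH = ["devops.svc.cluster.local", "svc.cluster.local", "cluster.local"]


def candidate_queries(name, search, ndots):
    """Return the names a glibc-style resolver would try, in order.

    The resolver stops at the first name that returns an answer.
    """
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search-list expansion.
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{suffix.rstrip('.')}." for suffix in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as written first, then the search list.
        return [absolute] + expanded
    # Too few dots: walk the search list first, then try the name as written.
    return expanded + [absolute]


for query in candidate_queries("baidu.com", SEARCH, ndots=5):
    print(query)
```

With ndots:5, baidu.com yields three search-list names followed by `baidu.com.`, which matches the three extra lookups seen in the captures. If one of the search suffixes answers every query, as HOST did in this incident, the resolver never reaches the real name.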

Two optimization strategies are presented:

Optimization 1: Use Fully Qualified Domain Names

Appending a trailing dot (e.g., a.b.c.com.) forces the resolver to treat the name as absolute, bypassing the search list and eliminating extra DNS queries.

nslookup a.b.c.com.
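Since the agent reads the server's domain from an environment variable, the trailing dot can also be applied in code before connecting. The sketch below assumes a hypothetical variable name `SERVER_HOST` and a hypothetical helper `as_absolute`; neither comes from the original service.

```python
import ipaddress
import os


def as_absolute(host):
    """Return host with a trailing dot so the resolver skips the search list."""
    try:
        ipaddress.ip_address(host)
        return host  # IP literals are never expanded with search domains
    except ValueError:
        pass
    return host if host.endswith(".") else host + "."


# SERVER_HOST is an assumed variable name, used here only for illustration.
server = as_absolute(os.environ.get("SERVER_HOST", "a.b.c.com"))
print(server)
```

One caveat worth testing before adopting this: some clients pass the trailing dot on to TLS or HTTP Host handling, so it is safest to confirm that the agent's protocol tolerates it.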

Optimization 2: Adjust ndots Value

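Kubernetes sets ndots to 5 for pods that use the cluster DNS (the resolver's own default is 1). Lowering it (e.g., to 2) means any name with at least two dots is tried as an absolute name first, and the search list is applied only if that lookup fails. This can be set per workload via dnsConfig in the pod spec: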

spec:
  containers:
  - name: srv-inner-proxy
    image: xxx/devops/srv-inner-proxy
    ...
  dnsConfig:
    options:
    - name: ndots
      value: "2"
  dnsPolicy: ClusterFirst
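For a Deployment that is already running, the same setting can be applied as a patch instead of editing the manifest. Below is a minimal sketch that builds the patch body; note that in a Deployment the pod spec sits under spec.template.spec. The helper name `ndots_patch` is illustrative.

```python
import json


def ndots_patch(ndots):
    """Build a merge patch that sets the ndots resolver option on a Deployment's pods."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "dnsConfig": {
                        # Option values are strings in the Kubernetes API.
                        "options": [{"name": "ndots", "value": str(ndots)}]
                    }
                }
            }
        }
    }


print(json.dumps(ndots_patch(2)))
```

The printed JSON could then be passed to `kubectl patch deployment <name> --type merge -p '<json>'`, with `<name>` standing in for the actual deployment. A merge patch replaces the whole options list, so any other resolver options already set there would need to be included as well.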

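Kubernetes supports four DNS policies for pods:

None – the pod ignores the DNS settings of the Kubernetes environment; all settings must be supplied through dnsConfig.

Default – the pod inherits the name resolution configuration of the node it runs on, typically the node’s /etc/resolv.conf. Despite its name, this is not the policy used when none is specified.

ClusterFirst – the policy used when none is specified. Queries go to the cluster’s DNS service, which forwards names that do not match the cluster domain suffix to the upstream nameserver.

ClusterFirstWithHostNet – for pods running with hostNetwork: true that should still use the cluster DNS. A host-networked pod set to ClusterFirst falls back to the behavior of Default.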
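The interaction between dnsPolicy and hostNetwork is easy to get wrong, so it can help to spell the rules out. The following sketch models them for a pod spec given as a plain dictionary; it is an illustration of the documented behaviour, not Kubernetes code, and `effective_dns_policy` is a hypothetical helper.

```python
VALID_POLICIES = {"Default", "ClusterFirst", "ClusterFirstWithHostNet", "None"}


def effective_dns_policy(pod_spec):
    """Return the DNS policy a pod effectively gets, given its spec as a dict."""
    policy = pod_spec.get("dnsPolicy", "ClusterFirst")  # ClusterFirst when unset
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown dnsPolicy: {policy!r}")
    if policy == "None":
        # With None, the pod's DNS settings come entirely from dnsConfig.
        nameservers = pod_spec.get("dnsConfig", {}).get("nameservers", [])
        if not nameservers:
            raise ValueError("dnsPolicy None needs dnsConfig.nameservers")
    if policy == "ClusterFirst" and pod_spec.get("hostNetwork", False):
        # Host-networked pods fall back to the node's resolver configuration.
        return "Default"
    return policy


print(effective_dns_policy({"hostNetwork": True}))
print(effective_dns_policy({"hostNetwork": True, "dnsPolicy": "ClusterFirstWithHostNet"}))
```

The first call reports Default and the second ClusterFirstWithHostNet, which is why host-networked pods that need cluster service names must set the policy explicitly.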

4. Conclusion

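By keeping dnsPolicy: ClusterFirst and lowering ndots to 2 in the deployment, the server’s domain name, which has enough dots to meet the new threshold, is tried as an absolute name first. The lookup therefore never reaches the problematic HOST search domain and returns the server’s correct IP address. Writing the server’s name with a trailing dot would have the same effect. The case highlights the importance of understanding Kubernetes DNS internals when troubleshooting connectivity issues.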

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
