Design and Implementation of KDNS: A High‑Performance DPDK‑Based DNS Server
This article presents the design, DPDK‑based implementation, performance optimizations, and test results of KDNS, a DNS server written in C that replaces the Go DNS server in ContainerDNS. KDNS processes queries at close to line rate, supports highly available deployment, and responds with much lower latency than a traditional bind9 setup.
With the full deployment of the TIG Archimedes platform, the Go‑based ContainerDNS service (https://github.com/tiglabs/containerdns) is approaching its performance ceiling of about 170,000 QPS, prompting the need for further system‑level optimizations.
ContainerDNS is a distributed DNS service featuring multi‑data‑center replication, automatic service discovery, health checking, and easy dynamic scaling. Its architecture consists of four loosely coupled components—DNS server, service‑to‑DNS, user API, and IP‑status check—backed by an etcd cluster, allowing independent deployment of each module.
KDNS (DPDK DNS) does not replace ContainerDNS as a whole; it is an evolution of it. Only the DNS server module is rewritten, in C on top of the Data Plane Development Kit (DPDK), while the existing health‑check, API, and Kubernetes monitoring modules are reused. KDNS is deployed on physical machines behind a virtual IP managed by quagga, and it supports multi‑active instances, hot upgrades, and dynamic rollbacks without service interruption.
KDNS assigns each CPU core one of two roles: a master core handles control‑plane tasks (domain updates, and ARP/BGP forwarding through DPDK's Kernel NIC Interface, KNI), while slave cores are dedicated to data‑plane packet I/O. KDNS builds on DPDK's lock‑free queues, RSS (receive‑side scaling), huge pages, and zero‑copy packet buffers, and is organized into modules for packet reception/transmission, protocol parsing, forwarding, domain data handling, and ARP/BGP processing. A RESTful API and a Go‑based DNS‑agent keep domain data synchronized with etcd, and all data structures use hash tables for fast look‑ups.
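The hash‑table look‑up mentioned above can be illustrated with a minimal sketch in plain C. It is not taken from the KDNS source: the names `domain_table`, `domain_entry`, `table_insert`, and `table_lookup`, the bucket count, and the choice of FNV‑1a hashing are all assumptions made for illustration. The only DNS‑specific point it relies on is that domain names compare case‑insensitively.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUCKETS  1024u  /* hypothetical table size; must be a power of two */
#define MAX_NAME 255    /* maximum length of a DNS name (RFC 1035) */

/* One stored record. Hypothetical layout, not the KDNS structure. */
struct domain_entry {
    char name[MAX_NAME + 1];    /* owner name, stored lower-cased */
    uint32_t ipv4;              /* A record address */
    struct domain_entry *next;  /* chain for colliding names */
};

struct domain_table {
    struct domain_entry *bucket[BUCKETS];
};

/* FNV-1a over the lower-cased name, so "Example.COM" and
 * "example.com" land in the same bucket. */
static uint32_t name_hash(const char *name)
{
    uint32_t h = 2166136261u;
    for (; *name; name++) {
        h ^= (uint8_t)tolower((unsigned char)*name);
        h *= 16777619u;
    }
    return h;
}

/* Case-insensitive comparison of two NUL-terminated names. */
static int name_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Insert using caller-provided storage, so no allocation happens here.
 * Returns 0 on success, -1 if the name is too long. */
static int table_insert(struct domain_table *t, struct domain_entry *e,
                        const char *name, uint32_t ipv4)
{
    assert(t != NULL && e != NULL && name != NULL);
    size_t len = strlen(name);
    if (len > MAX_NAME)
        return -1;
    for (size_t i = 0; i <= len; i++)  /* copies the terminating NUL too */
        e->name[i] = (char)tolower((unsigned char)name[i]);
    e->ipv4 = ipv4;
    uint32_t idx = name_hash(e->name) & (BUCKETS - 1);
    e->next = t->bucket[idx];
    t->bucket[idx] = e;
    return 0;
}

/* Look up a query name; returns NULL when the name is not present. */
static const struct domain_entry *table_lookup(const struct domain_table *t,
                                               const char *qname)
{
    assert(t != NULL && qname != NULL);
    uint32_t idx = name_hash(qname) & (BUCKETS - 1);
    for (const struct domain_entry *e = t->bucket[idx]; e; e = e->next)
        if (name_equal(e->name, qname))
            return e;
    return NULL;
}
```

A zero‑initialized `struct domain_table` is an empty table. In a design like the one described, the control plane would fill the table and the data‑plane cores would only call the look‑up; how KDNS actually coordinates updates with readers is not covered by this sketch.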
Performance optimizations include strategic use of rte_prefetch0(), branch prediction hints (likely() / unlikely()), a completely lock‑free data path, and reuse of mbuf memory to eliminate packet copies.
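To make these techniques concrete, here is a small sketch in plain C that does not depend on DPDK. It is an illustration under stated assumptions, not KDNS code: `likely()`/`unlikely()` are defined with the GCC/Clang builtin `__builtin_expect`, `__builtin_prefetch` stands in for `rte_prefetch0()`, and a raw byte buffer stands in for the packet data of an mbuf. The function names `dns_make_reply_inplace` and `process_burst` are hypothetical. The sketch turns a DNS query header into a response header inside the buffer that received it, so nothing is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branch hints as commonly defined on GCC and Clang. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

#define DNS_HDR_LEN 12     /* fixed DNS header size (RFC 1035) */
#define DNS_FLAG_QR 0x80u  /* byte 2: set for a response */
#define DNS_FLAG_AA 0x04u  /* byte 2: authoritative answer */
#define DNS_FLAG_TC 0x02u  /* byte 2: truncated */

/* Rewrite a query header into a response header in place.
 * Returns 0 on success, -1 if the packet should be dropped. */
static int dns_make_reply_inplace(uint8_t *pkt, size_t len)
{
    /* Malformed input is the rare case, so mark it unlikely. */
    if (unlikely(pkt == NULL || len < DNS_HDR_LEN))
        return -1;
    if (unlikely(pkt[2] & DNS_FLAG_QR))  /* already a response */
        return -1;

    /* Bytes 0-1 (the ID) stay untouched, so the reply matches the query. */
    pkt[2] |= DNS_FLAG_QR | DNS_FLAG_AA;
    pkt[2] &= (uint8_t)~DNS_FLAG_TC;
    pkt[3] = 0;  /* RA = 0, RCODE = NOERROR; a simplification */
    return 0;
}

/* Process a burst of packets, prefetching the next one while the
 * current one is handled. Returns the number of replies produced. */
static unsigned process_burst(uint8_t *pkts[], const size_t lens[], unsigned n)
{
    assert(pkts != NULL && lens != NULL);
    unsigned ok = 0;
    for (unsigned i = 0; i < n; i++) {
        if (likely(i + 1 < n))
            __builtin_prefetch(pkts[i + 1], 0, 3);  /* stand-in for rte_prefetch0() */
        if (likely(dns_make_reply_inplace(pkts[i], lens[i]) == 0))
            ok++;
    }
    return ok;
}
```

In a real DPDK application the burst would come from a receive call on the NIC queue and the answer section would be appended to the same buffer before transmission; both steps are omitted here to keep the example self‑contained.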
Tests on an Intel Xeon E5‑2698 v4 with a 10‑GbE NIC compared KDNS against bind9 running on 16 cores. KDNS reached up to 200,000 QPS per instance, with significantly lower minimum and average response times (for example, 226 µs for KDNS versus a worst case of 1140 µs for bind9), and its performance stayed stable across queries for multiple domains, as illustrated in the charts accompanying the original article.
In conclusion, KDNS delivers near line‑rate DNS processing, flexible deployment, comprehensive monitoring, and robust stability, serving as a critical component of JD’s data‑center operating system (JDOS) within the TIG Archimedes ecosystem.
JD Tech
Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.
