Service Registry Guide: Concepts, Features, and Choosing Zookeeper vs Nacos
This article explains what a service registry is, outlines its essential capabilities such as high availability, horizontal scaling, health checking, routing, and multi‑datacenter support, and compares popular open‑source solutions like Zookeeper and Nacos to help you select the right one for your stack.
Concept
What is a service registry? In typical RPC frameworks there are three roles:
provider – the service provider
consumer – the service consumer, i.e. the caller
registry – the service registry, which lets consumers discover providers
A registry must support service registration and deregistration for providers, and query and change‑notification for consumers. It should also address high availability, high performance, horizontal scalability, health checking, routing, and multi‑datacenter support.
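The four core operations can be sketched as a small in-memory class. This is only an illustration of the contract, not how any particular registry is implemented; the class name InMemoryRegistry, its method names, and the (host, port) endpoint shape are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Endpoint = Tuple[str, int]  # (host, port); an assumed shape for this sketch
Listener = Callable[[str, List[Endpoint]], None]


class InMemoryRegistry:
    """Toy registry: service name -> set of provider endpoints."""

    def __init__(self) -> None:
        self._providers: Dict[str, Set[Endpoint]] = defaultdict(set)
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)

    def register(self, service: str, endpoint: Endpoint) -> None:
        # Provider side: announce an endpoint, notify only on a real change.
        if endpoint not in self._providers[service]:
            self._providers[service].add(endpoint)
            self._notify(service)

    def deregister(self, service: str, endpoint: Endpoint) -> None:
        # Provider side: withdraw an endpoint.
        if endpoint in self._providers[service]:
            self._providers[service].remove(endpoint)
            self._notify(service)

    def query(self, service: str) -> List[Endpoint]:
        # Consumer side: read the current provider list.
        return sorted(self._providers.get(service, set()))

    def subscribe(self, service: str, listener: Listener) -> None:
        # Consumer side: ask to be told when the provider list changes.
        self._listeners[service].append(listener)

    def _notify(self, service: str) -> None:
        snapshot = self.query(service)
        for listener in self._listeners[service]:
            listener(service, snapshot)
```

A consumer would call subscribe once and then update its local provider list from each callback, falling back to query on startup.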
Feature Details
Storage
A registry can be seen as a storage system mapping service names to provider endpoints. DNS is a widely used registry. Implementations can even be built on a database.
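To make the "built on a database" point concrete, here is a minimal sketch that stores the name-to-endpoint mapping in SQLite from Python's standard library. The table name providers, its columns, and the function names are hypothetical; a real deployment would also need health checking and change notification on top of this.

```python
import sqlite3
from typing import List, Tuple


def open_registry_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the assumed 'providers' table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS providers (
            service TEXT    NOT NULL,
            host    TEXT    NOT NULL,
            port    INTEGER NOT NULL,
            PRIMARY KEY (service, host, port)
        )
        """
    )
    return conn


def register(conn: sqlite3.Connection, service: str, host: str, port: int) -> None:
    # The primary key makes repeated registration idempotent.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO providers (service, host, port) VALUES (?, ?, ?)",
            (service, host, port),
        )


def deregister(conn: sqlite3.Connection, service: str, host: str, port: int) -> None:
    with conn:
        conn.execute(
            "DELETE FROM providers WHERE service = ? AND host = ? AND port = ?",
            (service, host, port),
        )


def lookup(conn: sqlite3.Connection, service: str) -> List[Tuple[str, int]]:
    rows = conn.execute(
        "SELECT host, port FROM providers WHERE service = ? ORDER BY host, port",
        (service,),
    )
    return [(host, port) for host, port in rows]
```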
High Availability
Understanding the CAP theorem for distributed systems is essential here. Registries require especially high availability: they must be deployed as a cluster with no single point of failure, and even if the whole cluster goes down, existing service calls should keep working (consumers typically keep using the last provider list they received), although routing changes will not propagate until the registry recovers.
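One common way to keep calls working while the registry is down is a consumer-side cache. The sketch below assumes a hypothetical fetch function that asks the registry for a provider list and raises OSError (for example ConnectionError) when the registry is unreachable; the class name is also made up for illustration.

```python
from typing import Callable, Dict, List


class CachingDiscoveryClient:
    """Consumer-side wrapper that remembers the last successful lookup."""

    def __init__(self, fetch: Callable[[str], List[str]]) -> None:
        self._fetch = fetch                    # assumed: talks to the registry
        self._cache: Dict[str, List[str]] = {}

    def providers(self, service: str) -> List[str]:
        try:
            fresh = self._fetch(service)
        except OSError:
            # Registry unreachable: serve the possibly stale cached list
            # so that existing calls are not interrupted.
            return list(self._cache.get(service, []))
        self._cache[service] = list(fresh)
        return list(fresh)
```

The trade-off is the one described above: calls keep flowing, but the list may be out of date until the registry is reachable again.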
In CAP terms, a registry ideally follows the AP model. Two scenarios illustrate this:
If a registry node fails, an AP system continues operating, whereas a CP system (e.g., one built on ZAB or Raft) may be unavailable while it elects a new leader, if the failed node was the leader.
In a network partition (split‑brain), the side of a CP cluster that cannot reach a majority of nodes becomes unusable, while an AP cluster sacrifices consistency so that nodes on both sides stay usable.
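The majority rule behind the second scenario is simple arithmetic. The helper names below are made up for illustration; the rule itself (a strict majority of the configured cluster size) is what majority-based protocols such as ZAB and Raft rely on.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def cp_partition_available(cluster_size: int, reachable: int) -> bool:
    """A CP partition can make progress only if it still holds a majority."""
    return reachable >= quorum(cluster_size)


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can be lost before the cluster stops accepting writes."""
    return cluster_size - quorum(cluster_size)
```

For a 5-node cluster split 3/2, the 3-node side keeps working and the 2-node side does not. A 4-node cluster tolerates only one failure, the same as a 3-node cluster, which is why odd cluster sizes are the usual choice.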
Horizontal Scaling
Horizontal scaling means adding machines to handle increased load. AP systems are generally easier to scale than CP systems, where writes are limited to a leader node.
Health Checking
Service registration can be at the service level or the application level. Service‑level registration maps interfaces to IP/port, while application‑level registration maps applications to IP/port. Health checking methods differ:
Service‑level health checking often requires the RPC framework to expose an endpoint or rely on client‑reported heartbeats.
Application‑level health checking can simply verify that a port is open.
Client‑reported heartbeats give clear liveness information, but server‑initiated checks may miss failures when a port remains open but the service is dead. Nacos provides extensible health‑checking mechanisms.
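The two styles can be contrasted in a short sketch. HeartbeatTable models client-reported heartbeats with a time-to-live, and port_is_open models a server-initiated TCP probe. Both names, the TTL value, and the injectable clock are assumptions for the example and do not reflect any specific registry's implementation.

```python
import socket
import time
from typing import Callable, Dict, List


class HeartbeatTable:
    """Client-reported liveness: an instance is healthy while its last
    heartbeat is no older than ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the logic can be tested
        self._last_seen: Dict[str, float] = {}

    def beat(self, instance_id: str) -> None:
        self._last_seen[instance_id] = self._clock()

    def healthy(self) -> List[str]:
        now = self._clock()
        return sorted(i for i, seen in self._last_seen.items()
                      if now - seen <= self._ttl)


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Server-initiated probe: only proves that something accepts TCP
    connections on the port, not that the service behind it works."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The probe illustrates the weakness noted above: a hung process can keep its port open and still pass, whereas a process that stops sending heartbeats drops out after the TTL.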
Routing
Routing is optional but powerful. It enables scenarios such as separating traffic between environments (development, testing, pre‑release, production) without deploying multiple registries, and supports multi‑datacenter proximity routing and other advanced use cases.
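A minimal sketch of both ideas, environment isolation and proximity routing, is a filter over the provider list. The Instance fields env and datacenter are hypothetical metadata tags; real registries expose such metadata in their own ways.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instance:
    address: str
    env: str          # e.g. "dev", "test", "pre", "prod" (assumed tag)
    datacenter: str   # e.g. "dc-a", "dc-b" (assumed tag)


def route(instances: List[Instance], env: str, datacenter: str) -> List[Instance]:
    """Keep only providers in the caller's environment, preferring the
    caller's datacenter and falling back to any datacenter if none is local."""
    same_env = [i for i in instances if i.env == env]
    nearby = [i for i in same_env if i.datacenter == datacenter]
    return nearby or same_env
```

With this in place a single registry can hold providers from every environment, because a consumer only ever sees the ones that match its own tags.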
Multi‑Datacenter Support
For multi‑datacenter deployments, CP consistency can simplify coordination, while routing can handle proximity (same‑datacenter) calls. A single registry shared by every datacenter means cross‑datacenter registry traffic, which adds latency; running a registry per datacenter avoids that latency but requires a mechanism to synchronize data between them.
How to Choose a Registry
When evaluating open‑source solutions, consider compatibility with your technology stack (e.g., Dubbo support), performance and scalability, routing capabilities, health‑checking features, and multi‑datacenter support.
Common options include Zookeeper, Nacos, Eureka, and Consul, as well as in‑house solutions such as Ant Group’s SOFARegistry, Ele.me’s Huskar, Meituan’s MNS, and Youzan’s Haunt.
Zookeeper
Zookeeper is the most widely used registry in Dubbo, but it was not designed for service discovery. It implements the ZAB protocol, which provides strong consistency (CP), and clients talk to it over its own TCP‑based protocol. Writes go through the leader only, which makes horizontal scaling difficult, and Zookeeper has no built‑in routing or multi‑datacenter features.
Data is organized like a file‑system tree with ephemeral (temporary) and persistent nodes; an ephemeral node is removed when the client session that created it ends.
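The node semantics can be illustrated with a toy tree. This is a pure-Python simulation, not the Zookeeper client API; the class ToyZNodeTree and the session ids are invented for the example. The path layout /dubbo/<interface>/providers/<provider> follows Dubbo's convention, with the provider node name simplified to host:port (Dubbo stores a URL-encoded provider URL there).

```python
from typing import Dict, List, Optional


class ToyZNodeTree:
    """Simulates persistent vs. ephemeral nodes tied to client sessions."""

    def __init__(self) -> None:
        # path -> owning session id (None means the node is persistent)
        self._nodes: Dict[str, Optional[str]] = {"/": None}

    def create(self, path: str, session: Optional[str] = None) -> None:
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError(f"parent {parent} does not exist")
        if self._nodes[parent] is not None:
            # Mirrors Zookeeper's rule that ephemeral nodes have no children.
            raise ValueError("ephemeral nodes cannot have children")
        self._nodes[path] = session

    def children(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):] for p in self._nodes
            if len(p) > len(prefix) and p.startswith(prefix)
            and "/" not in p[len(prefix):]
        )

    def close_session(self, session: str) -> None:
        # Session end: every ephemeral node the session owns disappears.
        for path in [p for p, owner in self._nodes.items() if owner == session]:
            del self._nodes[path]
```

A provider that crashes stops renewing its session, its node vanishes, and consumers listing the providers path no longer see it. That is the mechanism that makes the tree usable as a registry even though it was not designed as one.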
One‑sentence summary: the primary choice for Dubbo in small‑to‑medium enterprises.
Nacos
Nacos, an open‑source registry and configuration center from Alibaba, uses its own AP‑oriented Distro protocol. It can be accessed over HTTP or DNS; at the time of writing, gRPC and long‑connection support were still planned (both later arrived in Nacos 2.x). It scales horizontally, though its performance at that time was only moderate.
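As an illustration of the HTTP access path, the sketch below builds (but does not send) requests against the Nacos v1 Open API endpoints for registering an instance and listing instances. The server address 127.0.0.1:8848, the function names, and the example service name are assumptions; the paths and parameters should be checked against the documentation for the Nacos version in use.

```python
from urllib.parse import urlencode
from urllib.request import Request

NACOS_BASE = "http://127.0.0.1:8848/nacos"  # assumed local server address


def register_request(service: str, ip: str, port: int,
                     base: str = NACOS_BASE) -> Request:
    """Build the POST that registers one instance (v1 Open API)."""
    query = urlencode({"serviceName": service, "ip": ip, "port": port})
    return Request(f"{base}/v1/ns/instance?{query}", method="POST")


def list_request(service: str, base: str = NACOS_BASE) -> Request:
    """Build the GET that lists the instances of a service (v1 Open API)."""
    query = urlencode({"serviceName": service})
    return Request(f"{base}/v1/ns/instance/list?{query}", method="GET")
```

Passing either request to urllib.request.urlopen would send it to a running server.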
Multi‑datacenter support is achievable through its CMDB module, and it offers weight‑based load balancing, protection thresholds, and other strategies.
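The general idea behind weight-based selection and a protection threshold can be sketched as follows. This is not Nacos source code: the names WeightedInstance, candidates, and pick are hypothetical, and the exact comparison used for the threshold is an assumption. The intent is that when too few instances are healthy, the full list is returned so that the remaining healthy ones are not overwhelmed.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WeightedInstance:
    address: str
    weight: float = 1.0
    healthy: bool = True


def candidates(instances: List[WeightedInstance],
               protect_threshold: float) -> List[WeightedInstance]:
    """Return healthy instances, or all of them when the healthy ratio
    falls below the threshold (assumed strict comparison)."""
    if not instances:
        return []
    healthy = [i for i in instances if i.healthy]
    if len(healthy) / len(instances) < protect_threshold:
        return list(instances)
    return healthy


def pick(instances: List[WeightedInstance], protect_threshold: float = 0.0,
         rng: Optional[random.Random] = None) -> Optional[WeightedInstance]:
    """Weighted random choice; a weight of 0 takes an instance out of rotation."""
    pool = [i for i in candidates(instances, protect_threshold) if i.weight > 0]
    if not pool:
        return None
    rng = rng or random.Random()
    return rng.choices(pool, weights=[i.weight for i in pool], k=1)[0]
```

Setting an instance's weight to 0 is a convenient way to drain traffic from it before a deployment without deregistering it.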
The community provides the NacosSync tool for migrating services from other registries to Nacos.
One‑sentence summary: richer features than Zookeeper, but newer with less proven stability and performance.
References
“Why Alibaba Does Not Use Zookeeper for Service Discovery” – http://jm.taobao.org/2018/06/13/做服务发现?/
Xiao Lou's Tech Notes
Backend technology sharing, architecture design, performance optimization, source code reading, troubleshooting, and pitfall practices
