How Nacos’s Distro Protocol Ensures High Availability with AP Consistency
This article explains Nacos’s Distro consistency protocol, detailing its design principles, asynchronous replication, periodic synchronization, new‑node data loading, and local read mechanisms, and shows how these mechanisms together provide high‑availability AP consistency for service registration in a distributed cluster.
Introduction
Today we explore Nacos’s Distro consistency protocol, focusing on its architecture and how it achieves high availability through AP consistency.
When a service instance registers, the current Nacos node starts a 1‑second delayed task to synchronize data to other Nacos nodes (partition consistency). This is the core function of the self‑developed Distro protocol.
1. Distro Design Philosophy and Six Mechanisms
The Distro protocol is a refinement of ideas from Gossip and Eureka: it reduces message redundancy by making each node responsible for synchronizing only its own portion of the data. It rests on six mechanisms:
Equality Mechanism: All Nacos nodes are peers and can handle write requests.
Asynchronous Replication Mechanism: Changed data is asynchronously replicated to other nodes (the key focus of this article).
Health Check Mechanism: Nodes periodically verify client status to maintain data consistency.
Local Read Mechanism: Each node serves read requests from its local data.
New Node Sync Mechanism: A newly started node pulls full data from existing nodes.
Routing Forward Mechanism: A write request is either processed locally or forwarded to the node responsible for that data.
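The routing-forward idea can be sketched as follows. This is an illustrative model, not Nacos's actual implementation: the class and method names (DistroRouting, responsibleFor) are hypothetical, and the real Distro code hashes the service key against a consistently ordered member list.

```java
import java.util.List;

// Hypothetical sketch of routing forward: every node sees the same ordered
// member list, so hashing a service key to an index picks the same
// "responsible" node everywhere.
public class DistroRouting {
    private final List<String> members; // all cluster node addresses, same order on every node
    private final String self;          // this node's own address

    public DistroRouting(List<String> members, String self) {
        this.members = members;
        this.self = self;
    }

    /** The node responsible for a given service key. */
    public String responsibleFor(String serviceKey) {
        // floorMod keeps the index non-negative even for negative hash codes
        int index = Math.floorMod(serviceKey.hashCode(), members.size());
        return members.get(index);
    }

    /** A write is handled locally only if this node owns the key; otherwise it is forwarded. */
    public boolean shouldHandleLocally(String serviceKey) {
        return self.equals(responsibleFor(serviceKey));
    }
}
```

Because every node computes the same owner for a key, exactly one node in the cluster handles each write locally; the rest forward.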
2. Asynchronous Replication: Syncing Data to Other Nodes
2.1 Core Entry
Source code path:
/naming/consistency/ephemeral/distro/DistroConsistencyServiceImpl.java

The main entry method for registration is put(), which performs three actions:
Store instance info in an in‑memory ConcurrentHashMap.
Add a task to a BlockingQueue to push the latest instance list to clients via UDP.
Start a 1‑second delayed task to synchronize data to other Nacos nodes.
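The three steps above can be sketched like this. The field and method names (dataStore, notifier, replicateToPeers) are illustrative, not Nacos's actual API; the real put() delegates to dedicated engine classes.

```java
import java.util.List;
import java.util.concurrent.*;

// Hedged sketch of the three actions the article attributes to put():
// store locally, queue a client notification, schedule a delayed peer sync.
public class DistroPutSketch {
    // 1. In-memory registry of instance lists per service key.
    private final ConcurrentHashMap<String, List<String>> dataStore = new ConcurrentHashMap<>();
    // 2. Queue consumed by a notifier thread that pushes updates to clients (UDP in v1).
    private final BlockingQueue<String> notifier = new LinkedBlockingQueue<>();
    // 3. Executor for the 1-second delayed replication task.
    private final ScheduledExecutorService syncExecutor = Executors.newSingleThreadScheduledExecutor();

    public void put(String serviceKey, List<String> instances) {
        dataStore.put(serviceKey, instances);                 // store instance info in memory
        notifier.offer(serviceKey);                           // trigger client push
        syncExecutor.schedule(() -> replicateToPeers(serviceKey), 1, TimeUnit.SECONDS);
    }

    public boolean contains(String serviceKey) {
        return dataStore.containsKey(serviceKey);
    }

    private void replicateToPeers(String serviceKey) {
        // In real Nacos this sends the serialized datum to every other member.
    }
}
```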
2.2 sync Method Parameters
sync() receives a DistroKey (containing the service-name key and a constant), a data type (CHANGE), and a delay time (1 second).
2.3 Adding Tasks to the Map
The sync logic iterates over other nodes, checks if a task already exists in a map, merges or adds it, and a background thread processes these tasks.
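The "merge or add" step can be sketched as below. This is only a model of the idea: the real Nacos task engine uses its own key and task classes, so every name here (DelayTaskMap, SyncTask, addOrMerge) is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: one pending sync task per (target node, service key);
// repeated changes within the delay window merge into a single task.
public class DelayTaskMap {
    private final ConcurrentHashMap<String, SyncTask> tasks = new ConcurrentHashMap<>();

    static class SyncTask {
        final String targetNode;
        final String serviceKey;
        int mergedCount = 1;  // how many changes this task now covers
        long readyAtMillis;   // earliest processing time (the 1-second delay)

        SyncTask(String targetNode, String serviceKey, long readyAtMillis) {
            this.targetNode = targetNode;
            this.serviceKey = serviceKey;
            this.readyAtMillis = readyAtMillis;
        }
    }

    /** Called once per peer node whenever a service changes. */
    public void addOrMerge(String targetNode, String serviceKey, long delayMillis) {
        long readyAt = System.currentTimeMillis() + delayMillis;
        tasks.compute(targetNode + "@" + serviceKey, (k, existing) -> {
            if (existing == null) {
                return new SyncTask(targetNode, serviceKey, readyAt);
            }
            existing.mergedCount++;          // collapse repeated changes into one sync
            existing.readyAtMillis = readyAt;
            return existing;
        });
    }

    public int pendingCount() {
        return tasks.size();
    }
}
```

Merging is what keeps a hot service from flooding peers: many registrations inside one delay window still produce a single replication request per target node.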
2.4 Background Thread Asynchronous Replication
The background thread repeatedly extracts tasks from the map, queues them, and workers send HTTP requests to other nodes with serialized instance data. Example request URL:
http://192.168.0.101:8858/nacos/v1/ns/distro/datum

3. Periodic Synchronization: Maintaining Data Consistency
3.1 Need for Periodic Sync
In cluster mode, each Nacos node holds all client information, enabling any node to serve full registration data, which improves availability.
3.2 Metadata Check (v1)
Nodes periodically send a checksum request to peers; if mismatched, a full data pull is triggered.
http://<other-node-ip>:<port>/nacos/v1/ns/distro/checksum?source=<local-ip>:<port>

3.3 Version Evolution
From v2 onward, the periodic check is replaced by the health‑check mechanism (covered in a future article).
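The v1 metadata check from section 3.2 boils down to comparing digests and pulling full data on mismatch. A minimal sketch, with the caveat that the digest algorithm and class names here are illustrative (only the checksum endpoint path comes from the article):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of the v1 periodic check: digest the local datum, compare it with the
// checksum a peer reports via /nacos/v1/ns/distro/checksum, and flag a full
// pull when they differ.
public class ChecksumCheck {
    /** Hex MD5 digest of the serialized instance list for one service. */
    public static String checksum(String serializedInstances) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(serializedInstances.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available on the JDK
        }
    }

    /** True when the peer's reported checksum differs, meaning a full data pull is needed. */
    public static boolean needsFullPull(String localData, String remoteChecksum) {
        return !checksum(localData).equals(remoteChecksum);
    }
}
```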
4. New Node Sync Mechanism
When a new Distro node joins, it polls all existing nodes and pulls the full snapshot of non‑persistent instance data.
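The startup load can be sketched as a loop that asks peers one by one and stops at the first successful snapshot. The fetch function below stands in for the real snapshot request; InitialDataLoader is a hypothetical name.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Sketch of new-node sync: try each existing member in turn for a full
// snapshot of ephemeral (non-persistent) instance data.
public class InitialDataLoader {
    /** Returns the first non-null snapshot from any peer, or null if every peer fails. */
    public static Map<String, String> loadSnapshot(
            List<String> peers,
            Function<String, Map<String, String>> fetchFromPeer) {
        for (String peer : peers) {
            try {
                Map<String, String> snapshot = fetchFromPeer.apply(peer);
                if (snapshot != null) {
                    return snapshot; // apply the full ephemeral data set locally
                }
            } catch (RuntimeException e) {
                // peer unreachable: fall through and try the next one
            }
        }
        return null;
    }
}
```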
5. Local Read Mechanism
Each node serves read requests directly from its local in-memory cache, so responses are fast and never require a remote fetch. Even during a network partition (split brain), every node keeps returning the data it holds; once connectivity recovers, consistency is restored through the health-check and metadata-verification mechanisms.
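The local-read path is the simplest of the six mechanisms; a sketch (class and method names are illustrative):

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of local reads: queries never leave the node; they hit the in-memory
// registry that registration and replication keep up to date.
public class LocalReadRegistry {
    private final ConcurrentHashMap<String, List<String>> registry = new ConcurrentHashMap<>();

    /** Called by the registration/replication path. */
    public void update(String serviceKey, List<String> instances) {
        registry.put(serviceKey, instances);
    }

    /** Read path: purely local, so it keeps working during a partition. */
    public List<String> query(String serviceKey) {
        return registry.getOrDefault(serviceKey, List.of());
    }
}
```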
Conclusion
The Distro protocol combines six mechanisms—equality, asynchronous replication, health checks, local reads, new‑node sync, and routing—to provide AP‑style high availability for Nacos’s service registration, ensuring that the cluster remains functional even under node failures.
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!
