How Nacos Implements Service Registration: From Ephemeral Instances to CP/AP Consistency

This article deep‑dives into Nacos as a service registry, explaining the differences between temporary and permanent instances, registration mechanisms across 1.x and 2.x versions, heartbeat and health‑check strategies, service discovery methods, data‑consistency models, and the underlying data model that powers Nacos clusters.

Su San Talks Tech

Jan 3, 2024

Temporary and Permanent Instances

In Nacos, a temporary instance is stored only in the server’s in‑memory registry and is removed when the instance goes offline, while a permanent instance is also persisted to disk and remains visible (marked unhealthy) after a failure.

The two types also differ in how they are kept alive, how their health is checked, and how their data is kept consistent across the cluster; the sections below cover each of these.

Temporary instances suit typical business services; permanent instances are used for infrastructure services such as MySQL or Redis that require continuous visibility.

spring:
  cloud:
    nacos:
      discovery:
        # set false to make the instance permanent
        ephemeral: false

In version 1.x a service can contain both temporary and permanent instances, decided per instance. In version 2.x the whole service is either temporary or permanent, decided by the service definition.
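To make the behavioral difference concrete, here is a minimal sketch of a toy registry. It is not Nacos code: the `ToyRegistry` and `Instance` names and the `on_failure` hook are invented for illustration, and disk persistence of permanent instances is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    ip: str
    port: int
    ephemeral: bool = True   # True = temporary, False = permanent
    healthy: bool = True


@dataclass
class ToyRegistry:
    """Hypothetical in-memory registry keyed by (ip, port)."""
    instances: dict = field(default_factory=dict)

    def register(self, inst: Instance) -> None:
        self.instances[(inst.ip, inst.port)] = inst

    def on_failure(self, ip: str, port: int) -> None:
        inst = self.instances.get((ip, port))
        if inst is None:
            return
        if inst.ephemeral:
            # Temporary instance: dropped from the registry entirely.
            del self.instances[(ip, port)]
        else:
            # Permanent instance: stays visible, flagged unhealthy.
            inst.healthy = False


registry = ToyRegistry()
registry.register(Instance("10.0.0.1", 8080, ephemeral=True))
registry.register(Instance("10.0.0.2", 3306, ephemeral=False))
registry.on_failure("10.0.0.1", 8080)
registry.on_failure("10.0.0.2", 3306)
```

After both failures, only the permanent instance is still listed, and it is flagged unhealthy.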

Service Registration

Registration sends instance metadata (IP, port, etc.) to the server.

1.x Implementation

In 1.x the client registers by calling the server's HTTP API; the server then stores the instance in its in‑memory registry.
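As a rough illustration, a 1.x-style registration is a single HTTP POST. The sketch below only builds the request and does not send it. The path `/nacos/v1/ns/instance` and the parameter names follow the Nacos 1.x Open API as I understand it; the server address, service name, and the `build_register_request` helper are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_register_request(server, service_name, ip, port, ephemeral=True):
    """Build (but do not send) a registration request for one instance."""
    params = urlencode({
        "serviceName": service_name,
        "ip": ip,
        "port": port,
        "ephemeral": str(ephemeral).lower(),  # "true" or "false"
    })
    return Request(f"http://{server}/nacos/v1/ns/instance?{params}", method="POST")


# Placeholder address and service name, for illustration only.
req = build_register_request("127.0.0.1:8848", "order-service", "10.0.0.1", 8080)
```

Passing `req` to `urllib.request.urlopen` would perform the actual call against a running server.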

2.x Implementation

Communication Protocol Change

Nacos 2.x switches from HTTP to gRPC. The client establishes a long‑lived connection and reuses it for registration and other operations, which is reported to improve performance by at least 2×.

gRPC is a high‑performance open‑source RPC framework built on HTTP/2; its Java implementation commonly uses Netty as the network transport.

Specific Implementation

The client opens a gRPC long connection at startup, sends registration data over it, and the server stores the instance in the registry just like in 1.x.

For temporary instances, the client also caches the instance locally for redo operations, which periodically retry registration after reconnection.

Redo also applies to subscription updates after a reconnection.
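The redo idea can be sketched as a client-side cache that is replayed once the connection comes back. This is a simplified model, not the Nacos client: `RedoClient` and its methods are hypothetical, and the replay happens immediately on reconnect rather than from a periodic task.

```python
class RedoClient:
    """Hypothetical client that remembers registrations so it can replay them."""

    def __init__(self, send):
        self._send = send        # callable that transmits one registration
        self._redo_cache = {}    # service key -> instance data kept for replay
        self.connected = True

    def register(self, key, instance):
        # Cache first, so the registration survives a dropped connection.
        self._redo_cache[key] = instance
        if self.connected:
            self._send(key, instance)

    def on_disconnect(self):
        self.connected = False

    def on_reconnect(self):
        self.connected = True
        # Replay every cached registration over the new connection.
        for key, instance in self._redo_cache.items():
            self._send(key, instance)


sent = []
client = RedoClient(lambda key, instance: sent.append((key, instance)))
client.register("order-service", {"ip": "10.0.0.1", "port": 8080})
client.on_disconnect()
client.on_reconnect()
```

Here `sent` ends up holding the same registration twice: once from the original call and once from the replay.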

Heartbeat Mechanism

Only temporary instances rely on heartbeats to stay alive.

1.x Heartbeat

The client runs a 5‑second timer that sends HTTP heartbeat requests. The server runs its own 5‑second timer that checks each instance's last heartbeat time:

If the last heartbeat is more than 15 s old but not more than 30 s old, the instance is marked unhealthy.

If it is more than 30 s old, the instance is removed.
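The server-side check reduces to comparing the silence interval against the two thresholds. A minimal sketch, with the function and constant names invented for illustration:

```python
UNHEALTHY_AFTER_S = 15   # thresholds taken from the text above
REMOVE_AFTER_S = 30


def check_instance(last_beat_s, now_s):
    """Classify an instance by how long ago its last heartbeat arrived."""
    silence = now_s - last_beat_s
    if silence > REMOVE_AFTER_S:
        return "remove"
    if silence > UNHEALTHY_AFTER_S:
        return "unhealthy"
    return "healthy"
```

With a 5-second heartbeat, an instance has to miss roughly three beats before it is marked unhealthy and six before it is removed.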

2.x Heartbeat

Nacos 2.x relies on the gRPC connection's built‑in keep‑alive instead of per‑instance heartbeats: when the connection drops, the server removes the instances registered over it.

In addition, the server runs a 3‑second timer that looks for connections that have been silent for more than 20 s and proactively closes them, which removes their instances as well.
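The idle sweep can be sketched as a pass over the known connections. This is a simplified model under stated assumptions: the dictionary layout and the `sweep_idle_connections` name are hypothetical, and a real server would close the underlying channel rather than just delete an entry.

```python
IDLE_LIMIT_S = 20   # silence threshold from the text above


def sweep_idle_connections(connections, now_s):
    """Close connections silent for longer than IDLE_LIMIT_S.

    connections maps a connection id to a dict with 'last_active' (seconds)
    and 'instances' (what was registered over that connection).
    Returns the ids of the connections that were closed.
    """
    closed = []
    for conn_id, conn in list(connections.items()):
        if now_s - conn["last_active"] > IDLE_LIMIT_S:
            closed.append(conn_id)
            # Dropping the connection drops its instances with it.
            del connections[conn_id]
    return closed
```

Running this every 3 seconds means an idle connection is closed somewhere between 20 and 23 seconds after its last activity.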

Health Check

Permanent instances do not send heartbeats, so the server actively probes them using TCP, HTTP, or MySQL checks (default is TCP). A successful probe marks the instance healthy; a failed probe marks it unhealthy, and the instance stays in the registry.
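A TCP probe amounts to checking whether a connection to the instance's port can be opened. The sketch below shows the idea with Python's standard library; it is not how the Nacos server implements the check, and the `tcp_probe` name and the one-second timeout are assumptions.

```python
import socket


def tcp_probe(host, port, timeout_s=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the instance as unhealthy.
        return False
```

An HTTP or MySQL check follows the same pattern but also validates a response from the service, so it catches processes that accept connections without being able to serve requests.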

Service Discovery

Clients can discover services via active queries (pull) or subscriptions (push).

Active query: client requests instance list.

Subscription: server pushes updates when instances change.

1.x Subscription

The client creates a UDP socket and registers its port with the server, which pushes instance changes to that port. The client also caches instance data locally and runs a periodic task (default 10 s, up to 60 s) that re‑queries the server in case UDP packets are lost.

2.x Subscription

Uses the gRPC long connection to push updates; the client still caches instances and can optionally enable a periodic comparison task (disabled by default).
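Both versions share the same client-side shape: a local cache that is updated by pushes and can be reconciled by a periodic pull. A minimal sketch of that shape, with `InstanceCache` and its methods invented for illustration:

```python
class InstanceCache:
    """Hypothetical client-side cache fed by pushes, with a pull fallback."""

    def __init__(self, query):
        self._query = query   # callable: service name -> current instance list
        self._cache = {}

    def on_push(self, service, instances):
        # Push path: the server notifies the client of a change.
        self._cache[service] = list(instances)

    def refresh(self, service):
        # Pull path: re-query and report whether the cache was stale.
        latest = self._query(service)
        changed = latest != self._cache.get(service)
        self._cache[service] = list(latest)
        return changed

    def get(self, service):
        return self._cache.get(service, [])
```

In 1.x, `refresh` would run on a timer because UDP pushes can be lost; in 2.x the equivalent comparison task is optional because the gRPC connection delivers pushes reliably.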

Data Consistency

Nacos clusters use a responsibility mechanism where each node manages a subset of instances for heartbeat and health‑check duties, while still holding the full registry.
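One common way to assign responsibility is to hash a stable key and take it modulo the node count. The sketch below uses that approach as an assumption; the CRC32 hash, the choice of key, and the `responsible_node` name are illustrative and may differ from what Nacos actually does.

```python
import zlib


def responsible_node(instance_key, nodes):
    """Pick the node responsible for a key; every node computes the same answer."""
    ordered = sorted(nodes)   # a shared ordering, so all nodes agree
    return ordered[zlib.crc32(instance_key.encode()) % len(ordered)]
```

Because the mapping depends only on the key and the member list, each node can decide locally whether it owns an instance without coordinating with its peers.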

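The trade‑off between consistency and availability follows the CAP and BASE principles. Nacos supports both an AP mode, built on Alibaba's Distro protocol, and a CP mode, built on Raft (JRaft in 2.x). Temporary instances are handled by the AP path and permanent instances by the CP path.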

AP Implementation

All nodes are peers; writes are replicated asynchronously with retry and periodic comparison mechanisms to achieve eventual consistency.
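The following sketch models the AP idea of accepting a write locally and replicating it in the background with retry. It is not the Distro protocol itself: the `Peer` and `ApNode` classes are hypothetical, and the periodic full-data comparison is omitted.

```python
from collections import deque


class Peer:
    """Hypothetical remote node that may be temporarily unreachable."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.online = True

    def apply(self, key, value):
        if not self.online:
            raise ConnectionError(self.name)
        self.data[key] = value


class ApNode:
    def __init__(self, peers):
        self.data = {}
        self.peers = peers
        self.pending = deque()   # replication tasks not yet delivered

    def write(self, key, value):
        self.data[key] = value   # the write succeeds locally right away
        for peer in self.peers:
            self.pending.append((peer, key, value))

    def flush(self):
        """Attempt every pending replication; keep failures for the next round."""
        retry = deque()
        while self.pending:
            peer, key, value = self.pending.popleft()
            try:
                peer.apply(key, value)
            except ConnectionError:
                retry.append((peer, key, value))
        self.pending = retry
```

A write is visible on the accepting node immediately and on the others only after a successful `flush`, which is the eventual-consistency window that AP mode accepts.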

CP Implementation

Based on Raft: a leader handles writes, replicates to followers, and commits only after a majority acknowledge. Nacos 2.x uses the JRaft framework.
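The majority rule can be written as a one-line check. In this sketch the `is_committed` name is invented, and `ack_count` is assumed to include the leader's own copy of the log entry.

```python
def is_committed(ack_count, cluster_size):
    """A log entry is committed once a strict majority of nodes hold it."""
    return ack_count >= cluster_size // 2 + 1
```

In a three-node cluster the leader needs one follower acknowledgement; in a five-node cluster it needs two. The cluster keeps accepting writes as long as a majority of nodes is reachable.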

Data Model

A service is uniquely identified by three fields:

Namespace (default public)

Group (default DEFAULT_GROUP)

ServiceName

Instances also belong to a cluster (default DEFAULT), configurable via spring.cloud.nacos.discovery.cluster-name.

spring:
  cloud:
    nacos:
      discovery:
        cluster-name: sanyoujavaCluster
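The data model described above can be summarized with two small value types. This is an illustrative sketch rather than the Nacos classes; `ServiceKey` and `ServiceInstance` are hypothetical names, and the defaults are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceKey:
    """The three fields that uniquely identify a service."""
    service_name: str
    namespace: str = "public"
    group: str = "DEFAULT_GROUP"


@dataclass(frozen=True)
class ServiceInstance:
    """One instance of a service, placed in a cluster."""
    service: ServiceKey
    ip: str
    port: int
    cluster: str = "DEFAULT"


default_key = ServiceKey("order-service")
staging_key = ServiceKey("order-service", namespace="staging")
inst = ServiceInstance(default_key, "10.0.0.1", 8080, cluster="sanyoujavaCluster")
```

Two services with the same name in different namespaces or groups are distinct, which is why `default_key` and `staging_key` compare as unequal.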

Conclusion

The article walks through Nacos’s core registration, heartbeat, health‑check, discovery, consistency, and data‑model mechanisms across 1.x and 2.x, providing a comprehensive understanding for developers building micro‑service architectures.

