How Nacos Implements Powerful Service Registration and Discovery
This article provides a detailed, step‑by‑step analysis of Nacos’s internal architecture and code, explaining how it manages service registration, health monitoring, dynamic configuration, and service discovery within Spring Cloud micro‑service environments.
1. Introduction to Nacos
Micro‑service architectures require a centralized registry so that consumers can locate provider instances without manually maintaining address lists. Nacos combines a service registry and a configuration center to address these needs, offering DNS/HTTP‑based discovery, health checks, dynamic configuration, weighted routing, and metadata management.
2. Nacos Architecture
The system consists of the following modules:
Provider APP: the service provider.
Consumer APP: the service consumer.
Name Server: routes requests using a virtual IP or DNS to achieve high availability.
Nacos Server: exposes the OpenAPI, Config Service, and Naming Service.
Consistency Protocol: the Raft algorithm, used to synchronize data across cluster nodes.
Nacos Console: the web UI for management.
3. Core Principles of the Registry
When a service instance starts, it registers itself; when it shuts down, it deregisters. Consumers query the registry to obtain healthy instances, and the registry performs health‑check calls to verify availability.
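The register, deregister, and query-healthy cycle described above can be sketched with a small in-memory registry. This is only an illustration: MiniRegistry, its method names, and the 15-second timeout are assumptions made for this sketch and are not taken from the Nacos server code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry; not the Nacos server implementation.
class MiniRegistry {
    // Assumed for this sketch: an instance counts as unhealthy after 15 s without a heartbeat.
    static final long HEALTHY_TIMEOUT_MS = 15_000L;

    // service name -> ("ip:port" -> timestamp of the last heartbeat in ms)
    private final Map<String, Map<String, Long>> services = new ConcurrentHashMap<>();

    // Called when an instance starts; registration counts as the first heartbeat.
    void register(String service, String address, long nowMs) {
        services.computeIfAbsent(service, k -> new ConcurrentHashMap<>()).put(address, nowMs);
    }

    // Called when an instance shuts down gracefully.
    void deregister(String service, String address) {
        Map<String, Long> instances = services.get(service);
        if (instances != null) {
            instances.remove(address);
        }
    }

    // Refreshes the heartbeat timestamp of a known instance; unknown instances are ignored.
    void heartbeat(String service, String address, long nowMs) {
        Map<String, Long> instances = services.get(service);
        if (instances != null) {
            instances.computeIfPresent(address, (k, v) -> nowMs);
        }
    }

    // Returns only instances whose last heartbeat is recent enough, sorted for stable output.
    List<String> selectHealthy(String service, long nowMs) {
        List<String> healthy = new ArrayList<>();
        for (Map.Entry<String, Long> e : services.getOrDefault(service, Map.of()).entrySet()) {
            if (nowMs - e.getValue() <= HEALTHY_TIMEOUT_MS) {
                healthy.add(e.getKey());
            }
        }
        Collections.sort(healthy);
        return healthy;
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic. A real registry would read a clock and would typically also evict instances that stay silent for long enough.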
3.1 Registration Flow
Spring Cloud integrates Nacos via the spring-cloud-starter-alibaba-nacos-discovery starter, which pulls in spring-cloud-commons. This package defines the ServiceRegistry interface that all registration implementations must satisfy.
The concrete class for Nacos is NacosServiceRegistry. Its lifecycle is driven by the auto‑configuration class AutoServiceRegistrationAutoConfiguration, which injects an AutoServiceRegistration bean. The abstract class AbstractAutoServiceRegistration implements ApplicationListener<WebServerInitializedEvent>, so when the embedded web server starts, onApplicationEvent() triggers the registration logic.
public interface ServiceRegistry<R extends Registration> {
    void register(R registration);
    void deregister(R registration);
    void close();
    void setStatus(R registration, String status);
    <T> T getStatus(R registration);
}
During registration, NacosServiceRegistry.register() delegates to the Nacos client SDK’s NamingService.registerInstance() method.
public void register(Registration registration) {
    if (StringUtils.isEmpty(registration.getServiceId())) {
        log.warn("No service to register for nacos client...");
    } else {
        String serviceId = registration.getServiceId();
        String group = this.nacosDiscoveryProperties.getGroup();
        Instance instance = this.getNacosInstanceFromRegistration(registration);
        try {
            this.namingService.registerInstance(serviceId, group, instance);
            log.info("nacos registry, {} {} {}:{} register finished", group, serviceId, instance.getIp(), instance.getPort());
        } catch (Exception e) {
            log.error("nacos registry, {} register failed...{},", serviceId, registration, e);
            ReflectionUtils.rethrowRuntimeException(e);
        }
    }
}
For ephemeral instances, the client SDK creates a BeatInfo object for health monitoring and schedules periodic heartbeats via executorService.schedule(). Whether or not the instance is ephemeral, registerInstance() then calls serverProxy.registerService(), which sends the instance information to the server through an OpenAPI POST request.
public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
    if (instance.isEphemeral()) {
        BeatInfo beatInfo = new BeatInfo();
        beatInfo.setServiceName(NamingUtils.getGroupedName(serviceName, groupName));
        beatInfo.setIp(instance.getIp());
        beatInfo.setPort(instance.getPort());
        beatInfo.setCluster(instance.getClusterName());
        beatInfo.setWeight(instance.getWeight());
        beatInfo.setMetadata(instance.getMetadata());
        beatInfo.setPeriod(instance.getInstanceHeartBeatInterval() == 0L ? DEFAULT_HEART_BEAT_INTERVAL : instance.getInstanceHeartBeatInterval());
        this.beatReactor.addBeatInfo(NamingUtils.getGroupedName(serviceName, groupName), beatInfo);
    }
    this.serverProxy.registerService(NamingUtils.getGroupedName(serviceName, groupName), groupName, instance);
}
3.2 Debugging Registration (Case 1)
Once the embedded web server has started, Spring publishes a WebServerInitializedEvent and AbstractAutoServiceRegistration.onApplicationEvent() is invoked. NacosAutoServiceRegistration (a subclass) calls super.register(), which eventually reaches NacosServiceRegistry.register().
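The event-driven flow above follows a template-method pattern, which can be sketched in plain Java without Spring. Every class name below is hypothetical. The sketch only mirrors the idea of an abstract base reacting to the "web server started" event and a subclass delegating to a registry, and the one-shot guard is an assumption of this sketch rather than a copy of the Spring Cloud code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a ServiceRegistry implementation; it only records calls.
class RecordingRegistry {
    final List<String> registered = new ArrayList<>();

    void register(String serviceId, int port) {
        registered.add(serviceId + ":" + port);
    }
}

// Simplified analogue of AbstractAutoServiceRegistration (no Spring involved).
abstract class AbstractAutoRegistrationSketch {
    private final AtomicBoolean running = new AtomicBoolean(false);

    // Plays the role of onApplicationEvent(WebServerInitializedEvent).
    final void onWebServerInitialized(int port) {
        // Register only once, even if the event is delivered several times.
        if (running.compareAndSet(false, true)) {
            register(port);
        }
    }

    // Subclasses decide how the registration is actually performed.
    protected abstract void register(int port);
}

// Simplified analogue of NacosAutoServiceRegistration.
class AutoRegistrationSketch extends AbstractAutoRegistrationSketch {
    private final RecordingRegistry registry;
    private final String serviceId;

    AutoRegistrationSketch(RecordingRegistry registry, String serviceId) {
        this.registry = registry;
        this.serviceId = serviceId;
    }

    @Override
    protected void register(int port) {
        registry.register(serviceId, port);
    }
}
```

Because the base class owns the event handling, a different registry backend only needs to supply its own register() implementation.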
The registration method passes three parameters to the SDK: serviceId, group, and the Instance object. For ephemeral instances, registerInstance() first adds a heartbeat via addBeatInfo(); it then performs the actual registration with registerService().
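The heartbeat side of this flow relies on a task that re-schedules itself through schedule() after every run. A minimal sketch of that pattern follows. HeartbeatSketch and its methods are hypothetical names, and the "beat" here only increments a counter, where a real client would send a request to the server.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical heartbeat scheduler; names and structure are illustrative only.
class HeartbeatSketch {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "heartbeat-sketch");
                t.setDaemon(true); // do not keep the JVM alive just for heartbeats
                return t;
            });
    private final AtomicInteger beatsSent = new AtomicInteger();
    private final long periodMs;
    private volatile boolean stopped;

    HeartbeatSketch(long periodMs) {
        this.periodMs = periodMs;
    }

    // Schedules the first beat; each beat then re-schedules the next one.
    void start() {
        executor.schedule(this::beat, periodMs, TimeUnit.MILLISECONDS);
    }

    private void beat() {
        if (stopped) {
            return;
        }
        // A real client would send a heartbeat request to the server here.
        beatsSent.incrementAndGet();
        if (!stopped) {
            executor.schedule(this::beat, periodMs, TimeUnit.MILLISECONDS);
        }
    }

    void stop() {
        stopped = true;
        executor.shutdownNow();
    }

    // Waits until at least `count` beats were sent or the timeout elapses.
    boolean awaitBeats(int count, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        try {
            while (beatsSent.get() < count && System.currentTimeMillis() < deadline) {
                Thread.sleep(5);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return beatsSent.get() >= count;
    }
}
```

Re-scheduling from inside the task, instead of using a fixed-rate schedule, lets the interval change between beats, for example if the server suggests a new period.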
3.3 Frequently Asked Questions
Q1: Why does Nacos registration depend on spring-cloud-commons?
A: The starter pulls in spring-cloud-commons, which defines ServiceRegistry. Nacos implements this interface via NacosServiceRegistry.
Q2: My service does not register even though the dependencies are present.
A: Registration is triggered by a WebServerInitializedEvent, so the application must be a web project (e.g., include spring-boot-starter-web).
Q3: What else does spring-cloud-commons provide?
A: Its spring.factories file declares AutoServiceRegistrationAutoConfiguration as an auto‑configuration class. This class injects AutoServiceRegistration, whose concrete implementation NacosAutoServiceRegistration listens for the web server startup and initiates registration.
4. Service Discovery
Discovery occurs when a consumer makes a remote call (e.g., via OpenFeign). The call invokes NacosServerList.getServers(), which delegates to NacosNamingService.selectInstances() to retrieve a list of healthy Instance objects.
public class NacosServerList extends AbstractServerList<NacosServer> {
    private List<NacosServer> getServers() {
        try {
            String group = this.discoveryProperties.getGroup();
            List<Instance> instances = this.discoveryProperties.namingServiceInstance()
                    .selectInstances(this.serviceId, group, true);
            return this.instancesToServerList(instances);
        } catch (Exception e) {
            throw new IllegalStateException("Can not get service instances from nacos, serviceId=" + this.serviceId, e);
        }
    }
}
NacosNamingService.selectInstances() ultimately calls HostReactor.getServiceInfo(). HostReactor maintains three concurrent maps:
serviceInfoMap: cached service information.
updatingMap: flags for services currently being refreshed.
futureMap: scheduled futures for periodic updates.
public ServiceInfo getServiceInfo(String serviceName, String clusters) {
    String key = ServiceInfo.getKey(serviceName, clusters);
    if (this.failoverReactor.isFailoverSwitch()) {
        return this.failoverReactor.getService(key);
    } else {
        ServiceInfo serviceObj = this.getServiceInfo0(serviceName, clusters);
        if (serviceObj == null) {
            serviceObj = new ServiceInfo(serviceName, clusters);
            this.serviceInfoMap.put(serviceObj.getKey(), serviceObj);
            this.updatingMap.put(serviceName, new Object());
            this.updateServiceNow(serviceName, clusters);
            this.updatingMap.remove(serviceName);
        } else if (this.updatingMap.containsKey(serviceName)) {
            synchronized (serviceObj) {
                try {
                    serviceObj.wait(5000L);
                } catch (InterruptedException e) {
                    LogUtils.NAMING_LOGGER.error(..., e);
                }
            }
        }
        this.scheduleUpdateIfAbsent(serviceName, clusters);
        return this.serviceInfoMap.get(serviceObj.getKey());
    }
}
The method scheduleUpdateIfAbsent() creates a periodic UpdateTask (default every 10 seconds) that refreshes the local cache asynchronously.
public void scheduleUpdateIfAbsent(String serviceName, String clusters) {
    if (this.futureMap.get(ServiceInfo.getKey(serviceName, clusters)) == null) {
        synchronized (this.futureMap) {
            if (this.futureMap.get(ServiceInfo.getKey(serviceName, clusters)) == null) {
                ScheduledFuture<?> future = this.addTask(new HostReactor.UpdateTask(serviceName, clusters));
                this.futureMap.put(ServiceInfo.getKey(serviceName, clusters), future);
            }
        }
    }
}
4.1 Debugging Discovery (Case 2)
A Feign client annotated with @FeignClient("gulimall-member") triggers NacosServerList.getServers() using the service name. The three‑argument selectInstances() call passes healthy=true and leaves subscribe at its default of true, so the result is read from the local serviceInfoMap.
If the cache is missing, getServiceInfo() creates a new ServiceInfo, immediately fetches data via updateServiceNow(), and registers an asynchronous update task.
The periodic task keeps the cache fresh, and because lookups are served from the local cache, a call can still resolve instances while a refresh is in progress.
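The cache-first lookup with a background refresh can be sketched as follows. InstanceCacheSketch and its members are hypothetical, and the remote query is replaced by a plain function. The sketch also swaps two techniques: it uses ConcurrentHashMap.computeIfAbsent() where the listing above uses double-checked locking on futureMap, and scheduleWithFixedDelay() where the original uses its own update task, so it shows the idea rather than the Nacos implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical cache modelled loosely on the behaviour described above; not Nacos code.
class InstanceCacheSketch {
    // Plays the role of serviceInfoMap: service name -> cached instance addresses.
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    // Plays the role of futureMap: at most one refresh task per service.
    private final Map<String, ScheduledFuture<?>> futures = new ConcurrentHashMap<>();
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "cache-refresh-sketch");
                t.setDaemon(true);
                return t;
            });
    // Stands in for the query sent to the registry server.
    private final Function<String, List<String>> remoteLookup;
    private final long refreshMs;

    InstanceCacheSketch(Function<String, List<String>> remoteLookup, long refreshMs) {
        this.remoteLookup = remoteLookup;
        this.refreshMs = refreshMs;
    }

    List<String> getInstances(String service) {
        List<String> cached = cache.get(service);
        if (cached == null) {
            // Cache miss: fetch synchronously so the first call already sees data.
            cached = remoteLookup.apply(service);
            cache.put(service, cached);
        }
        scheduleUpdateIfAbsent(service);
        return cached;
    }

    private void scheduleUpdateIfAbsent(String service) {
        // computeIfAbsent is atomic per key, so only one task is created per service.
        futures.computeIfAbsent(service, s -> executor.scheduleWithFixedDelay(
                () -> refresh(s), refreshMs, refreshMs, TimeUnit.MILLISECONDS));
    }

    private void refresh(String service) {
        try {
            cache.put(service, remoteLookup.apply(service));
        } catch (RuntimeException e) {
            // Keep the stale entry; the next scheduled run will try again.
        }
    }

    int scheduledTasks() {
        return futures.size();
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

Catching the exception inside refresh() matters because a scheduled task that throws is not run again, which would silently stop the cache from updating.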
5. Summary of Discovery Process
Remote call → NacosServerList.getServers() → NacosNamingService.selectInstances().
Depending on the subscribe flag, instances are obtained from the local cache or directly from the Nacos server.
The cache is maintained by HostReactor using three maps and periodic update tasks.
Health checks are performed via heartbeats; only healthy instances are returned to the consumer.
Shepherd Advanced Notes
Dedicated to sharing advanced Java technical insights, daily work snippets, and the power of persistent effort.
