How the Nacos 2.0 Redesign Fixes the Pain Points of the 1.x Architecture
This article reviews the evolution of Nacos from its 1.x architecture, with its five-layer design and service discovery issues, to version 2.0, which introduces long-connection RPC, a new client-centric model, and improved performance. It closes with the upcoming roadmap.
Nacos Overview
Nacos originated at Alibaba in 2008, initially supporting microservice splitting and a business middle‑platform. In 2018 it was open‑sourced to share a decade of service discovery and configuration management experience, aiming to accelerate digital transformation.
It supports major microservice languages and frameworks (e.g., Dubbo and Spring Cloud Alibaba, abbreviated SCA below) and integrates with cloud-native components such as CoreDNS and Sentinel. Client languages include Java, Go, Python, and the recently released C# and C++.
Nacos 1.x Architecture and Issues
1. Architecture Layers
Nacos 1.x consists of five layers: Access, Communication, Function, Sync, and Persistence.
Access Layer: Includes Nacos client, Dubbo/SCA integrations, and the Console UI. All service and configuration operations are sent via HTTP OpenAPI.
Communication Layer: Primarily short‑lived HTTP connections; some push features use UDP.
Function Layer: Provides service discovery and configuration management.
Sync Layer: Offers three sync mechanisms:
Distro – non‑persistent service sync.
Raft – persistent service sync and configuration sync when using Derby.
Notify – cache update notifications when MySQL stores configuration.
Persistence Layer: Uses MySQL, Derby, and the local file system for storing service metadata, configuration, user, and permission data.
2. Service Model in 1.x
Service registration is performed via OpenAPI HTTP requests from the client (often through Dubbo or SCA). The server validates the request, creates or updates a Service object identified by namespace + group + service name, and stores instance information.
Two events are triggered:
Data sync using Distro or Raft, notifying other Nacos nodes. Persistent services are synchronized using Raft to ensure durability.
Subscription notification via UDP, pushing the updated service list to subscribers.
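The registration flow above can be sketched as a small in-memory model. This is only an illustration, not Nacos server code: the names ServiceKey, Registry, and register, and the event tuples, are hypothetical. It shows a service keyed by namespace + group + service name, and the two events raised on each registration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceKey:
    """Identity of a service: namespace + group + service name."""
    namespace: str
    group: str
    name: str


@dataclass
class Registry:
    """Hypothetical stand-in for the 1.x server-side service store."""
    services: dict = field(default_factory=dict)  # ServiceKey -> {(ip, port): info}
    events: list = field(default_factory=list)    # recorded instead of sent

    def register(self, key, ip, port, ephemeral=True):
        # Create the service on first use, then add or update the instance.
        instances = self.services.setdefault(key, {})
        instances[(ip, port)] = {"ephemeral": ephemeral}
        # Event 1: data sync. Non-persistent instances go through Distro,
        # persistent ones through Raft.
        protocol = "distro" if ephemeral else "raft"
        self.events.append(("sync", protocol, key))
        # Event 2: push the updated service list to subscribers over UDP.
        self.events.append(("push_udp", key))
        return instances
```

Registering the same IP and port twice updates the existing entry rather than adding a second instance, which matches the "creates or updates" wording above.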
3. Problems in 1.x
High TPS caused by excessive heartbeats and invalid queries.
Long latency in detecting service changes due to the 15 s heartbeat timeout.
Unreliable UDP pushes leading to frequent client‑side reconciliation queries.
Many TIME_WAIT connections from short‑lived HTTP requests, causing connection timeout errors.
Configuration module’s 30‑second long‑polling induces frequent GC.
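A rough back-of-the-envelope calculation shows why heartbeats dominate server load. The figures are assumptions for illustration only: the article gives neither the instance count nor the heartbeat interval, so 10,000 instances and a 5-second interval are hypothetical inputs.

```python
def heartbeat_tps(instances: int, interval_s: float) -> float:
    """Requests per second produced by heartbeats alone.

    Each instance sends one heartbeat every `interval_s` seconds, so the
    cluster receives instances / interval_s heartbeat requests per second.
    """
    return instances / interval_s


# Assumed inputs (not from the article): 10,000 instances, 5 s interval.
print(heartbeat_tps(10_000, 5.0))
```

Under these assumptions the servers handle 2,000 requests per second that carry no new information, before any real registration or query traffic is counted.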
Nacos 2.0 Architecture and New Model
1. Architecture Layers
Nacos 2.x builds on 1.x and adds long‑connection support while retaining compatibility with old clients and the OpenAPI. The communication layer now uses gRPC and RSocket to provide RPC calls and server push over persistent connections. A new “link” layer normalizes the different request types arriving from various clients into a unified data structure, which enables future traffic control and load balancing.
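The normalization idea can be sketched as follows. This is a hypothetical model, not the Nacos implementation: UnifiedRequest, from_http, from_rpc, the field names, and the default values are all assumptions chosen for illustration. The point is that once both entry paths produce the same structure, later stages such as traffic control need only one code path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnifiedRequest:
    """Hypothetical protocol-independent request shape."""
    source: str      # which entry path produced it: "http" or "rpc"
    operation: str   # e.g. "register" or "deregister"
    namespace: str
    group: str
    service: str
    payload: dict


# Assumed mapping from HTTP verbs to operations, for illustration only.
HTTP_METHOD_TO_OPERATION = {"POST": "register", "DELETE": "deregister"}


def from_http(method: str, params: dict) -> UnifiedRequest:
    """Normalize an HTTP-style request (string parameters, optional fields)."""
    return UnifiedRequest(
        source="http",
        operation=HTTP_METHOD_TO_OPERATION[method],
        namespace=params.get("namespaceId", "public"),
        group=params.get("groupName", "DEFAULT_GROUP"),
        service=params["serviceName"],
        payload={"ip": params["ip"], "port": int(params["port"])},
    )


def from_rpc(message: dict) -> UnifiedRequest:
    """Normalize a long-connection RPC message (already typed)."""
    return UnifiedRequest(
        source="rpc",
        operation=message["type"],
        namespace=message["namespace"],
        group=message["group"],
        service=message["service"],
        payload={"ip": message["ip"], "port": message["port"]},
    )
```

A rate limiter or load balancer written against UnifiedRequest would then apply equally to old HTTP clients and new long-connection clients.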
2. New Service Model
All requests from a client (registration, subscription, etc.) share a single long‑lived connection. The server represents each connection with a stateful Client object that aggregates all data associated with it. When a client publishes a service, the corresponding Client object is updated, which triggers index updates and push events to subscribed clients over their own long connections.
Only Client objects updated directly by a client request are synchronized to other nodes. Updates that arrive through synchronization do not trigger another round of sync, which reduces unnecessary traffic.
Metadata (labels, instance status, weight, etc.) is separated from core data (IP, port, service name), so that metadata can be modified dynamically without affecting the immutable base information.
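A minimal sketch of this model, under stated assumptions: Client, Node, the index dictionaries, and the from_peer flag are hypothetical names, pushes are recorded in a list rather than sent over a connection, and the metadata separation is not modeled. It shows a per-connection Client object, service indexes derived from it, and the rule that a synchronized update is applied and pushed locally but never forwarded again.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Client:
    """All state owned by one long-lived connection."""
    connection_id: str
    published: dict = field(default_factory=dict)   # service -> (ip, port)
    subscribed: set = field(default_factory=set)


class Node:
    """Hypothetical server node holding Client objects and service indexes."""

    def __init__(self, name):
        self.name = name
        self.clients = {}                       # connection id -> Client
        self.publishers = defaultdict(set)      # service -> connection ids
        self.subscribers = defaultdict(set)     # service -> connection ids
        self.peers = []                         # other Node objects
        self.pushes = []                        # recorded push events
        self.syncs_sent = 0

    def _client(self, cid):
        return self.clients.setdefault(cid, Client(cid))

    def subscribe(self, cid, service):
        self._client(cid).subscribed.add(service)
        self.subscribers[service].add(cid)

    def publish(self, cid, service, ip, port, from_peer=False):
        # Update the Client object, then the index built from it.
        self._client(cid).published[service] = (ip, port)
        self.publishers[service].add(cid)
        self._push(service)
        # Only direct updates are synchronized; an update received from a
        # peer stops here, so it cannot bounce between nodes.
        if not from_peer:
            for peer in self.peers:
                self.syncs_sent += 1
                peer.publish(cid, service, ip, port, from_peer=True)

    def instances(self, service):
        return sorted(self.clients[c].published[service]
                      for c in self.publishers[service])

    def _push(self, service):
        for cid in sorted(self.subscribers[service]):
            self.pushes.append((cid, service, self.instances(service)))
```

With two nodes that list each other as peers, a publish on the first node produces exactly one sync message, and a subscriber connected to the second node still receives the updated instance list.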
3. Advantages of 2.0
Heartbeat elimination – a lightweight keep‑alive reduces TPS dramatically.
Fast detection of TCP disconnections improves responsiveness.
Reliable long‑connection streaming replaces UDP, lowering invalid QPS.
Eliminates TIME_WAIT overload.
True long connections replace the 30‑second long polling and resolve the configuration module's GC issues.
Finer‑grained sync reduces inter‑node communication pressure.
4. Drawbacks of 2.0
Increased internal complexity for connection management and load balancing.
Stateful data tied to connections lengthens processing chains.
Observability of RPC (gRPC) is less straightforward than plain HTTP.
Future Plans for Nacos 2.x
Improvements focus on documentation, code quality, and roadmap execution. Documentation will be expanded with detailed guides, e‑books, and GitHub‑hosted technical discussions. Code will undergo extensive refactoring, unit‑test and integration‑test enhancements, and benchmark open‑sourcing. The roadmap includes major refactoring toward a plugin architecture and addressing current 2.0 drawbacks such as load balancing and observability.
