Evolution and Architecture of Ctrip's Service Registry (Artemis)
This article reviews the seven‑year evolution of Ctrip's microservice service‑registry from manual data maintenance through an etcd‑based solution to the self‑developed Artemis system, detailing its architecture, consistent‑hash data partitioning, high‑availability design, and second‑level instance up/down mechanisms.
Ctrip's microservice framework has been developed for over seven years, during which its service‑registry component has undergone three major iterations, culminating in the latest self‑developed Artemis architecture.
A service registry is a core component of microservice architectures, providing service discovery and enabling load balancing by allowing service providers to register their addresses and service consumers to query them.
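The register/query contract described above can be sketched as a minimal in-memory registry. The names (`ServiceRegistry`, `register`, `discover`) are illustrative only, not Artemis's actual API:

```python
# Hypothetical sketch of a registry's register/discover contract;
# class and method names are invented for illustration.
from collections import defaultdict

class ServiceRegistry:
    def __init__(self):
        # service name -> set of provider instance addresses
        self._instances = defaultdict(set)

    def register(self, service: str, address: str) -> None:
        """A provider announces one of its instance addresses."""
        self._instances[service].add(address)

    def deregister(self, service: str, address: str) -> None:
        """A provider withdraws an instance, e.g. on shutdown."""
        self._instances[service].discard(address)

    def discover(self, service: str) -> list[str]:
        """A consumer asks for all known addresses of a service."""
        return sorted(self._instances[service])

registry = ServiceRegistry()
registry.register("order-service", "10.0.0.1:8080")
registry.register("order-service", "10.0.0.2:8080")
print(registry.discover("order-service"))
```

A consumer would feed the discovered address list into its client-side load balancer; keeping that list fresh is what the rest of the article is about.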
In the initial manual‑maintenance phase, service providers submitted their URLs to a registry, which synchronized them to consumers, while load‑balancer configuration was handled by hand. This approach was simple but suffered from configuration complexity, single points of failure, and performance overhead.
The second phase introduced an etcd‑based registry, in which clients registered their IP‑based instances with etcd through a Session layer. Although etcd offered strong consistency, its leader‑centric writes and TTL‑based health checks introduced availability and performance challenges.
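The TTL‑based liveness model that caused trouble in this phase works roughly as follows: each registration carries a lease that the client must refresh before it expires, and a missed heartbeat causes eviction. This is a toy model of that behavior, not the etcd client API:

```python
# Toy model of TTL-based liveness (as in etcd leases): the client must
# heartbeat before the lease lapses, or the registry evicts the instance.
# The Lease class and its names are invented for this sketch.
import time

class Lease:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.expires_at = time.monotonic() + ttl_seconds

    def keep_alive(self) -> None:
        # The client's periodic heartbeat pushes the deadline forward.
        self.expires_at = time.monotonic() + self.ttl

    def alive(self) -> bool:
        return time.monotonic() < self.expires_at

lease = Lease(ttl_seconds=0.05)
assert lease.alive()           # freshly registered: healthy
time.sleep(0.1)                # heartbeat missed: lease lapses
assert not lease.alive()       # registry would now drop the instance
```

The downside the article alludes to is visible here: an instance is only detected as down after a full TTL elapses, and every heartbeat is a write that must go through the etcd leader.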
To overcome these limitations, Ctrip built Artemis, inspired by Netflix Eureka, storing registration data in memory across peer nodes, supporting eventual consistency, and adding push‑based change notifications for second‑level instance up/down.
Artemis consists of four roles: Client (SDK API), Session (stateless request router), Data (sharded, replicated storage using consistent hashing), and MetaServer (K8s‑driven address discovery). The Data layer uses a consistent‑hash ring with virtual nodes and multi‑replica storage to achieve scalable, balanced data distribution.
For massive data handling, Artemis employs consistent hashing with virtual nodes to evenly spread load and uses replica placement to ensure high availability; each service registration is stored on multiple nodes, with a primary and several followers.
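The placement scheme just described, a consistent‑hash ring with virtual nodes, where the first distinct node clockwise is the primary and the next distinct nodes are followers, can be sketched as below. The hash function, virtual‑node count, and replica count are assumptions for illustration, not Artemis's actual parameters:

```python
# Sketch of consistent hashing with virtual nodes and multi-replica
# placement. Hash choice (MD5), vnode count, and replica count are
# illustrative assumptions.
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100, replicas=3):
        self.replicas = replicas
        self._ring = []  # sorted list of (hash, physical node)
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def placement(self, key: str) -> list:
        """Walk clockwise from the key's position; the first distinct
        node is the primary, the next distinct nodes are followers."""
        start = bisect.bisect(self._ring, (self._hash(key), ""))
        chosen = []
        for i in range(len(self._ring)):
            _, node = self._ring[(start + i) % len(self._ring)]
            if node not in chosen:
                chosen.append(node)
            if len(chosen) == self.replicas:
                break
        return chosen  # [primary, follower, follower]

ring = HashRing(["data-1", "data-2", "data-3", "data-4"])
print(ring.placement("order-service/10.0.0.1:8080"))
```

Virtual nodes smooth out load imbalance across physical Data nodes, and because only keys adjacent to a changed node move, adding or removing a Data node relocates a small fraction of registrations rather than reshuffling everything.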
Instance lifecycle management is achieved via WebSocket‑based notifications: clients establish long‑lived connections to Session, subscribe to services, and receive real‑time up/down events, enabling near‑instant routing updates.
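The subscribe‑and‑push flow can be modeled with an in‑process fan‑out standing in for the WebSocket transport. The `SessionHub` name and the synchronous callbacks are simplifications for the sketch; in Artemis the subscription rides a long‑lived WebSocket connection between client and Session:

```python
# In-process stand-in for the WebSocket push path: a consumer subscribes
# to a service and is called back on every instance up/down event.
# Names and the synchronous callback model are invented for this sketch.
from collections import defaultdict

class SessionHub:
    def __init__(self):
        self._subscribers = defaultdict(list)  # service -> callbacks

    def subscribe(self, service, callback):
        # In Artemis: the client opens a long-lived WebSocket to Session
        # and registers interest in this service.
        self._subscribers[service].append(callback)

    def publish(self, service, event, address):
        # The Data layer notifies Session of a change; Session fans the
        # event out to every subscribed consumer.
        for cb in self._subscribers[service]:
            cb(event, address)

events = []
hub = SessionHub()
hub.subscribe("order-service", lambda ev, addr: events.append((ev, addr)))
hub.publish("order-service", "UP", "10.0.0.3:8080")
hub.publish("order-service", "DOWN", "10.0.0.1:8080")
print(events)  # the consumer updates its routing table on each event
```

Because changes are pushed the moment they occur rather than discovered on the next poll or TTL expiry, consumers can drop a dead instance from their routing tables within seconds, which is the "second‑level up/down" the article refers to.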
The article concludes that Artemis successfully addresses the evolving challenges of service discovery, but Ctrip is now moving toward a service mesh built on Kubernetes, aiming to move discovery, load balancing, and other cross‑cutting concerns out of SDKs and into sidecar proxies.
Ctrip Technology
The official Ctrip Technology account, sharing technical practice and discussion.