How Eureka Ensures High Availability: Normal Operation and Failure Strategies
This article explains Eureka's normal workflow (service registration, peer replication, and local caching of the service list) and then details the strategies that keep service discovery reliable when things fail: a backup registry for startup, handling of unavailable servers, resilience to a full server outage, and the self-preservation mechanism.
1. How Eureka Works Normally
When a service starts, the Eureka client registers with the Eureka server and fetches the latest service list.
Eureka servers replicate the latest data among themselves using a peer‑to‑peer mode.
The client sends periodic heartbeats to renew its lease, reports its status, and obtains updated service lists.
The client stores the fetched list in a localRegionApps variable.
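The client-side cycle above (register, heartbeat, refresh, serve lookups from the cache) can be pictured in plain Java. This is a minimal sketch, not the actual Netflix client: the `RegistryClient` class and its methods are illustrative assumptions; only the `localRegionApps` field name mirrors the article.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch of the client-side cycle. Class and method names are
// assumptions for this example; only localRegionApps echoes the real client.
class RegistryClient {
    // Cached service list, analogous to Eureka's localRegionApps variable.
    final AtomicReference<Map<String, List<String>>> localRegionApps =
            new AtomicReference<>(new ConcurrentHashMap<>());

    void refresh(Map<String, List<String>> latestFromServer) {
        // On each successful periodic fetch, atomically swap in the snapshot.
        localRegionApps.set(latestFromServer);
    }

    List<String> instancesOf(String serviceName) {
        // Lookups always hit the local cache, never the server directly.
        return localRegionApps.get().getOrDefault(serviceName, List.of());
    }
}
```

The key design point, which the failure strategies below rely on, is that service lookups read only the local cache; the network is touched only by the periodic refresh.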
2. Strategies for Eureka Failure Scenarios
2.1 Client Starts with No Available Server
If all Eureka servers are unreachable at startup, the client cannot register itself or obtain a service list, preventing interaction with other services.
Eureka provides a fallback registry via the property eureka.client.backup-registry-impl. When the server is unavailable, the client uses the backup registry to retrieve a service list, stores it in localRegionApps, and can continue normal interaction.
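The startup fallback amounts to "try the Eureka server, and if that fails, ask the backup registry instead." The sketch below shows that control flow in plain Java; `StartupFetch` and the `Supplier`-based wiring are assumptions for illustration, standing in for the class you would plug in via eureka.client.backup-registry-impl.

```java
import java.util.List;
import java.util.function.Supplier;

// Sketch of the startup fallback: if no Eureka server responds, fall back to
// the backup registry so the client still gets an initial service list.
// StartupFetch is a hypothetical name for this example.
class StartupFetch {
    static List<String> initialServiceList(Supplier<List<String>> eurekaFetch,
                                           Supplier<List<String>> backupRegistry) {
        try {
            return eurekaFetch.get();      // normal path: fetch from a Eureka server
        } catch (RuntimeException serverDown) {
            return backupRegistry.get();   // fallback path: backup registry
        }
    }
}
```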
2.2 Some Servers Become Unavailable During Normal Operation
The client randomizes the order of server addresses so that requests spread across the cluster instead of hotspotting a single server.
It maintains a list of unavailable servers; when a server is detected as down, it is added to this list and excluded from future requests.
For example, with servers server1, server2, server3, if server3 fails, the client adds it to the unavailable list and stops contacting it.
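The two behaviors above, shuffling the address list and quarantining failed servers, can be sketched together. The `ServerList` class and its method names are assumptions for this example, not the real Eureka client code:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of server selection: shuffle once so different clients start from
// different servers, then skip any address quarantined after a failure.
class ServerList {
    private final List<String> servers;
    private final Set<String> unavailable = new HashSet<>();

    ServerList(List<String> addresses) {
        this.servers = new java.util.ArrayList<>(addresses);
        Collections.shuffle(this.servers);  // randomize order across clients
    }

    void markUnavailable(String server) {
        unavailable.add(server);            // e.g. server3 after a failed request
    }

    String nextAvailable() {
        for (String s : servers) {
            if (!unavailable.contains(s)) return s;
        }
        throw new IllegalStateException("no Eureka server available");
    }
}
```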
2.3 All Servers Are Unavailable During Normal Operation
If the client previously fetched a service list, it keeps that list in localRegionApps. When all servers are down, the periodic fetch fails, but the client continues to use the cached list, allowing continued interaction with other services.
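The crucial detail is that a failed fetch does not clear the cache. A minimal sketch of that refresh behavior, with `CachedRefresh` as an assumed illustrative name:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Sketch of the refresh loop's failure handling: a failed fetch leaves the
// previously cached list untouched, so lookups keep working during an outage.
class CachedRefresh {
    final AtomicReference<List<String>> localRegionApps =
            new AtomicReference<>(List.of());

    void tryRefresh(Supplier<List<String>> fetch) {
        try {
            localRegionApps.set(fetch.get());  // success: replace the cache
        } catch (RuntimeException allServersDown) {
            // failure: keep serving the stale-but-usable cached list
        }
    }
}
```

The trade-off is staleness: while all servers are down, the client may route to instances that have since died, but that beats losing discovery entirely.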
2.4 Client Fails to Renew Lease in Time
Eureka servers consider a client dead if it fails to renew its lease within the expiration window (90 seconds by default) and evict it from the registry.
Network glitches can cause missed renewals, leading to false positives.
To avoid erroneous removals, Eureka implements a Self-Preservation mechanism: if the renewals received in the last minute fall below a configurable percentage of the expected count (85% by default, via the server's renewal-percent-threshold setting), the server enters self-preservation mode and stops expiring instances.
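The threshold check can be reduced to simple arithmetic. Assuming the default 30-second heartbeat interval (so each instance is expected to renew twice per minute), the sketch below shows the comparison; `SelfPreservation` is an illustrative name, not the server's actual class:

```java
// Sketch of the self-preservation check. With a 30s heartbeat interval each
// registered instance is expected to renew twice per minute; if actual
// renewals drop below renewalPercentThreshold (0.85 by default) of that
// expected count, the server suspends lease expiration.
class SelfPreservation {
    static boolean shouldEnter(int renewsLastMinute, int registeredInstances,
                               double renewalPercentThreshold) {
        int expectedPerMinute = registeredInstances * 2;  // 2 heartbeats/min each
        int threshold = (int) (expectedPerMinute * renewalPercentThreshold);
        return renewsLastMinute < threshold;
    }
}
```

For example, with 100 registered instances the server expects 200 renewals per minute; receiving fewer than 170 (85% of 200) triggers self-preservation, on the reasoning that a drop that broad is more likely a network problem than 15% of instances dying at once.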
3. Summary
Eureka achieves high availability through several mechanisms:
Backup registry fallback to obtain a service list when servers are initially unavailable.
Maintenance of an unavailable‑server list to bypass failed servers.
Local caching of the service list (localRegionApps) so the client can continue operating when all servers are down.
Self Preservation to prevent premature removal of instances during widespread lease‑renewal failures.