Why Misconfiguring Nacos Ephemeral Settings Can Crash Your Payment Service

A misconfigured Nacos registration type turned a temporary service instance into a persistent one, causing heartbeat blockage and a cascading failure in the payment chain, illustrating when to use temporary versus persistent instances in service registries and configuration centers.

Su San Talks Tech

Hello everyone, I'm Su San.

Before the holiday release, an issue occurred: after a gray release, some users reported that order status didn't update after payment, and the failure rate of the payment service skyrocketed.

Investigation revealed a fatal configuration error: during deployment, payment-service had been switched to a persistent Nacos registration (ephemeral=false).

One service node suffered a memory leak, causing GC pauses over 30 seconds; because the instance was persistent, Nacos didn't evict it, so callers kept sending requests to the faulty node, eventually collapsing the entire payment chain.

Fundamental Difference Between Service Registry and Configuration Center

We use Nacos for both service registry and configuration center, but they have different design goals: the service registry prioritizes high availability (AP) for service discovery, tolerating brief inconsistencies, while the configuration center requires consistency (CP), ensuring configurations are never lost and updates are synchronized.

In short, a registry instance is a live service node, whereas a configuration instance is a static configuration file.

Service Registry: Default Temporary Instances

The core requirement of a service registry is real‑time awareness of service availability.

Nacos provides two modes: temporary instances and persistent instances, matching dynamic services and static services respectively.

Temporary Instance

Temporary instances are Nacos's default mode.

When Spring Cloud, Dubbo, etc., start, they register as temporary instances unless configured otherwise. The heartbeat mechanism sends a ping every 5 seconds; if the server misses a heartbeat for 15 seconds it marks the instance unhealthy, and after 30 seconds it removes the instance.

Heartbeat: the client sends a heartbeat every 5 seconds; the server marks the instance unhealthy after 15 seconds of silence and removes it after 30 seconds.

Storage: instance info lives only in server memory, not persisted to disk; a server restart clears all temporary instances.

Failure behavior: if a node crashes or its heartbeat is blocked (e.g., GC pause), the instance is automatically removed, preventing routing to an invalid node.
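The timeline above can be sketched as a small classification function. This is an illustrative model, not Nacos source code: the class name HeartbeatTimeline is hypothetical, the 15-second and 30-second thresholds are the ones quoted in this article, and treating the thresholds as strict ("more than") is an assumption.

```java
import java.time.Duration;

/** Illustrative model of the ephemeral-instance timeline; not Nacos source code. */
final class HeartbeatTimeline {
    enum Status { HEALTHY, UNHEALTHY, REMOVED }

    // Thresholds quoted in the article: unhealthy after 15 s, removed after 30 s.
    static final Duration UNHEALTHY_AFTER = Duration.ofSeconds(15);
    static final Duration REMOVE_AFTER = Duration.ofSeconds(30);

    /** Classifies an ephemeral instance by how long the server has gone without a heartbeat. */
    static Status classify(Duration silence) {
        // Assumption: a threshold is crossed only when the silence strictly exceeds it.
        if (silence.compareTo(REMOVE_AFTER) > 0) return Status.REMOVED;
        if (silence.compareTo(UNHEALTHY_AFTER) > 0) return Status.UNHEALTHY;
        return Status.HEALTHY;
    }

    public static void main(String[] args) {
        for (long seconds : new long[] {5, 20, 35}) {
            System.out.println(seconds + "s of silence -> " + classify(Duration.ofSeconds(seconds)));
        }
    }
}
```

Under this model, a GC pause longer than 30 seconds is enough on its own to get a temporary instance removed from the registry.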

Persistent Instance

Persistent instances are the opposite: they target long‑running, rarely changing foundational services such as MySQL, Redis, and Elasticsearch. The server actively probes health (TCP port, HTTP endpoint, or custom protocol) and persists the instance data on the Nacos server side.

Health check: the server actively probes the instance (for example, a TCP check against MySQL on port 3306, or an HTTP check against a health endpoint) instead of relying on client heartbeats.

Storage: instance information is persisted to disk by the Nacos servers rather than held only in memory, so it survives server restarts.

Failure behavior: when a node becomes unhealthy, Nacos marks it as unhealthy but does not delete it, allowing operators to see the faulty node in the console and recover it.
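The contrast between the two failure behaviors can be modeled in a few lines. The sketch below is a toy in-memory registry, not the Nacos implementation; RegistryModel and all of its method names are hypothetical, and the addresses are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model contrasting the two failure behaviors; all names are hypothetical. */
final class RegistryModel {
    static final class Instance {
        final String address;
        final boolean ephemeral;
        boolean healthy = true;

        Instance(String address, boolean ephemeral) {
            this.address = address;
            this.ephemeral = ephemeral;
        }
    }

    private final List<Instance> instances = new ArrayList<>();

    void register(String address, boolean ephemeral) {
        instances.add(new Instance(address, ephemeral));
    }

    /** Ephemeral: heartbeat silence past the removal threshold deletes the entry. */
    void onHeartbeatExpired(String address) {
        instances.removeIf(i -> i.ephemeral && i.address.equals(address));
    }

    /** Persistent: a server-side probe only updates the health flag; the entry stays. */
    void onProbeResult(String address, boolean probeOk) {
        for (Instance i : instances) {
            if (!i.ephemeral && i.address.equals(address)) i.healthy = probeOk;
        }
    }

    /** Addresses a caller may route to: registered and currently marked healthy. */
    List<String> routable() {
        List<String> out = new ArrayList<>();
        for (Instance i : instances) {
            if (i.healthy) out.add(i.address);
        }
        return out;
    }

    int size() {
        return instances.size();
    }

    public static void main(String[] args) {
        RegistryModel registry = new RegistryModel();
        registry.register("10.0.0.1:8080", true);   // ephemeral business node
        registry.register("10.0.0.2:3306", false);  // persistent foundational node
        registry.onHeartbeatExpired("10.0.0.1:8080");
        registry.onProbeResult("10.0.0.2:3306", false);
        System.out.println("registered=" + registry.size() + ", routable=" + registry.routable());
    }
}
```

In this model, missed heartbeats have no effect on a persistent node: it stays registered, and it stays routable until a server-side probe actually fails.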

In a Spring Cloud project, switching the instance type only requires adding one line to application.yml. The following snippet shows the erroneous configuration that caused the outage:

spring:
  cloud:
    nacos:
      discovery:
        server-addr: 192.168.1.100:8848
        service: payment-service  # registered service name
        ephemeral: false  # the value that caused the outage; should be true (the default)
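A mistake like this could be caught before deployment. The following is a minimal sketch of such a guard, assuming the effective configuration has already been flattened into properties format (the JDK has no built-in YAML parser). The class EphemeralGuard is hypothetical and not part of Nacos or Spring Cloud; only the property key spring.cloud.nacos.discovery.ephemeral comes from the configuration above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical pre-deployment guard; not part of Nacos or Spring Cloud. */
final class EphemeralGuard {
    static final String KEY = "spring.cloud.nacos.discovery.ephemeral";

    /** Returns true when the config leaves the instance ephemeral; an absent key means the default, true. */
    static boolean isEphemeral(String flattenedConfig) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(flattenedConfig));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Boolean.parseBoolean(props.getProperty(KEY, "true").trim());
    }

    public static void main(String[] args) {
        String config = KEY + "=false\n";
        if (!isEphemeral(config)) {
            System.out.println("Refusing to deploy: business service would register as persistent");
        }
    }
}
```

A check of this kind would belong in the release pipeline for business services only; foundational components that are meant to be persistent would need to be exempted.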

Configuration Center: Default Persistent

All configuration instances in Nacos's configuration center are persistent; there is no concept of temporary configuration. The center's purpose is centralized management to avoid loss, so every config is stored in a database (MySQL in production) and survives server restarts.

Storage: configs are persisted to the database, never lost on restart.

Lifecycle: configs are only removed or overwritten manually, not automatically when a client disconnects.

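Dynamic updates: clients watch for changes through long polling (the default hold time is 30 seconds), and the server responds as soon as a config changes, so updates usually arrive within about a second. This is about the content changing in real time, not about the config existing temporarily.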
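The three properties above (persistent storage, manual lifecycle, dynamic updates) can be illustrated with a toy store. This is a sketch under stated assumptions, not the Nacos API: ConfigStoreModel and its methods are hypothetical, an in-memory map stands in for the database, and listeners are invoked directly instead of through polling.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Toy config store in which entries outlive their listeners; names are hypothetical. */
final class ConfigStoreModel {
    // Assumption: an in-memory map stands in for the database.
    private final Map<String, String> configs = new HashMap<>();
    private final Map<String, List<Consumer<String>>> listeners = new HashMap<>();

    /** Stores or overwrites a config and notifies the current listeners of the new content. */
    void publish(String dataId, String content) {
        configs.put(dataId, content);
        for (Consumer<String> listener : listeners.getOrDefault(dataId, List.of())) {
            listener.accept(content);
        }
    }

    void addListener(String dataId, Consumer<String> listener) {
        listeners.computeIfAbsent(dataId, k -> new CopyOnWriteArrayList<>()).add(listener);
    }

    /** A client disconnecting only drops its listener; the stored config is untouched. */
    void removeListener(String dataId, Consumer<String> listener) {
        List<Consumer<String>> registered = listeners.get(dataId);
        if (registered != null) registered.remove(listener);
    }

    /** Explicit removal is the only way a config disappears in this model. */
    void remove(String dataId) {
        configs.remove(dataId);
    }

    String get(String dataId) {
        return configs.get(dataId);
    }

    public static void main(String[] args) {
        ConfigStoreModel store = new ConfigStoreModel();
        Consumer<String> listener = content -> System.out.println("changed: " + content);
        store.addListener("payment.yaml", listener);
        store.publish("payment.yaml", "timeout=3s");
        store.removeListener("payment.yaml", listener); // the client goes away
        System.out.println("still stored: " + store.get("payment.yaml"));
    }
}
```

Compare this with a temporary service instance, whose registry entry disappears once its client stops sending heartbeats.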

Conclusion

Service Registry: use temporary instances for dynamic business services (payment, order) and persistent instances for static foundational components (MySQL, Redis).

Configuration Center: all configs are persistent; dynamic updates do not imply temporary existence.

Understanding these distinctions helps prevent incidents like the payment‑service failure described above.

Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
