Mastering Graceful Shutdown: Zero‑Downtime Deployments for Java Microservices
This article explains how to achieve zero‑downtime releases for Java applications by using JVM shutdown hooks, Spring context events, and service‑registry tricks for both monolithic and microservice architectures, with concrete code samples, configuration details, and Kubernetes probes to ensure seamless online upgrades.
Background
When a new version of a Java service is deployed, the old instance must be stopped and the new one started. Without coordination, in‑flight requests are aborted and the service becomes unavailable. The goal of a graceful release is to keep the service continuously reachable – also called zero‑loss release, delayed exposure, or warm‑up.
Terminology
Graceful online – expose the service only after it is fully ready (all health checks pass, dependencies are available).
Graceful offline – after receiving a termination signal (e.g., kill -15 pid), first deregister from the discovery registry, reject new traffic, then finish processing the remaining work before the JVM exits.
Monolithic implementation
JVM‑level shutdown hook
Register a hook with Runtime.getRuntime().addShutdownHook(Thread). The hook can:
Delay shutdown to let other tasks finish.
Close database/connection pools.
Delete temporary files.
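The hook tasks listed above can be sketched with nothing but the JDK. This is a minimal illustration, not a framework API: the class name ShutdownHookSketch, the temp-file handling, and the timeout parameter are assumptions made for the example. The hook drains an executor, falls back to shutdownNow() if the wait times out, and then deletes a temporary file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, named for this example only.
final class ShutdownHookSketch {

    /** Builds (but does not register) a hook thread that drains the pool, then deletes the temp file. */
    static Thread buildHook(ExecutorService pool, Path tempFile, long timeoutSeconds) {
        return new Thread(() -> {
            pool.shutdown(); // stop accepting new tasks; already submitted tasks still run
            try {
                if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                    pool.shutdownNow(); // timed out: interrupt whatever is still running
                }
            } catch (InterruptedException e) {
                pool.shutdownNow();
                Thread.currentThread().interrupt(); // restore the interrupt flag
            }
            try {
                Files.deleteIfExists(tempFile);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "graceful-shutdown-hook");
    }

    /** Registers the hook with the JVM. It runs on normal exit or SIGTERM, but not on SIGKILL. */
    static void install(ExecutorService pool, Path tempFile, long timeoutSeconds) {
        Runtime.getRuntime().addShutdownHook(buildHook(pool, tempFile, timeoutSeconds));
    }

    /** Creates a scratch file for the demo and wraps the checked exception. */
    static Path newTempFile() {
        try {
            return Files.createTempFile("graceful-", ".tmp");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping buildHook separate from install makes the cleanup logic callable from a unit test without exiting the JVM.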
executorService.shutdown(); // stop accepting new tasks
executorService.awaitTermination(1500, TimeUnit.SECONDS); // wait up to 1500 s
Thread shutdownHook = new Thread(() -> {
    System.out.println("Graceful shutdown executing");
});
Runtime.getRuntime().addShutdownHook(shutdownHook);
Spring shutdown sequence
Spring relies on the JVM hook and then executes AbstractApplicationContext.doClose(). The sequence (simplified) is:
// publish ContextClosedEvent
publishEvent(new ContextClosedEvent(this)); // ①
// invoke Lifecycle beans
if (this.lifecycleProcessor != null) {
    this.lifecycleProcessor.onClose(); // ②
}
// destroy singleton beans
destroyBeans(); // ③
// close bean factory
closeBeanFactory(); // ④
// subclass hook
onClose(); // ⑤
Listening to ContextClosedEvent or adding a custom shutdown hook allows injection of additional cleanup logic.
Spring Boot (Tomcat) options
Method 1 – Actuator shutdown endpoint
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
management.endpoint.shutdown.enabled=true
management.endpoints.web.exposure.include=shutdown
A POST to /actuator/shutdown triggers the same Spring shutdown sequence. This endpoint should be protected (e.g., HTTP basic auth) because it can be invoked remotely.
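As a sketch of how a deployment script might call the endpoint, the following builds the POST request with the JDK's java.net.http API (Java 11+). The class name, base URL, and credentials are placeholders, and HTTP basic auth is assumed only because the text suggests it as one way to protect the endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Base64;

// Hypothetical helper class, named for this example only.
final class ActuatorShutdownRequest {

    /** Builds a POST to {baseUrl}/actuator/shutdown carrying an HTTP Basic Authorization header. */
    static HttpRequest build(String baseUrl, String user, String password) {
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(baseUrl + "/actuator/shutdown"))
                .timeout(Duration.ofSeconds(5))
                .header("Authorization", "Basic " + token)
                .POST(HttpRequest.BodyPublishers.noBody()) // the endpoint takes no request body
                .build();
    }
}
```

To send it, pass the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Basic auth only makes sense over HTTPS or a trusted network.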
Method 2 – Built‑in graceful shutdown (Spring Boot 2.3+)
# enable graceful shutdown (default is immediate)
server.shutdown=graceful
# maximum wait per shutdown phase (default 30 s)
spring.lifecycle.timeout-per-shutdown-phase=60s
Spring stops accepting new requests, pauses the Tomcat connector, and waits for in-flight requests to finish, up to the configured timeout.
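The "reject new work, then wait for in-flight work" idea can be shown as a toy model in plain Java. This is not how Spring Boot or Tomcat implement graceful shutdown; the RequestGate class and its method names are invented for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical toy model: counts in-flight requests and refuses new ones once closed.
final class RequestGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition drained = lock.newCondition();
    private int inFlight = 0;
    private boolean closed = false;

    /** Returns false once the gate is closed, so the caller can reject the request. */
    boolean tryEnter() {
        lock.lock();
        try {
            if (closed) {
                return false;
            }
            inFlight++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Must be called exactly once for every successful tryEnter(). */
    void exit() {
        lock.lock();
        try {
            if (--inFlight == 0) {
                drained.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Closes the gate and waits up to the timeout; returns true if nothing is left in flight. */
    boolean closeAndAwait(long timeout, TimeUnit unit) {
        long nanos = unit.toNanos(timeout);
        lock.lock();
        try {
            closed = true;
            while (inFlight > 0) {
                if (nanos <= 0) {
                    return false; // timed out with requests still running
                }
                nanos = drained.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

A request handler would call tryEnter() on entry and exit() in a finally block. The timeout plays the same role as spring.lifecycle.timeout-per-shutdown-phase: shutdown proceeds even if some requests never finish.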
Method 3 – Custom TomcatConnectorCustomizer (pre‑2.3)
@Component
public class ShutdownConnectorCustomizer implements TomcatConnectorCustomizer, ApplicationListener<ContextClosedEvent> {
    private static final Logger log = LoggerFactory.getLogger(ShutdownConnectorCustomizer.class);
    private volatile Connector connector;
    @Override
    public void customize(Connector connector) { this.connector = connector; }
    @Override
    public void onApplicationEvent(ContextClosedEvent event) {
        connector.pause(); // stop accepting new requests
        Executor executor = connector.getProtocolHandler().getExecutor();
        if (executor instanceof ThreadPoolExecutor) {
            ThreadPoolExecutor pool = (ThreadPoolExecutor) executor;
            pool.shutdown(); // let in-flight requests finish
            try {
                if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                    log.warn("Tomcat thread pool did not shut down gracefully within 30 seconds.");
                }
            } catch (InterruptedException e) {
                // awaitTermination is interruptible; restore the flag and continue shutting down
                Thread.currentThread().interrupt();
            }
        }
    }
}
Microservice implementation
In a microservice mesh, each instance participates in inter‑service calls. Graceful offline must therefore:
Deregister the instance from the discovery registry before the JVM exits.
Allow client‑side caches (e.g., Ribbon 30 s) to expire.
Give in‑flight requests a window to complete.
The essential pattern is: register → become healthy → serve traffic → deregister → wait → exit.
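The offline half of that pattern (deregister, wait, then drain and exit) can be sketched independently of any particular registry. The Registry and Drainer interfaces below are hypothetical stand-ins for a discovery client and for whatever drains in-flight work; they are not Eureka or Nacos APIs.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical orchestrator, named for this example only.
final class OfflineSequence {

    /** Stand-in for a discovery client (Eureka, Nacos, ...). */
    interface Registry {
        void deregister();
    }

    /** Stand-in for whatever drains in-flight work (web server, thread pools). */
    interface Drainer {
        boolean drain(Duration timeout);
    }

    /** Runs deregister -> wait for client caches -> drain, and returns the steps taken in order. */
    static List<String> run(Registry registry, Drainer drainer, Duration cacheTtl, Duration drainTimeout) {
        List<String> steps = new ArrayList<>();
        registry.deregister();
        steps.add("deregistered");
        try {
            Thread.sleep(cacheTtl.toMillis()); // let client-side caches forget this instance
            steps.add("cache-window-elapsed");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shutting down anyway: skip the rest of the wait
            steps.add("cache-window-interrupted");
        }
        steps.add(drainer.drain(drainTimeout) ? "drained" : "drain-timed-out");
        return steps;
    }
}
```

The ordering is the point: deregistering first means callers stop choosing this instance while it is still able to answer the requests that are already on their way.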
Eureka
Eureka registers an instance immediately, which can cause other services to call a not‑yet‑ready instance. Delayed registration is possible via configuration, but a known bug caps the delay at 30 s.
eureka:
  client:
    healthcheck.enabled: false
    onDemandUpdateStatusChange: false
    initial-instance-info-replication-interval-seconds: 90 # intended delay, effective max 30 s
To control registration manually, implement SpringApplicationRunListener and register only after the Spring context is fully started.
@Override
public void running(ConfigurableApplicationContext context) {
    String eurekaUrls = context.getEnvironment().getProperty("eureka.client.service-url.defaultZone");
    // build RestTemplate client, set status UP, then register
    // log success/failure
}
Graceful offline is achieved by listening to ContextClosedEvent, invoking DiscoveryManager.getInstance().shutdownComponent(), and then sleeping long enough for client caches to expire (e.g., 120 s).
@Component
public class EurekaShutdownConfig implements ApplicationListener<ContextClosedEvent>, PriorityOrdered {
    @Override
    public void onApplicationEvent(ContextClosedEvent event) {
        try {
            log.info("eureka instance offline begin!");
            DiscoveryManager.getInstance().shutdownComponent();
            Thread.sleep(120_000); // 120 s cache window
            log.info("eureka instance offline end!");
        } catch (Throwable ignored) {}
    }
    @Override public int getOrder() { return 0; }
}
Why the 30 s bug matters: Eureka's heartbeat thread uses clientConfig.getInitialInstanceInfoReplicationIntervalSeconds() as the registration delay, but the value is capped at 30 s, so configuring a longer delay (such as the 90 s above) has no further effect. The bug persists because Eureka is no longer actively maintained.
Nacos
Nacos also supports delayed registration. The typical pattern is to disable auto‑registration, start the Spring Boot application, and expose a custom actuator endpoint that registers the instance only after the health endpoint reports UP.
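# application.properties
spring.cloud.nacos.discovery.enabled=true
spring.cloud.nacos.discovery.register-enabled=false
spring.cloud.nacos.discovery.port=${server.port:80}
@Endpoint(id = "registry")
public class RegistryEndpoint {
    // Collaborators supplied by Spring through constructor injection
    private final HealthEndpoint healthEndpoint;
    private final NacosServiceRegistry nacosServiceRegistry;
    private final NacosRegistration registration;
    public RegistryEndpoint(HealthEndpoint healthEndpoint, NacosServiceRegistry nacosServiceRegistry, NacosRegistration registration) {
        this.healthEndpoint = healthEndpoint;
        this.nacosServiceRegistry = nacosServiceRegistry;
        this.registration = registration;
    }
    @ReadOperation
    public String registry() {
        HealthComponent health = healthEndpoint.health();
        if (!Status.UP.equals(health.getStatus())) {
            // not healthy yet: do not register
            return "{\"status\":\"UNKNOWN\",\"groups\":[\"liveness\",\"readiness\"]}";
        }
        nacosServiceRegistry.register(registration);
        return "{\"status\":\"UP\",\"groups\":[\"liveness\",\"readiness\"]}";
    }
}
Graceful offline mirrors the Eureka approach: deregister, then sleep (≈35 s) to allow Nacos clients (which poll every 10 s) to notice the removal.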
@Component
public class NacosShutdownEvent implements ApplicationListener<ContextClosedEvent>, PriorityOrdered {
    @Override
    public void onApplicationEvent(ContextClosedEvent event) {
        try {
            log.info("nacos instance offline begin!");
            NacosServiceRegistry registry = event.getApplicationContext().getBean(NacosServiceRegistry.class);
            NacosRegistration reg = event.getApplicationContext().getBean(NacosRegistration.class);
            registry.deregister(reg);
            Thread.sleep(35_000); // 35 s cache window
            log.info("nacos instance offline end!");
        } catch (Throwable ignored) {}
    }
    @Override public int getOrder() { return 0; }
}
Dubbo
Dubbo enables graceful shutdown by default via ShutdownHookListener. The wait time can be tuned:
dubbo.application.shutwait=30s
Kubernetes
Kubernetes probes (readiness, liveness, startup) control traffic routing and container restarts. A typical Spring Boot deployment uses the actuator health endpoints as probe targets.
# Spring Boot actuator configuration
management.endpoints.web.exposure.include=health
management.endpoint.health.probes.enabled=true
# Deployment snippet (YAML)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  template:
    spec:
      containers:
        - name: my-service
          image: my-service:1.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 2
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 120
            periodSeconds: 2
          lifecycle:
            preStop:
              exec:
                command: ["/bin/bash", "-c", "kill -15 1"]
      terminationGracePeriodSeconds: 30
The preStop hook sends SIGTERM to PID 1, allowing Spring’s shutdown hook to run before the container is finally killed.
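Kubernetes force-kills the container once terminationGracePeriodSeconds has elapsed, and time spent in the preStop hook counts against that budget. Every wait configured in the application therefore has to fit inside it. The sketch below is a simple arithmetic check; the class and parameter names are made up for illustration. It uses figures that appear in this article: a 30 s grace period, the 35 s Nacos deregistration sleep, and a 60 s per-phase shutdown timeout.

```java
import java.time.Duration;

// Hypothetical helper class, named for this example only.
final class GracePeriodBudget {

    /** Worst-case time the pod needs: preStop hook + deregistration wait + drain timeout. */
    static Duration required(Duration preStop, Duration deregistrationWait, Duration drainTimeout) {
        return preStop.plus(deregistrationWait).plus(drainTimeout);
    }

    /** True if the whole shutdown fits inside terminationGracePeriodSeconds. */
    static boolean fits(Duration gracePeriod, Duration preStop,
                        Duration deregistrationWait, Duration drainTimeout) {
        return required(preStop, deregistrationWait, drainTimeout).compareTo(gracePeriod) <= 0;
    }

    public static void main(String[] args) {
        Duration wait = Duration.ofSeconds(35);   // Nacos cache window from the listener above
        Duration drain = Duration.ofSeconds(60);  // spring.lifecycle.timeout-per-shutdown-phase
        System.out.println(required(Duration.ZERO, wait, drain).getSeconds());        // 95
        System.out.println(fits(Duration.ofSeconds(30), Duration.ZERO, wait, drain)); // false
        System.out.println(fits(Duration.ofSeconds(120), Duration.ZERO, wait, drain)); // true
    }
}
```

With these numbers a 30 s grace period is too short: the pod would be killed partway through the deregistration sleep. Either raise terminationGracePeriodSeconds or shorten the waits.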
Other resources
Thread‑pool graceful shutdown
Two APIs: shutdown() stops accepting new tasks but lets already submitted tasks run to completion; shutdownNow() attempts to interrupt running tasks and returns the tasks that never started.
Neither call blocks, so follow either one with awaitTermination to wait, up to a timeout, for the pool to actually terminate.
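One common way to combine the two calls is the two-phase pattern described in the ExecutorService Javadoc: try an orderly shutdown first, and escalate to shutdownNow() only if the timeout expires. The class and method names here are illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, named for this example only.
final class PoolShutdown {

    /**
     * Phase 1: shutdown() and wait. Phase 2 (only on timeout): shutdownNow() and wait again.
     * Returns true if the pool terminated, false if tasks ignored the interrupt or we were interrupted.
     */
    static boolean shutdownAndAwait(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // reject new tasks, keep running submitted ones
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // interrupt running tasks, discard queued ones
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        }
    }
}
```

Tasks only stop in phase 2 if they respond to interruption, so long-running work should check Thread.interrupted() or use interruptible blocking calls.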
@PreDestroy
public void destroyThreadPool() {
    if (waitForTasksToCompleteOnShutdown) {
        executor.shutdown();
    } else {
        executor.shutdownNow();
    }
    try {
        executor.awaitTermination(30, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}
Spring‑managed ThreadPoolTaskExecutor can be configured similarly:
@Bean("taskExecutor")
public ThreadPoolTaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(10);
    executor.setMaxPoolSize(10);
    executor.setQueueCapacity(200);
    executor.setKeepAliveSeconds(1000);
    executor.setThreadNamePrefix("task-asyn");
    executor.setRejectedExecutionHandler(new ThreadPoolExecutor.AbortPolicy());
    executor.setWaitForTasksToCompleteOnShutdown(true);
    executor.setAwaitTerminationSeconds(60);
    return executor;
}
Message‑queue graceful shutdown
Spring‑managed MQ components (RabbitMQ, Kafka, etc.) already close connections and wait for in‑flight messages during the Spring shutdown sequence.
Scheduled task graceful shutdown
For @Scheduled tasks, assign them to a dedicated thread pool (as above) so the pool can be shut down gracefully. For external schedulers such as XXL‑JOB, use a gray‑release strategy: register half of the executors, wait for their tasks to finish, then switch the other half, avoiding interruption.
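For the dedicated-pool approach, the plain-JDK analogue is a ScheduledThreadPoolExecutor whose shutdown policies drop future runs while letting the current run finish. This sketch does not use Spring's scheduler classes, and the helper names are assumptions made for the example.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, named for this example only.
final class SchedulerShutdown {

    /** Creates a scheduler that discards periodic and delayed tasks once shutdown() is called. */
    static ScheduledThreadPoolExecutor newScheduler(int threads) {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(threads);
        // Do not start further runs of fixed-rate / fixed-delay tasks after shutdown().
        scheduler.setContinueExistingPeriodicTasksAfterShutdownPolicy(false);
        // Do not run one-shot tasks whose delay has not yet elapsed.
        scheduler.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
        return scheduler;
    }

    /** Stops scheduling, then waits up to the timeout for a run that is already executing. */
    static boolean stop(ScheduledThreadPoolExecutor scheduler, long timeout, TimeUnit unit) {
        scheduler.shutdown();
        try {
            return scheduler.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Without the second policy, a one-shot task scheduled far in the future would keep the pool alive until its delay elapsed, which defeats a bounded shutdown.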
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.