Zero‑Downtime Deployment with K8s and SpringBoot: Health Checks, Rolling Updates, Graceful Shutdown, Autoscaling, Prometheus Integration, and Config Separation

This article demonstrates how to achieve zero‑downtime releases for SpringBoot applications on Kubernetes by configuring readiness/liveness probes, rolling update strategies, graceful shutdown hooks, horizontal pod autoscaling, Prometheus monitoring, and externalized configuration via ConfigMaps.

Java Architect Essentials

Preface

K8s + SpringBoot enables zero‑downtime publishing through health checks, rolling updates, graceful shutdown, autoscaling, Prometheus monitoring, and configuration separation.

Configuration

Health Check

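Kubernetes probes can use exec, tcpSocket, or httpGet checks. Here the readiness and liveness probes use httpGet against the Actuator endpoints /actuator/health/readiness and /actuator/health/liveness, served on a dedicated management port (50000) so that probe traffic stays separate from application traffic.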

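pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

application.yaml:

management:
  server:
    port: 50000  # serve Actuator on an independent management port
  endpoint:
    health:
      probes:
        enabled: true  # expose /health/readiness and /health/liveness
  endpoints:
    web:
      base-path: /actuator  # base-path belongs under endpoints.web, not under exposure
      exposure:
        include: health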

Access URLs:

http://127.0.0.1:50000/actuator/health/readiness
http://127.0.0.1:50000/actuator/health/liveness
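
By default the readiness endpoint reflects only the application's internal readiness state. If readiness should also depend on a backing service, SpringBoot health groups can add further indicators to the probe. A minimal sketch, assuming the application has a datasource so that a `db` health indicator exists (that indicator is an assumption, not part of the setup above):

```yaml
management:
  endpoint:
    health:
      probes:
        enabled: true
      group:
        readiness:
          # readinessState is the built-in indicator; db is assumed to exist in this app
          include: readinessState,db
```

It is usually safer to leave the liveness group untouched: if an external dependency fails a liveness check, Kubernetes restarts pods that are not themselves broken.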

Rolling Update

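Use the RollingUpdate strategy in the Deployment spec so that pods are replaced incrementally. maxSurge caps how many pods may be created above the desired replica count during the update, and maxUnavailable caps how many may be unavailable; a maxUnavailable of 0, as in the combined manifest at the end, keeps full capacity throughout the rollout.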

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
spec:
  replicas: {REPLICAS}
  strategy:
    type: RollingUpdate
    rollingUpdate:
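      maxSurge: 1        # at most one extra pod above the desired count during the update
      maxUnavailable: 0  # never drop below the desired count (matches the combined manifest)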
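
A rolling update moves on as soon as a new pod reports ready, so a pod that passes its first readiness check and then fails can still cause errors. The Deployment spec has two optional fields that may help here. The sketch below uses illustrative values that are assumptions, not settings from the original article:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
spec:
  # A new pod must stay ready this long before it counts as available (assumed value)
  minReadySeconds: 10
  # Mark the rollout as failed if it makes no progress within this window (assumed value)
  progressDeadlineSeconds: 300
```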

Graceful Shutdown

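Enable SpringBoot graceful shutdown (server.shutdown: graceful) so that in-flight requests get up to timeout-per-shutdown-phase to complete, enable the /actuator/shutdown endpoint, and invoke it with curl from a preStop lifecycle hook. The shutdown endpoint must also be listed in management.endpoints.web.exposure.include to be reachable over HTTP.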

application.yaml:

spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
server:
  shutdown: graceful
management:
  endpoint:
    shutdown:
      enabled: true

Shutdown request:

curl -X POST 127.0.0.1:50000/actuator/shutdown

Deployment (preStop hook):

apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: {APP_NAME}
        lifecycle:
          preStop:
            exec:
              command: ["curl", "-XPOST", "127.0.0.1:50000/actuator/shutdown"]
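
The time spent in the preStop hook and in the application's shutdown phase both count against the pod's termination grace period, which defaults to 30 seconds. With a 30s shutdown phase timeout, the pod could be killed before draining finishes. A minimal sketch that raises the grace period; the value 45 is an assumption chosen only to exceed the 30s timeout:

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      # Assumed value: must be longer than timeout-per-shutdown-phase (30s)
      terminationGracePeriodSeconds: 45
      containers:
      - name: {APP_NAME}
        lifecycle:
          preStop:
            exec:
              command: ["curl", "-XPOST", "127.0.0.1:50000/actuator/shutdown"]
```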

Autoscaling

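Set resource requests and limits for the container and create a HorizontalPodAutoscaler that scales on CPU utilization. Utilization is measured as a percentage of the CPU request, so requests must be set for the target to be computable.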

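Container resources (in the Deployment):

resources:
  limits:
    cpu: 0.5
    memory: 1Gi
  requests:
    cpu: 0.15
    memory: 300Mi

HorizontalPodAutoscaler:

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2  # stable since Kubernetes 1.23; v2beta2 was removed in 1.26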
metadata:
  name: {APP_NAME}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {APP_NAME}
  minReplicas: {REPLICAS}
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
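
Scaling down too eagerly can remove pods while load is still fluctuating. The HorizontalPodAutoscaler spec has an optional behavior section for slowing scale-down. A sketch with illustrative values, which are assumptions and not from the original article:

```yaml
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2
metadata:
  name: {APP_NAME}
spec:
  behavior:
    scaleDown:
      # Wait for 5 minutes of consistently lower recommendations before scaling down
      stabilizationWindowSeconds: 300
      policies:
      # Remove at most one pod per minute (assumed policy)
      - type: Pods
        value: 1
        periodSeconds: 60
```

These fields are merged into the same spec as scaleTargetRef, minReplicas, maxReplicas, and metrics shown above.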

Prometheus Integration

Add the Micrometer Prometheus registry and expose the /actuator/prometheus and /actuator/metrics endpoints.

pom.xml:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

application.yaml:

management:
  metrics:
    tags:
      application: ${spring.application.name}  # tag every metric with the application name
  endpoints:
    web:
      base-path: /actuator
      exposure:
        include: metrics,prometheus  # when merging with the health settings, list health here too

Access URLs:

http://127.0.0.1:50000/actuator/metrics
http://127.0.0.1:50000/actuator/prometheus
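
The prometheus.io/scrape, prometheus.io/path, and prometheus.io/port pod annotations used in the combined manifest are a convention, not something Prometheus reads on its own. The Prometheus server needs a scrape job that maps them to targets. The sketch below follows the commonly used relabeling pattern and assumes a self-managed Prometheus with pod service discovery; the job name is an assumption:

```yaml
scrape_configs:
  - job_name: kubernetes-pods  # assumed job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use prometheus.io/path as the metrics path (/actuator/prometheus here)
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the target address to use prometheus.io/port (50000 here)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```

Clusters that run the Prometheus Operator typically use PodMonitor or ServiceMonitor resources instead of annotations.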

Configuration Separation

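Externalize configuration using a ConfigMap mounted into the container and activate profiles via environment variables. Mounting the ConfigMap at /app/config works because SpringBoot searches the config/ subdirectory of the working directory (/app in the Dockerfile) by default.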

# Generate a ConfigMap manifest from an external yaml file
kubectl create cm {APP_NAME} -n {NAMESPACE} \
  --from-file=application-test.yaml --dry-run=client -o yaml > configmap.yaml
kubectl apply -f configmap.yaml
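
Deployment (mount the ConfigMap and activate the profile):

apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: {APP_NAME}
        # env and volumeMounts belong to the container, not the pod spec
        env:
        - name: SPRING_PROFILES_ACTIVE
          value: test
        volumeMounts:
        - name: conf
          mountPath: "/app/config"
          readOnly: true
      volumes:
      - name: conf
        configMap:
          name: {APP_NAME}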
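
For orientation, a ConfigMap generated with --from-file stores the file under a key named after the file, and each key becomes a file in the mounted directory. A sketch of roughly what configmap.yaml might contain; the settings inside application-test.yaml are hypothetical placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: {APP_NAME}
data:
  # The key becomes the file name: /app/config/application-test.yaml
  application-test.yaml: |
    # Hypothetical contents; the real file holds the test-profile settings
    server:
      port: 8080
```

With SPRING_PROFILES_ACTIVE set to test, SpringBoot would pick this file up from /app/config as the profile-specific configuration.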

Summary Configuration

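The Dockerfile below installs curl for the preStop hook, and the manifest combines the rolling update strategy, probes, resources, Prometheus annotations, and autoscaling from the sections above. The preStop hook and ConfigMap mount shown earlier are not repeated in it and need to be merged in, along with the pom.xml dependencies and application.yaml settings.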

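Dockerfile:

FROM openjdk:8-jdk-alpine
ARG JAR_FILE
ARG WORK_PATH="/app"
ARG EXPOSE_PORT=8080
# ARG values exist only at build time; copy JAR_FILE into ENV so the ENTRYPOINT shell can expand it at runtime
ENV JAVA_OPTS="" JAR_FILE=${JAR_FILE}
# Set the container time zone
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' > /etc/timezone
# Switch to a package mirror and install curl for the preStop shutdown hook
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories && apk add --no-cache curl
COPY target/$JAR_FILE $WORK_PATH/
WORKDIR $WORK_PATH
EXPOSE $EXPOSE_PORT
# exec replaces the shell so the JVM runs as PID 1 and receives SIGTERM directly
ENTRYPOINT exec java $JAVA_OPTS -jar $JAR_FILE

Deployment and HorizontalPodAutoscaler:
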
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
spec:
  selector:
    matchLabels:
      app: {APP_NAME}
  replicas: {REPLICAS}
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: {APP_NAME}
      annotations:
        prometheus.io/port: "50000"
        prometheus.io/path: /actuator/prometheus
        prometheus.io/scrape: "true"
    spec:
      containers:
      - name: {APP_NAME}
        image: {IMAGE_URL}
        ports:
        - containerPort: {APP_PORT}
        - name: management-port
          containerPort: 50000
        readinessProbe:
          httpGet:
            path: /actuator/health/readiness
            port: management-port
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 9
        livenessProbe:
          httpGet:
            path: /actuator/health/liveness
            port: management-port
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 6
        resources:
          limits:
            cpu: 0.5
            memory: 1Gi
          requests:
            cpu: 0.1
            memory: 200Mi
        env:
        - name: TZ
          value: Asia/Shanghai
---
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2  # stable since Kubernetes 1.23; v2beta2 was removed in 1.26
metadata:
  name: {APP_NAME}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {APP_NAME}
  minReplicas: {REPLICAS}
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50