Mastering K8s Application Lifecycle: Health Checks, Graceful Shutdown, Metrics & Tracing
This article explains how developers and operators should prepare a Go‑based service for Kubernetes by implementing health‑check endpoints, graceful shutdown handling, metrics exposure, tracing integration, standardized logging, and operational best practices such as stateless design, high availability, self‑healing, and HTTPS configuration.
In the whole lifecycle of an application, development and operations are inseparable. When deploying to Kubernetes, both sides have responsibilities.
Development Side
From the development perspective, the application should provide the following capabilities:
Health check endpoint
Graceful shutdown
Metrics endpoint
Trace integration
Standardized log output
Define health check endpoint
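The health check endpoint is what the Kubernetes readiness and liveness probes call to decide whether a pod can receive traffic and whether its container is still alive. Without probes, Kubernetes only knows whether the container process is running, not whether the application inside it is working.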
Example implementation:
package router

import (
	"github.com/gin-gonic/gin"

	v1 "go-hello-world/app/http/controllers/v1"
)

// SetupRouter registers the application routes, including the health check.
func SetupRouter(router *gin.Engine) {
	ruc := new(v1.RootController)
	router.GET("/", ruc.Root)

	huc := new(v1.HealthController)
	router.GET("/health", huc.HealthCheck)
}

package v1

import (
	"net/http"

	"github.com/gin-gonic/gin"

	"go-hello-world/app/http/controllers"
	"go-hello-world/pkg/response"
)

type HealthController struct {
	controllers.BaseController
}

// HealthCheck always answers 200 OK; it is the target of the probes.
func (h *HealthController) HealthCheck(c *gin.Context) {
	response.WriteResponse(c, http.StatusOK, nil, gin.H{"result": "health check page", "status": "OK"})
}

Once the container starts, Kubernetes probes this endpoint periodically; any response with a status code from 200 to 399 counts as success. In real services the health check often also needs to verify dependencies such as Redis, MySQL, or a message queue.
The corresponding YAML snippet adds readinessProbe and livenessProbe:
readinessProbe:
  httpGet:
    path: /health
    port: http
  timeoutSeconds: 3
  initialDelaySeconds: 20
livenessProbe:
  httpGet:
    path: /health
    port: http
  timeoutSeconds: 3
  initialDelaySeconds: 30
Define graceful shutdown
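During a rolling update, the old pod must finish processing in‑flight requests before it terminates. Kubernetes sends SIGTERM to the container, waits for the termination grace period (30 seconds by default), and then sends SIGKILL. SIGKILL cannot be caught, so the application must handle SIGTERM and finish its work within the grace period.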
Example shutdown library:
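package shutdown

import (
	"context"
	"fmt"
	"net/http"
	"os"
	"os/signal"
	"time"
)

// Shutdown waits for a termination signal and then stops an HTTP server gracefully.
type Shutdown struct {
	ch      chan os.Signal
	timeout time.Duration
}

// New returns a Shutdown. t is a number of seconds: New(10) gives a
// 10-second timeout, because Start multiplies it by time.Second.
func New(t time.Duration) *Shutdown {
	// signal.Notify never blocks when sending, so the channel must be
	// buffered or a signal could be dropped.
	return &Shutdown{ch: make(chan os.Signal, 1), timeout: t}
}

// Add registers the signals that trigger the shutdown.
func (s *Shutdown) Add(signals ...os.Signal) { signal.Notify(s.ch, signals...) }

// Start blocks until a registered signal arrives, then drains the server.
func (s *Shutdown) Start(server *http.Server) {
	<-s.ch
	fmt.Println("start exit......")
	ctx, cancel := context.WithTimeout(context.Background(), s.timeout*time.Second)
	defer cancel()
	if err := server.Shutdown(ctx); err != nil {
		fmt.Println("Graceful exit failed. err:", err)
		return // do not report success after a failed shutdown
	}
	fmt.Println("Graceful exit success.")
}

The main program starts the HTTP server in a goroutine, registers SIGINT and SIGTERM with the shutdown handler, and then blocks in Start(server) until one of them arrives.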
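package main

import (
	"log"
	"net/http"
	"syscall"

	"github.com/gin-gonic/gin"

	"go-hello-world/pkg/shutdown"
	"go-hello-world/router"
)

func main() {
	r := gin.New()
	router.SetupRouter(r)
	server := &http.Server{Addr: ":8080", Handler: r}

	// Serve in the background so main can wait for a signal.
	go func() {
		if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("server.ListenAndServe err: %v", err)
		}
	}()

	quit := shutdown.New(10) // 10-second shutdown timeout
	quit.Add(syscall.SIGINT, syscall.SIGTERM)
	quit.Start(server)
}

Kubernetes also supports a preStop hook, which runs before SIGTERM is sent to the container. Typical uses are a short sleep that gives load balancers time to stop routing to the pod, or a call that deregisters the instance from a service registry such as Nacos. The hook's run time counts against the termination grace period, so with the default of 30 seconds a sleep of 30 seconds leaves the application no time to drain; raise terminationGracePeriodSeconds or shorten the sleep.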
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        - sleep 30
Define metrics endpoint
Metrics expose application statistics for Prometheus to scrape. The example uses the Prometheus Go client library to define a custom request counter and a request-duration histogram, both recorded by a wrapper around each handler.
package metrics

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

var (
	// HttpserverRequestTotal counts requests by method and endpoint.
	HttpserverRequestTotal = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "httpserver_request_total",
		Help: "The Total number of httpserver requests",
	}, []string{"method", "endpoint"})

	// HttpserverRequestDuration records request latency in seconds.
	HttpserverRequestDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Name:    "httpserver_request_duration_seconds",
		Help:    "httpserver request duration distribution",
		Buckets: []float64{0.1, 0.3, 0.5, 0.7, 0.9, 1},
	}, []string{"method", "endpoint"})
)

func init() {
	prometheus.MustRegister(HttpserverRequestTotal)
	prometheus.MustRegister(HttpserverRequestDuration)
}

// NewMetrics wraps a handler and records one count and one duration per request.
// Note: the raw URL path is used as a label, which can create many time
// series if paths contain IDs.
func NewMetrics(handler http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		handler(w, r)
		duration := time.Since(start)
		labels := prometheus.Labels{"method": r.Method, "endpoint": r.URL.Path}
		HttpserverRequestTotal.With(labels).Inc()
		HttpserverRequestDuration.With(labels).Observe(duration.Seconds())
	}
}

The pod template in the Deployment adds the conventional prometheus.io annotations. They are a convention rather than a built-in feature, so the endpoint is scraped automatically only if the Prometheus scrape configuration is set up to honor them.
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "metrics"
Define tracing
Tracing assigns a TraceID to each request so that the request can be followed end to end across services. Open‑source tracing systems include Jaeger, Zipkin, and SkyWalking. This article uses SkyWalking through the go2sky client and shows a minimal integration.
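Independent of any particular backend, the core idea of attaching a TraceID to each request can be sketched with the standard library alone. The header name `X-Trace-Id` and the helpers `withTrace`, `traceID`, and `call` are assumptions made for this illustration; real tracers such as SkyWalking define their own propagation headers and also record spans and timing.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

const traceHeader = "X-Trace-Id" // hypothetical header name

type traceKey struct{}

// newTraceID returns 16 random bytes encoded as 32 hex characters.
func newTraceID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// withTrace reuses the caller's trace ID when present, otherwise creates one,
// and exposes it through the request context and the response header.
func withTrace(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get(traceHeader)
		if id == "" {
			id = newTraceID()
		}
		w.Header().Set(traceHeader, id)
		ctx := context.WithValue(r.Context(), traceKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// traceID extracts the ID stored by withTrace, or "" if there is none.
func traceID(ctx context.Context) string {
	id, _ := ctx.Value(traceKey{}).(string)
	return id
}

// call runs one in-process request and returns the ID seen by the handler
// and the ID echoed in the response header.
func call(incoming string) (seen, echoed string) {
	h := withTrace(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen = traceID(r.Context())
	}))
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if incoming != "" {
		req.Header.Set(traceHeader, incoming)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return seen, rec.Header().Get(traceHeader)
}

func main() {
	seen, echoed := call("")
	fmt.Println(len(seen), seen == echoed)
}
```

The SkyWalking integration in this article achieves the equivalent through the go2sky gin middleware, which additionally reports the collected spans to the SkyWalking backend.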
package main

import (
	"log"
	"net/http"
	"syscall"
	"time"

	"github.com/SkyAPM/go2sky"
	v3 "github.com/SkyAPM/go2sky-plugins/gin/v3"
	"github.com/SkyAPM/go2sky/reporter"
	"github.com/gin-gonic/gin"
	"github.com/prometheus/client_golang/prometheus/promhttp"

	"go-hello-world/pkg/shutdown"
	"go-hello-world/router"
)

var SKYWALKING_ENABLED = false

func main() {
	r := gin.New()

	if SKYWALKING_ENABLED {
		rp, err := reporter.NewGRPCReporter("skywalking-oap:11800", reporter.WithCheckInterval(time.Second))
		if err != nil {
			// rp is nil here, so skip tracing instead of calling rp.Close().
			log.Printf("create gosky reporter failed. err: %s", err)
		} else {
			defer rp.Close()
			tracer, err := go2sky.NewTracer("go-hello-world", go2sky.WithReporter(rp))
			if err != nil {
				log.Printf("create tracer failed. err: %s", err)
			} else {
				r.Use(v3.Middleware(r, tracer))
			}
		}
	}

	router.SetupRouter(r)
	server := &http.Server{Addr: ":8080", Handler: r}

	// Serve Prometheus metrics on a separate port.
	go func() {
		http.Handle("/metrics", promhttp.Handler())
		if err := http.ListenAndServe(":9527", nil); err != nil {
			log.Printf("metrics port listen failed. err: %s", err)
		}
	}()

	go func() {
		if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("server.ListenAndServe err: %v", err)
		}
	}()

	quit := shutdown.New(10)
	quit.Add(syscall.SIGINT, syscall.SIGTERM)
	quit.Start(server)
}
Standardized logging
Write logs in a consistent format, preferably to stdout and stderr, so that they are easy to collect and search. In Kubernetes, avoid writing logs to files inside the container: the container filesystem is ephemeral, so those files are lost when the pod is recreated.
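As one possible way to get consistent output, the sketch below uses the standard `log/slog` package (available from Go 1.21) to emit one JSON object per line. This is an assumption about tooling, not what the original project necessarily uses; the `newLogger` and `logLine` helpers and the field names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"log/slog"
	"os"
)

// newLogger returns a JSON logger that writes one object per line to w.
func newLogger(w io.Writer) *slog.Logger {
	return slog.New(slog.NewJSONHandler(w, &slog.HandlerOptions{Level: slog.LevelInfo}))
}

// logLine logs one sample entry to an in-memory buffer and decodes it,
// which shows the fields a log collector would see.
func logLine() (map[string]any, error) {
	var buf bytes.Buffer
	newLogger(&buf).Info("request handled", "method", "GET", "path", "/health", "status", 200)
	var entry map[string]any
	err := json.Unmarshal(buf.Bytes(), &entry)
	return entry, err
}

func main() {
	// In the container, write to stdout and let the node's log agent collect it.
	logger := newLogger(os.Stdout)
	logger.Info("server starting", "addr", ":8080")
}
```

Each entry carries `time`, `level`, and `msg` keys plus the supplied attributes, so a collector can filter on fields instead of parsing free text.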
Operations Side
After development, operations deploy the application. To ensure stability, the following points should be considered:
Keep the application stateless
Ensure high availability
Provide graceful rollout capability
Support self‑healing
Expose HTTPS
Stateless design
Prefer stateless services; persist data in databases or external storage rather than inside the pod.
High availability
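Deploy multiple replicas, spread them across nodes with pod anti‑affinity, protect them with a PodDisruptionBudget, and set resource requests and limits. Equal requests and limits, as below, place the pod in the Guaranteed QoS class. A PodDisruptionBudget is a separate resource, not a field of the Deployment.
# Deployment (fragment)
spec:
  replicas: 2
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values: ["httpserver"]
              topologyKey: kubernetes.io/hostname
      containers:
        - name: httpserver   # container name assumed for illustration
          resources:
            limits:
              cpu: "1"
              memory: 2Gi
            requests:
              cpu: "1"
              memory: 2Gi
---
# PodDisruptionBudget (separate resource)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: httpserver   # name assumed for illustration
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: httpserver
Graceful rollout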
Kubernetes adds a pod to the Service's endpoints only after its readiness probe succeeds, so during a rolling update traffic reaches a new instance only once it has fully started.
Self‑healing
When a liveness probe fails repeatedly, the kubelet restarts the container; when a node fails, the Deployment's controller creates replacement pods on healthy nodes.
HTTPS access
Create a TLS secret and reference it in an Ingress resource to expose the service over HTTPS.
# Create TLS secret
kubectl create secret tls httpserver-tls-secret --cert=path/to/tls.cert --key=path/to/tls.key
# Ingress snippet
spec:
  tls:
    - hosts:
        - httpserver.coolops.cn
      secretName: httpserver-tls-secret
  rules:
    - host: httpserver.coolops.cn
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: httpserver
                port:
                  number: 8080
Conclusion
The article outlines essential development and operational practices for deploying a Go‑based HTTP service on Kubernetes, covering health checks, graceful shutdown, metrics, tracing, logging, stateless design, high availability, graceful rollout, self‑healing, and HTTPS exposure.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
