Xiao Lou's Tech Notes
Nov 28, 2022 · Backend Development

Re‑engineering a Scalable Service Health‑Check System for Cloud‑Native Ops

This article details the redesign of a service health‑check component, covering its original limitations, industry alternatives, the chosen centralized active checking approach, architectural modules, concurrency model, scaling mechanisms, gray‑release strategy, and performance optimizations for reliable distributed systems.

backend architecture · go concurrency · operational reliability
17 min read
Xiao Lou's Tech Notes
Jan 13, 2022 · Operations

Detecting and Recovering Unhealthy Nodes in Microservice Architectures

This article surveys service health‑checking techniques in microservice environments. It explains how consumers, providers, and service registries can identify unhealthy nodes through passive and active checks, heartbeat mechanisms, TCP connection monitoring, and registry‑side probing, and weighs the trade‑offs in reliability, timeliness, and resource consumption.

Heartbeat · node monitoring · service discovery
11 min read