TonyBai
Mar 20, 2026 · Cloud Native
When a Server Silently Crashes, How Long Can Your Cluster Survive? Inside the Heartbeat Failover Mechanism
The article explains how distributed systems detect silently dead nodes using heartbeat mechanisms—both push and pull models—covers trade‑offs between interval and timeout, introduces advanced detectors like Cassandra's Φ, gossip protocols, and quorum rules, and shows real‑world implementations in Kubernetes and etcd.
CassandraDistributed SystemsKubernetes
0 likes · 12 min read
