Tagged articles
15 articles
Open Source Tech Hub
Nov 13, 2025 · Fundamentals

Why Heartbeat Mechanisms Are Critical for Distributed System Reliability

This article explains how periodic heartbeat messages let distributed systems detect node failures, how to choose appropriate intervals and timeouts, how push and pull models compare, how more advanced detection approaches such as phi accrual and gossip work, and how these concepts are applied in real-world platforms such as Kubernetes, Cassandra, and etcd.

Distributed Systems · Failure Detection · Gossip Protocol
0 likes · 22 min read
Full-Stack Internet Architecture
Apr 27, 2025 · Databases

Understanding the Redis Cluster Bus and Node Communication

This article explains how Redis cluster nodes use a TCP-based gossip protocol and a dedicated cluster bus to discover each other, detect failures, synchronize configuration, and automatically redirect client commands to the correct node, illustrated with practical command‑line examples.

Cluster · Gossip Protocol · Node Communication
0 likes · 5 min read
Rare Earth Juejin Tech Community
Sep 21, 2024 · Databases

Root Cause Analysis of a Redis Cluster Slot‑Migration Failure and Gossip‑Protocol Inconsistencies

This article analyzes a Redis cluster outage caused by a slot‑migration bug where a node simultaneously migrated slots in and out, leading to conflicting config epochs, gossip‑protocol mismatches, and MOVED errors, and provides detailed troubleshooting steps and preventive measures.

Cluster · Config Epoch · Gossip Protocol
0 likes · 15 min read
ITPUB
Mar 8, 2023 · Databases

Mastering Redis Cluster: Deep Dive into Sharding, Failover, and Scaling

This article provides a comprehensive guide to Redis Cluster, covering its sharding mechanism, hash slot mapping, replication and automatic failover, client data location, slot reassignment, MOVED/ASK redirection, communication overhead, and practical tuning tips for large‑scale deployments.

Cluster · Gossip Protocol · Replication
0 likes · 20 min read
ITPUB
Feb 13, 2023 · Fundamentals

How a Bat-Borne Virus Explains the Gossip Protocol in Distributed Systems

Using a fictional coronavirus carried by a bat, the article illustrates the Gossip protocol’s mechanisms—direct mail, anti-entropy, and epidemic spread—to explain how distributed systems achieve eventual consistency, highlighting advantages, drawbacks, and practical considerations for storage components like Cassandra.

Anti-entropy · Distributed Systems · Gossip Protocol
0 likes · 10 min read
Full-Stack Internet Architecture
Apr 24, 2021 · Databases

Deep Dive into Redis Cluster Architecture and Principles

This article provides a comprehensive analysis of Redis Cluster, covering node and slot assignment, command execution, resharding, redirection, fault‑tolerance, gossip communication, scaling strategies, configuration limits, and practical code examples for building and operating a high‑availability sharded Redis deployment.

Cluster · Gossip Protocol · Failover
0 likes · 21 min read
Wukong Talks Architecture
Feb 24, 2021 · Fundamentals

Understanding the Gossip Protocol Through a Virus Analogy

The article uses a whimsical story of a coronavirus‑like virus transmitted from a bat to humans to illustrate the Gossip protocol, its three functions—direct mail, anti‑entropy, and epidemic spread—and discusses their advantages, drawbacks, and practical applications in achieving eventual consistency in distributed systems.

Anti-entropy · Distributed Systems · Gossip Protocol
0 likes · 10 min read
DataFunTalk
Dec 30, 2019 · Databases

Cassandra: Past, Present, and Future – History, Architecture, Features, and Use Cases

This article summarizes a Cassandra meetup presentation that traces the database's origins to Bigtable and Dynamo, outlines its key milestones, explains its peer‑to‑peer design and LSM‑based storage architecture, highlights current features, real‑world deployments, and performance advantages, and previews the upcoming 4.0 release and community projects.

Big Data · Gossip Protocol · LSM
0 likes · 14 min read
Java High-Performance Architecture
Dec 18, 2019 · Fundamentals

Understanding the CAP Theorem and Distributed Consistency: A Practical Guide

This article explains the CAP theorem and its trade-offs in distributed systems, compares consensus protocols such as ZAB and Raft, and discusses multi‑data‑center support, gossip protocols, watch mechanisms, multi‑language clients, DNS‑based service discovery, and health‑check strategies across tools such as ZooKeeper, Consul, and Eureka.

Distributed Systems · Gossip Protocol
0 likes · 7 min read
Programmer DD
Nov 1, 2018 · Operations

Mastering Consul Service Discovery: Theory, Docker Deployment & Real‑World Tips

This article explains why service discovery is essential for microservice architectures, dives into Consul’s internal mechanisms—including multi‑datacenter gossip, Raft consensus, and client‑server roles—then provides step‑by‑step Docker deployment, service registration, health‑check configuration, and practical discovery methods via HTTP API and DNS.

Consul · Docker · Gossip Protocol
0 likes · 20 min read
JD Retail Technology
Jun 22, 2018 · Operations

JDOS Operations Platform: Managing Million‑Scale Container Clusters at JD.com

This article describes JD.com's JDOS Operations Platform, which enables two operators to manage millions of Docker and Kubernetes containers across massive clusters, detailing its architecture, regression analysis of scale, gossip‑based inspection system, intelligent alert convergence, and performance improvements for ultra‑large‑scale environments.

Docker · Gossip Protocol · Kubernetes
0 likes · 11 min read
JD Tech
Jun 22, 2018 · Operations

JDOS Operations Platform: Managing Millions of Containers at JD.com

The article describes how JD.com built and operates the JDOS Operations Platform to manage a multi‑million‑container Docker and Kubernetes fleet. It covers the challenges of massive scale and the platform's architectural components: the configuration center, operation center, inspection system, gossip‑based communication, and an intelligent alerting system, which together enable efficient, automated, and reliable large‑scale container operations.

Container Management · Gossip Protocol · Kubernetes
0 likes · 12 min read
360 Zhihui Cloud Developer
Apr 24, 2018 · Databases

How Dynamo Achieves High‑Availability in Distributed Key‑Value Stores

This article explains Dynamo, the decentralized key‑value storage system, covering its design goals, consistent‑hashing partitioning with virtual nodes, replication strategies, quorum‑based consistency, conflict resolution with vector clocks, hinted handoff, Merkle‑tree synchronization, and gossip‑based failure detection.

Dynamo · Gossip Protocol · Replication
0 likes · 9 min read