Tagged articles
153 articles
Ops Community
May 9, 2026 · Operations

Achieve Seamless Nginx High Availability with Keepalived: A Practical Guide

This article walks through building a simple, cost‑effective high‑availability solution for Nginx using Keepalived’s VRRP‑based VIP failover, covering environment setup, configuration of master and backup nodes, health‑check scripts, testing procedures, troubleshooting tips, and rollback steps.

Linux · NGINX · failover
0 likes · 29 min read
MaGe Linux Operations
Nov 5, 2025 · Databases

Deploy Redis Sentinel for High Availability in 30 Minutes – Step‑by‑Step Guide

Learn how to set up Redis Sentinel for high‑availability caching, covering prerequisites, anti‑patterns, detailed configuration of master, replicas and Sentinel nodes, firewall rules, monitoring, failover testing, troubleshooting, performance tuning, backup, rollback and best practices—all achievable within a 30‑minute deployment.

Linux · Replication · failover
0 likes · 38 min read
MaGe Linux Operations
Nov 2, 2025 · Databases

Zero Data Loss MySQL Master‑Slave Replication Lag Diagnosis & GTID Failover

This comprehensive guide explains how to diagnose MySQL master‑slave replication lag, enable GTID mode, configure semi‑synchronous replication, optimize multi‑threaded replication, set up monitoring and alerting with Prometheus, and perform zero‑data‑loss failover using tools like Orchestrator and custom scripts.

Automation · GTID · Replication
0 likes · 23 min read
Tech Freedom Circle
Oct 16, 2025 · Databases

Redis Crash Interview: How to Recover a Failed Node and Estimate Data Loss

This article walks through a systematic emergency response for Redis outages, explains how Redis Cluster promotes a replica, quantifies the typical data‑loss window from hundreds of milliseconds to several seconds, and provides detailed persistence configurations (RDB, AOF, and hybrid) to minimise downtime and data loss.

AOF · Cluster · Persistence
0 likes · 35 min read
Ray's Galactic Tech
Sep 27, 2025 · Databases

Master PostgreSQL Streaming Replication: Step‑by‑Step Setup Guide

This comprehensive guide explains PostgreSQL streaming replication concepts, required environment, primary and standby configuration commands, verification queries, failover procedures, and production best‑practice recommendations, enabling you to build a reliable high‑availability database cluster.

Database Replication · PostgreSQL · Streaming Replication
0 likes · 7 min read
Raymond Ops
Sep 8, 2025 · Operations

How to Set Up DRBD and Keepalived for Real‑Time File Sync and Failover

This guide walks you through installing and configuring DRBD and keepalived on two Linux nodes to achieve real‑time block‑level file synchronization, automatic primary/secondary role switching, and high‑availability failover for services such as PostgreSQL, including troubleshooting common issues like split‑brain and busy mounts.

DRBD · Linux · failover
0 likes · 13 min read
MaGe Linux Operations
Sep 6, 2025 · Databases

How to Build a High‑Availability MySQL Master‑Slave Cluster and Automate Failover

This guide walks through the reasons for MySQL master‑slave replication, explains its core mechanisms, details step‑by‑step environment planning, configuration, data initialization, replication setup, monitoring, failover with MHA, read‑write splitting using ProxySQL, performance tuning, troubleshooting, and best‑practice recommendations for enterprise‑grade high availability.

Replication · failover · high availability
0 likes · 27 min read
Raymond Ops
Aug 11, 2025 · Operations

Mastering Redis Sentinel: Automatic Failover and High Availability Explained

This article provides a comprehensive guide to Redis Sentinel, covering its purpose, architecture, monitoring functions, discovery mechanisms, failover process, leader election, configuration options, and practical commands for achieving reliable high‑availability in Redis deployments.

Operations · failover · high availability
0 likes · 17 min read
Su San Talks Tech
Jul 7, 2025 · Operations

Mastering High Availability: Redundancy & Automatic Failover in Modern Internet Architecture

This article explains how to achieve high availability in internet systems by designing redundant components and automatic failover mechanisms across layers such as load balancers, reverse proxies, microservices, middleware, databases, and messaging, illustrating concepts with diagrams of architectures, clustering, leader election, and practical tools like keepalived, Zookeeper, Redis Sentinel, and Kafka.

Microservices · Operations · failover
0 likes · 19 min read
php Courses
May 26, 2025 · Backend Development

Implementing Load‑Balancer‑Like Auto‑Decision Logic in PHP Applications

This article explores how to embed load‑balancer concepts such as intelligent request distribution, health checks, automatic failover, and dynamic strategy adjustment directly into PHP applications using algorithms like weighted round‑robin, response‑time balancing, and circuit‑breaker patterns, providing code examples and practical deployment scenarios.

PHP · failover · health check
0 likes · 11 min read
Ma Wei Says
Apr 8, 2025 · Operations

Mastering High Availability: 4 Failover Patterns Explained

Understanding high‑availability architectures involves mastering replication and fail‑over, balancing RTO and RPO, and choosing among four patterns—Active‑Standby, Active‑Active, Cold Standby, and Hot Standby—each with distinct synchronization, load‑balancing, and cost considerations for reliable system design.

Active-Active · Replication · active standby
0 likes · 9 min read
Sanyou's Java Diary
Feb 20, 2025 · Databases

How Redis Sentinel Ensures Automatic Failover and High Availability

Redis Sentinel provides a robust high‑availability solution by monitoring master‑slave clusters, automatically detecting failures, electing leaders, and performing failover, while using quorum voting, Pub/Sub communication, and configuration provisioning to ensure seamless master promotion and client redirection without manual intervention.

database · failover · high availability
0 likes · 16 min read
MaGe Linux Operations
Jan 27, 2025 · Operations

Redis Sentinel Deep Dive: High‑Availability Architecture & Automatic Failover

This article explains Redis Sentinel’s role as the official high‑availability solution, detailing its monitoring, notification, automatic failover mechanisms, discovery processes, connection types, down‑state classifications, failover steps, leader election, master selection rules, and data consistency guarantees.

Operations · failover · high availability
0 likes · 18 min read
dbaplus Community
Jan 21, 2025 · Databases

How Bilibili Scaled Its Comment System with Multi‑Level Storage and Automatic Failover

Bilibili’s comment service, a critical component for user interaction, faces massive read‑write traffic that can overwhelm TiDB, so the team built a multi‑level storage architecture using Redis sorted‑sets for indexes and a custom Taishan KV store, adding automatic degradation, consistency mechanisms, and hedging policies to ensure high availability and performance.

Comment System · Data Consistency · failover
0 likes · 12 min read
IT Architects Alliance
Jan 7, 2025 · Cloud Computing

Elastic Architecture: Auto Scaling and Failover for Resilient Systems

The article explains how elastic architecture, through auto‑scaling and failover mechanisms, dynamically adjusts resources and ensures continuous service during traffic spikes and component failures, improving cost efficiency, reliability, and operational stability for modern cloud‑based applications.

Auto Scaling · Elastic Architecture · Operations
0 likes · 16 min read
Aikesheng Open Source Community
Jan 7, 2025 · Databases

Analysis of Redis Sentinel Failover Issue in Redis 7.4.0 and Resolution via Pub/Sub ACL Adjustment

This article investigates a Redis Sentinel failover anomaly in version 7.4.0 where the sentinel repeatedly elects a failed master, explains the underlying s_down/o_down states, examines network, configuration, and ACL settings, and resolves the issue by adjusting Pub/Sub permissions to allow proper failover.

ACL · database · failover
0 likes · 11 min read
Liangxu Linux
Oct 1, 2024 · Operations

10 Proven Practices to Prevent System Failures for Ops Teams

This guide outlines ten practical strategies—including rollback testing, safe handling of destructive commands, prompt customization, robust backup and verification, production environment discipline, thorough handover, proactive monitoring, cautious auto‑failover, meticulous execution, and simplicity—to help operations engineers dramatically reduce system outages and improve reliability.

Backup · Operations · best practices
0 likes · 17 min read
Open Source Linux
Sep 20, 2024 · Databases

Redis Master‑Slave Replication and Sentinel: How They Work and Scale

This article explains Redis master‑slave replication, synchronization steps, handling of network partitions, and how Sentinel provides automatic failover through monitoring, leader election, and notification, offering strategies to reduce master load and ensure high availability.

Master‑Slave · Replication · database
0 likes · 9 min read

Design and Implementation of MySQL High Availability Using Orchestrator and DBProxy

This article presents a comprehensive design and implementation for achieving MySQL high availability by replacing the single‑master architecture with Orchestrator‑driven automatic failover, integrating DBProxy for transparent routing, and addressing topology changes and data compensation to ensure continuous, reliable service.

DBProxy · Data Compensation · Database Replication
0 likes · 16 min read
ITPUB
Jun 15, 2024 · Databases

Resolving Oracle RAC VIP Failover and SCAN IP Load‑Balancing Issues

This article walks through real‑world Oracle RAC failures caused by misconfigured VIP failover and SCAN IP load‑balancing, explains how to diagnose the symptoms, provides correct TAF and listener settings, and highlights essential configuration tips to ensure reliable high‑availability operation.

Database Configuration · Oracle · RAC
0 likes · 9 min read
Mike Chen's Internet Architecture
Apr 11, 2024 · Databases

Mastering Redis Sentinel: Ensuring Automatic High Availability

This article explains Redis Sentinel’s role in providing monitoring, notifications, automatic failover, and configuration updates to achieve high availability, detailing its heartbeat mechanism, master‑down detection, leader election, failover selection criteria, and the trade‑offs of using this solution.

database · failover · high availability
0 likes · 6 min read
Architecture & Thinking
Apr 10, 2024 · Operations

How Redis Sentinel Ensures Automatic Failover and High Availability

Redis Sentinel provides automatic monitoring, fault detection, and failover for Redis master‑slave clusters, enabling high availability by electing a new master when the original fails, using sdown/odown states, quorum voting, and pub/sub communication to keep services running with minimal downtime.

failover · high availability · monitoring
0 likes · 11 min read
JavaEdge
Feb 23, 2024 · Databases

Inside Alibaba's Doris KV Store: Architecture, Routing & Failover Secrets

This article examines Alibaba's internal Doris KV storage system, detailing why large companies build proprietary data products, the project's kickoff criteria, the two‑layer architecture, virtual‑node routing, failover mechanisms, and cluster scaling strategies for massive KV workloads.

Database Architecture · KV Store · Routing Algorithm
0 likes · 18 min read
Bilibili Tech
Feb 20, 2024 · Backend Development

Investigation and Optimization of Unexpected AAAA DNS Requests in Go Applications

The article investigates why Go applications unexpectedly send AAAA DNS queries to a secondary nameserver, tracing the issue to the built‑in resolver’s handling of non‑recursive responses from a NetScaler proxy, and recommends using the cgo resolver, enabling recursion, or forcing IPv4 to eliminate the added latency.

DNS · Debugging · Go
0 likes · 14 min read
dbaplus Community
Jun 12, 2023 · Databases

How a Redis Memory Upgrade Triggered Data Loss: Sentinel Failover Lessons

A recent Redis deployment faced memory expansion, a master‑slave switch, and unexpected data loss when the new master entered read‑only mode, prompting a deep dive into sentinel behavior, maxmemory settings, and replica‑ignore‑maxmemory nuances to prevent similar failures.

Memory Upgrade · failover · high availability
0 likes · 12 min read
Top Architect
May 5, 2023 · Backend Development

Using Redis Sentinel for High Availability: Design and Implementation

This article introduces Redis Sentinel as the official high‑availability solution for Redis, explains its core functions, provides configuration examples, compares three ways to receive failover notifications (script, client subscription, and indirect service), and offers design recommendations for robust production deployments.

DevOps · failover · high-availability
0 likes · 10 min read
ITPUB
Mar 8, 2023 · Databases

Mastering Redis Cluster: Deep Dive into Sharding, Failover, and Scaling

This article provides a comprehensive guide to Redis Cluster, covering its sharding mechanism, hash slot mapping, replication and automatic failover, client data location, slot reassignment, MOVED/ASK redirection, communication overhead, and practical tuning tips for large‑scale deployments.

Cluster · Gossip Protocol · Replication
0 likes · 20 min read
Inke Technology
Dec 19, 2022 · Backend Development

How to Build a Highly Available, Stable, and Observable SMS Service

This article explains how to design a high‑availability SMS system by identifying stability bottlenecks, defining reliability goals, implementing failover strategies for Redis, MySQL and external services, establishing a comprehensive observability framework, and measuring key quality metrics to ensure 99.99% uptime.

Backend · Observability · SMS
0 likes · 11 min read
Aikesheng Open Source Community
Nov 24, 2022 · Databases

Understanding Orchestrator's RegroupReplicasGTID and Candidate Replica Selection in MySQL Failover

This article explains how Orchestrator selects a candidate replica during MySQL master failover, detailing the GetCandidateReplica and RegroupReplicasGTID functions, their sorting logic, promotion rules, GTID-based regrouping, and differences from MHA, while highlighting potential data loss issues and related bugs.

GTID · Orchestrator · Replication
0 likes · 22 min read
Aikesheng Open Source Community
Nov 17, 2022 · Databases

DeadMaster Recovery Process in Orchestrator

This article explains the complete DeadMaster recovery workflow of Orchestrator, detailing how the system selects the appropriate check‑and‑recover function, handles emergency grace periods, reads topology information, registers recovery attempts, validates promotion constraints, executes the actual failover, and runs post‑recovery hooks, with extensive Go code examples.

Go · Orchestrator · Recovery
0 likes · 18 min read
Aikesheng Open Source Community
Nov 7, 2022 · Databases

Orchestrator Failover Process Source Code Analysis – Simulating Faults and Understanding ContinuousDiscovery

This article walks through a simulated MySQL 3307 cluster failure, examines Orchestrator's source code to explain the ContinuousDiscovery loop, discovery queues, health ticks, caretaking tasks, raft coordination, topology snapshots, and the logic distinguishing UnreachableMaster from DeadMaster states.

ContinuousDiscovery · Database HA · Go
0 likes · 20 min read
Laravel Tech Community
May 30, 2022 · Backend Development

Highlights of Apache Pulsar 2.10.0 Release: New Features and Bug Fixes

The Apache Pulsar 2.10.0 release introduces automatic cluster failover, lazy‑loading producers, new TableView support, enhanced broker interceptors, enriched client authentication, Etcd metadata storage, and numerous bug fixes, offering developers and operators a more flexible and performant messaging platform.

Apache Pulsar · Broker · Messaging
0 likes · 7 min read
Architect's Alchemy Furnace
May 10, 2022 · Operations

How to Build Truly High‑Availability Systems: Redundancy, Failover, and Layered Architecture

High availability (HA) is essential for distributed systems, requiring redundancy and automatic failover across each architectural layer—from client to proxy, gateway, business logic, cache, and storage—to minimize downtime, achieve desired “nines” of uptime, and prevent cascading failures such as service snowballing.

Distributed Systems · System Architecture · failover
0 likes · 14 min read
Efficient Ops
Mar 6, 2022 · Operations

Mastering Redis Sentinel: Build High‑Availability Clusters Step‑by‑Step

This article explains Redis Sentinel’s role in achieving high availability, details its core functions, underlying Raft‑based algorithm, configuration parameters, practical setup steps, fault‑tolerance mechanisms, quorum and majority calculations, and demonstrates failover and recovery scenarios with real command‑line examples.

failover · high availability · redis
0 likes · 20 min read
dbaplus Community
Mar 1, 2022 · Databases

MHA Re-Edition: Modern MySQL HA with GTID Failover and Auto Switch

The MHA Re-Edition tool revives the discontinued MHA manager for MySQL, adding GTID‑based failover, password‑only SSH authentication, lightweight binaries, VIP migration, WeChat alerts, remote‑card reboot, and detailed configuration options, with step‑by‑step deployment instructions and sample app1.cnf parameters for high‑availability clusters.

GTID · MHA · database
0 likes · 11 min read
Aikesheng Open Source Community
Jan 5, 2022 · Databases

Understanding ProxySQL Configuration Tables for MySQL HA (Read/Write Splitting and Failover)

This article explains ProxySQL's built‑in databases, key configuration tables such as mysql_servers, mysql_users, mysql_replication_hostgroups, mysql_group_replication_hostgroups, and mysql_query_rules, and demonstrates how to set up read/write splitting and automatic failover for MySQL primary‑replica and group replication environments.

DatabaseProxy · HA · ProxySQL
0 likes · 14 min read
Aikesheng Open Source Community
Dec 22, 2021 · Databases

Configuring ProxySQL with MySQL Replication and Group Replication for Read/Write Splitting and Automatic Failover

This guide demonstrates how to deploy a ProxySQL instance alongside six MySQL servers (three for traditional replication and three for MySQL Group Replication), configure users, set up read/write splitting rules, and enable automatic failover for both replication topologies.

Database HA · Group Replication · MySQL replication
0 likes · 14 min read
IT Architects Alliance
Dec 11, 2021 · Databases

Mastering Redis Replication and Sentinel: Solving Failover Challenges

This article examines the limitations of Redis master‑slave replication, explains how Redis Sentinel addresses those issues with monitoring, notification, and automatic failover, and provides detailed configuration commands, discovery mechanisms, and step‑by‑step failover procedures for building a highly available Redis deployment.

Configuration · Replication · database
0 likes · 12 min read
Full-Stack Internet Architecture
Nov 12, 2021 · Databases

Implementing High‑Availability PostgreSQL with Keepalived: Architecture, Setup, and Failover Procedures

This article explains how to use Keepalived together with PostgreSQL to build a two‑node high‑availability cluster, covering Keepalived's VRRP mechanism, host planning, installation steps, asynchronous master‑slave replication configuration, monitoring scripts, and detailed failover drills.

Database Replication · PostgreSQL · VRRP
0 likes · 20 min read
IT Architects Alliance
Oct 25, 2021 · Databases

Designing a High‑Availability Redis Service with Sentinel

This article explains how to build a highly available Redis service using Redis Sentinel, discusses common failure scenarios, compares several architectural options from a single instance to a three‑node Sentinel setup, and provides practical tips such as using virtual IPs for seamless client access.

architecture · database · failover
0 likes · 11 min read
Ops Development Stories
Sep 17, 2021 · Operations

Master Keepalived: Build Reliable Linux Load‑Balancing and HA

This guide explains Keepalived’s role in Linux load‑balancing and high‑availability, covering its VRRP‑based architecture, core modules, layered operation, configuration syntax, practical deployment with Nginx, common split‑brain issues, and advanced settings such as nopreempt and multicast conflict resolution.

HA · VRRP · failover
0 likes · 21 min read
Liangxu Linux
Aug 22, 2021 · Operations

Build Nginx High Availability with Keepalived on Linux

This guide explains how to achieve high availability for Nginx by deploying a dual‑machine keepalived setup, covering the concepts of HA, VRRP, configuration of keepalived on master and backup nodes, a health‑check script, and step‑by‑step commands to test automatic failover.

Linux · VRRP · failover
0 likes · 9 min read
IT Architects Alliance
Jun 20, 2021 · Databases

Master‑Slave Replication Pitfalls and Deep Dive into Redis Sentinel

This article examines the limitations of Redis master‑slave replication, such as manual failover and single‑node bottlenecks, and provides an in‑depth exploration of Redis Sentinel’s architecture, configuration parameters, detection mechanisms, automatic failover process, and best‑practice recommendations for achieving high availability.

Replication · database · failover
0 likes · 11 min read
Liangxu Linux
May 27, 2021 · Operations

How I Built an Automated Redis Sentinel to Seamlessly Handle Failover

A sysadmin narrates how he monitors four Redis nodes, detects master failure with PING, promotes a slave using SLAVEOF, reconfigures the remaining replicas, and ultimately automates the entire process with a custom Sentinel program and a multi‑node Sentinel cluster for high availability.

Automation · C · Operations
0 likes · 11 min read
ITPUB
May 19, 2021 · Databases

Mastering SQL Server AlwaysOn: Enterprise‑Ready High Availability Architecture

This article explains SQL Server's evolution from legacy high‑availability solutions to the modern AlwaysOn architecture, detailing its data‑synchronization process, synchronous and asynchronous commit modes, failover scenarios, and practical deployment recommendations for enterprises handling both moderate and terabyte‑scale workloads.

AlwaysOn · Database Replication · SQL Server
0 likes · 8 min read
Full-Stack Internet Architecture
May 13, 2021 · Databases

Database High‑Availability Architectures: Master‑Slave, Master‑Master, and Automatic Failover

This article explains common database high‑availability designs—including master‑slave, master‑master, and automatic failover architectures—their topologies, advantages, disadvantages, and practical considerations such as replication lag, manual intervention, and data consistency challenges.

Master‑Slave · Replication · database
0 likes · 7 min read
macrozheng
May 6, 2021 · Operations

How I Built an Automated Redis Sentinel System to Handle Failover

An operations engineer narrates how he monitors a four‑node Redis cluster, detects master failure with continuous PINGs, promotes a slave to master, reconfigures replicas, and automates the entire process with a sentinel program and a sentinel cluster for high availability.

Automation · failover · monitoring
0 likes · 11 min read
Full-Stack Internet Architecture
Apr 24, 2021 · Databases

Deep Dive into Redis Cluster Architecture and Principles

This article provides a comprehensive analysis of Redis Cluster, covering node and slot assignment, command execution, resharding, redirection, fault‑tolerance, gossip communication, scaling strategies, configuration limits, and practical code examples for building and operating a high‑availability sharded Redis deployment.

Cluster · Gossip Protocol · failover
0 likes · 21 min read
vivo Internet Technology
Apr 21, 2021 · Operations

System Health Check: Principles and Implementation

System health checks, akin to medical exams, are vital for modern IT infrastructure, using active and passive monitoring, failover strategies, and tools like Spring Boot Actuator to detect hardware, network, load, or software issues, prevent single points of failure, and ensure continuous high‑availability service operation.

Network Reliability · RocketMQ · Spring Boot Actuator
0 likes · 12 min read
Top Architect
Apr 12, 2021 · Databases

Designing a High‑Availability Redis Service with Sentinel

This article explains how to build a highly available Redis service by analyzing common failure scenarios, evaluating single‑instance, master‑slave with one or multiple Sentinel processes, and ultimately recommending a three‑Sentinel architecture combined with a virtual IP for seamless client usage.

Master‑Slave · failover · high availability
0 likes · 11 min read
Full-Stack Internet Architecture
Apr 2, 2021 · Operations

Understanding Redis Sentinel: High‑Availability Mechanism and Automatic Failover

This article explains how Redis Sentinel provides high‑availability for Redis by continuously monitoring master and replica nodes, detecting failures through subjective and objective down states, electing a new master via quorum‑based voting, and notifying clients of the failover using Pub/Sub events.

Replication · failover · high availability
0 likes · 19 min read
DataFunTalk
Mar 21, 2021 · Big Data

Single‑Point Recovery and Regional Checkpoint in Flink: Design, Implementation, and Optimizations

This article presents ByteDance's recent Flink enhancements, detailing a single‑point recovery mechanism for the network layer and a regional checkpoint strategy that together improve failover latency, reduce output loss, and enable scalable, high‑throughput stream processing for large‑scale real‑time recommendation workloads.

Big Data · Checkpoint · Flink
0 likes · 12 min read
21CTO
Feb 1, 2021 · Databases

Mastering Redis Cluster: Step‑by‑Step Setup, Scaling, and Failover Guide

This tutorial walks through building a Redis Cluster on Redis 6.0+, covering node startup, handshaking, slot assignment, master‑slave replication, command routing, failover handling, and practical scaling operations such as adding, rebalancing, and removing nodes using redis‑cli commands.

CLI · Cluster · failover
0 likes · 22 min read
Top Architect
Dec 9, 2020 · Operations

Designing High Availability for Redis Using Sentinel

This article explains how Redis Sentinel provides high‑availability for Redis clusters by monitoring masters and slaves, automatically failing over to a new master, and offering three methods for receiving failover notifications, while recommending an indirect‑service approach for scalable integration.

Configuration · Operations · failover
0 likes · 7 min read
macrozheng
Nov 20, 2020 · Operations

How Redis Achieves High Availability: A Story of Replication and Failover

This article narrates how Redis, personified as a character, synchronizes data between master and slave nodes through command propagation, adopts improved sync strategies, and uses Sentinel's INFO and PING checks to detect failures and automatically trigger failover, illustrating a practical high‑availability cache service.

Replication · failover · redis
0 likes · 4 min read
Aotu Lab
Nov 16, 2020 · Databases

Mastering MongoDB Replica Sets: Setup, Configuration, and Read/Write Strategies

This guide explains MongoDB replica set fundamentals, walks through local deployment of a primary‑secondary‑secondary cluster, demonstrates automatic failover, and details write concern and read preference options—including member attributes, connection string parameters, and shell commands—for reliable high‑availability data handling.

Database Configuration · MongoDB · Read Preference
0 likes · 17 min read
Aikesheng Open Source Community
Oct 30, 2020 · Databases

Configuring MySQL MGR with Asynchronous Replication Automatic Failover for Multi‑Site Disaster Recovery

This article explains how MySQL Group Replication (MGR) can provide zero‑RPO high‑availability within a city‑scale data center, why it needs asynchronous replication for WAN‑scale disaster recovery, and walks through a step‑by‑step setup—including code examples—for automatic failover of asynchronous replication channels.

Asynchronous Replication · MGR · database high availability
0 likes · 6 min read
Top Architect
Oct 22, 2020 · Databases

Designing High Availability for Redis with Sentinel

This article explains how to use Redis Sentinel to achieve high availability, covering its core functions, configuration steps, three methods of receiving failover notifications, and a recommended overall design with diagrams and code examples.

Configuration · database · failover
0 likes · 9 min read
Programmer DD
Sep 27, 2020 · Databases

How Redis Sentinel Guarantees Automatic Failover and High Availability

This article explains how Redis Sentinel provides automatic failover and high availability by using master‑slave replication, multiple Sentinel nodes, consensus‑based leader election, and client notification mechanisms to ensure continuous service despite node failures.

Replication · database · failover
0 likes · 11 min read
Aikesheng Open Source Community
Aug 21, 2020 · Databases

MySQL 8.0.20 Group Replication Overview and Practical Guide

This article introduces MySQL 8.0.20 Group Replication, covering single‑master and multi‑master modes, monitoring, failover procedures, abnormal recovery, flow control, performance testing, encountered issues, and limitations, and provides a downloadable PDF with detailed documentation hosted on Baidu Cloud.

Baidu Cloud · Group Replication · Performance Testing
0 likes · 1 min read