Tagged articles

High Availability

1447 articles · Page 2 of 15
dbaplus Community
Oct 5, 2025 · Cloud Native

Binary Deployment vs kubeadm: Which Kubernetes Setup Fits Your Enterprise?

This article compares manual binary deployment and kubeadm‑based installation of Kubernetes, covering core architectural differences, high‑availability designs, upgrade procedures, security models, enterprise scenario‑driven selection criteria, practical implementation steps, and concluding recommendations for choosing the most suitable approach.

Enterprise · High Availability · Upgrade
0 likes · 14 min read
Architecture Breakthrough
Sep 28, 2025 · Operations

How to Build an Organizational High‑Availability Mechanism for Banking IT Production Issues

This article outlines a comprehensive, step‑by‑step framework for establishing a high‑availability system in large‑scale banking IT, covering goal definition, logical architecture, service classification, key activity identification, capability upgrades, monitoring, emergency‑response asset creation, technical debt tracking, and periodic post‑mortem redesign.

High Availability · Monitoring · Operations
0 likes · 10 min read
Ray's Galactic Tech
Sep 27, 2025 · Databases

Master PostgreSQL Streaming Replication: Step‑by‑Step Setup Guide

This comprehensive guide explains PostgreSQL streaming replication concepts, required environment, primary and standby configuration commands, verification queries, failover procedures, and production best‑practice recommendations, enabling you to build a reliable high‑availability database cluster.

High Availability · PostgreSQL · Streaming Replication
0 likes · 7 min read
JD Tech
Sep 26, 2025 · Operations

Avoiding High‑Availability Pitfalls: Real‑World JD Lessons and Solutions

This article examines common high‑availability challenges across applications, databases, caches, message queues, containers, and GC, presenting real JD engineering cases, root‑cause analyses, and practical mitigation strategies to help engineers design more resilient systems.

High Availability · Message Queue · Redis
0 likes · 37 min read
Wukong Talks Architecture
Sep 24, 2025 · Databases

How Meiyou Scaled Overseas Messaging with TiDB Architecture

Meiyou, a leading women‑health platform, migrated its overseas messaging system and other core services from MySQL to TiDB, detailing the selection process, performance testing, deployment configurations, and the resulting gains in scalability, latency, high availability, and reduced operational costs.

High Availability · TiDB · database migration
0 likes · 12 min read
Raymond Ops
Sep 22, 2025 · Databases

Master‑Slave, Sentinel, and Sharding: Complete Guide to Redis Cluster Architectures

This article explains Redis’s three clustering options—master‑slave replication, Sentinel high‑availability, and sharding—detailing their architectures, setup steps, synchronization mechanisms, advantages, drawbacks, and common interview questions, helping readers choose and implement the right solution for high‑performance, scalable data storage.

High Availability · Redis · Sentinel
0 likes · 18 min read
MaGe Linux Operations
Sep 22, 2025 · Databases

Redis Ops Survival Guide: From Data Loss Nightmares to Mastering High‑Availability

This comprehensive guide walks you through real‑world Redis failure stories, explains why Redis is a critical backbone for modern applications, and provides step‑by‑step high‑availability designs, troubleshooting mind maps, monitoring setups, security hardening, automation scripts, cloud‑native deployments, and future‑proofing tips for engineers.

High Availability · Performance Tuning · Redis
0 likes · 35 min read
Ops Community
Sep 19, 2025 · Operations

From Midnight Outage to Zero Downtime: Mastering NFS High‑Availability

This article recounts a critical NFS failure that caused major losses, then walks through practical high‑availability designs, including Keepalived + DRBD, GlusterFS migration, and cloud‑native CSI storage, while sharing real‑world pitfalls, monitoring strategies, and forward‑looking recommendations for resilient file‑system operations.

Distributed File System · High Availability · Monitoring
0 likes · 12 min read
Tech Freedom Circle
Sep 19, 2025 · Interview Experience

Designing a Rock‑Solid High‑Availability Solution for Unreliable Third‑Party Services

When third‑party services frequently fail, this article walks through a systematic high‑availability design—including an ACL anti‑corruption layer, strategy‑pattern master‑slave routing, precise rate limiting, circuit‑breaker fallback, full observability, async degradation, and mock testing—to keep external dependencies as stable as a mountain.

ACL · High Availability · Strategy Pattern
0 likes · 24 min read
Ops Community
Sep 17, 2025 · Operations

Mastering System Fault Tolerance: From Theory to Production‑Ready High‑Availability

This comprehensive guide explores the philosophy, core patterns, and practical techniques for designing fault‑tolerant, highly available systems, covering circuit breakers, retries, rate limiting, monitoring, cloud‑native deployment, and real‑world case studies to help engineers build resilient production architectures.

Cloud Native · High Availability · circuit breaker
0 likes · 24 min read
Raymond Ops
Sep 16, 2025 · Cloud Native

How to Build a Secure High‑Availability Etcd Cluster on Linux

This guide walks through installing etcd, configuring a three‑node high‑availability cluster with TLS certificates, setting up host files, disabling SELinux and firewalld, creating a Certificate Authority using cfssl, generating node certificates, distributing them, and finally deploying and verifying the cluster on Linux systems.

Certificate · Cloud Native · Etcd
0 likes · 19 min read
Raymond Ops
Sep 13, 2025 · Operations

How to Build a High‑Availability RabbitMQ Cluster on CentOS with Docker

This guide walks through the full process of analyzing requirements, selecting self‑hosted servers, preparing CentOS nodes, installing Docker and Docker‑Compose, configuring RabbitMQ, and deploying a three‑node high‑availability RabbitMQ cluster with detailed commands and configuration files.

Docker · Docker Compose · High Availability
0 likes · 12 min read
Mike Chen's Internet Architecture
Sep 11, 2025 · Operations

Mastering Load Balancing: Single, Dual, and Multi‑Layer Architectures Explained

This article explains the fundamentals of load balancing, describing single‑layer, dual‑layer, and multi‑layer architectures, their advantages, disadvantages, and suitable scenarios, helping readers choose the right design based on traffic volume, availability, security, topology, budget, and operational capabilities.

High Availability · Operations · load balancing
0 likes · 6 min read
Aikesheng Open Source Community
Sep 10, 2025 · Databases

Master SQL Server Operations: From Installation to High‑Availability

The Aikesheng Open Source Community announces a giveaway of the technical book “SQL Server Operations Guide” by veteran DBA Lin Yonghua, outlines its four‑part content on installation, performance tuning, security, multimodal data, and high‑availability architecture, and invites beginners, developers, and educators to participate.

Book Giveaway · Database Administration · High Availability
0 likes · 12 min read
MaGe Linux Operations
Sep 8, 2025 · Big Data

Build Enterprise‑Grade HDFS HA and Optimize YARN Scheduling from Scratch

This comprehensive guide walks you through constructing a fault‑tolerant HDFS high‑availability architecture, configuring dual NameNodes with ZooKeeper and JournalNode clusters, fine‑tuning YARN resource schedulers, implementing monitoring, automated failover testing, and performance optimization, all backed by real‑world production experiences and code examples.

Big Data Operations · HDFS · High Availability
0 likes · 24 min read
MaGe Linux Operations
Sep 6, 2025 · Databases

How to Build a High‑Availability MySQL Master‑Slave Cluster and Automate Failover

This guide walks through the reasons for MySQL master‑slave replication, explains its core mechanisms, details step‑by‑step environment planning, configuration, data initialization, replication setup, monitoring, failover with MHA, read‑write splitting using ProxySQL, performance tuning, troubleshooting, and best‑practice recommendations for enterprise‑grade high availability.

High Availability · Performance Tuning · failover
0 likes · 27 min read
Raymond Ops
Sep 5, 2025 · Databases

Why Redis Needs a Cluster: Step‑by‑Step Setup, Configuration & Best Practices

This guide explains the need for Redis clustering to achieve high availability, walks through Redis 3.0's decentralized cluster configuration, shows how to modify redis.conf, start multiple nodes, create the cluster, use hash slots, handle failures, and connect via Java Jedis, highlighting both advantages and limitations.

Configuration · High Availability · Java
0 likes · 13 min read
JD Tech Talk
Sep 4, 2025 · Operations

Avoid Common High‑Availability Pitfalls: Real‑World JD Practices and Solutions

This article analyzes the multi‑dimensional challenges of building high‑availability systems—covering applications, databases, caches, message queues, containers, GC, and more—by sharing real JD engineering scenarios, common failure patterns, and concrete mitigation strategies to help engineers design more resilient services.

High Availability · backend · distributed systems
0 likes · 36 min read
JD Cloud Developers
Sep 4, 2025 · Operations

Mastering High‑Availability: JD Real‑World Pitfalls & Fixes for Apps, DBs, Cache & MQ

This article shares JD's practical high‑availability architecture lessons, detailing common pitfalls across applications, databases, caches, RPC frameworks, containers, data centers, GC, and message queues, and provides concrete troubleshooting steps and optimization techniques to help engineers design more resilient, fault‑tolerant systems.

High Availability · System Design · backend
0 likes · 36 min read
JD Retail Technology
Sep 4, 2025 · Operations

Mastering High Availability: Real-World Pitfalls and Solutions from JD's Production Systems

This article walks through the challenges of building high‑availability systems—covering applications, databases, caches, message queues, containers, GC, and more—using JD’s production experiences to highlight common pitfalls, root‑cause analyses, and practical mitigation strategies for engineers seeking resilient architecture.

Cache · High Availability · JDK
0 likes · 37 min read
Raymond Ops
Sep 1, 2025 · Operations

Mastering Keepalived: A Complete Guide to VRRP‑Based High Availability with LVS

This tutorial explains how Keepalived provides targeted high‑availability for LVS clusters by implementing VRRP, details its architecture, walks through installation, configuration of VRRP and virtual servers, shows health‑check scripts, and demonstrates testing of fail‑over and load‑balancing behavior.

High Availability · IPVS · Keepalived
0 likes · 16 min read
360 Zhihui Cloud Developer
Aug 28, 2025 · Cloud Computing

How VPC Private DNS Powers Secure, Scalable Cloud Networks

VPC private DNS provides an isolated, internal name resolution service for cloud resources, enabling secure, efficient communication, private domain management, recursive queries, and seamless integration with public DNS, while offering advantages such as enhanced security, flexible architecture, simplified operations, high availability, and support for hybrid cloud scenarios.

Cloud Networking · High Availability · Hybrid Cloud
0 likes · 12 min read
Xiaohongshu Tech REDtech
Aug 27, 2025 · Databases

How RedHub Revolutionizes Database Access for Billion‑User Scale

RedHub is a next‑generation database proxy built by Xiaohongshu that unifies fragmented access methods, leverages PolarDB‑X for distributed SQL execution, and delivers high‑performance, highly available, and easily observable database connectivity, enabling seamless migration and advanced features for massive‑scale workloads.

Database Proxy · Distributed SQL · High Availability
0 likes · 15 min read
Ops Community
Aug 26, 2025 · Databases

5 Redis High‑Availability Architectures – Why Most Fail and the Hidden Solution

This article examines why single‑node Redis is a reliability nightmare, then rigorously evaluates five high‑availability architectures—including Sentinel, Redis Cluster, Codis, Redis Enterprise, and cloud‑native services—detailing their scenarios, pros, cons, performance metrics, deployment scripts, monitoring setups, and a decision‑making guide to help you choose the optimal solution.

High Availability · Sentinel · cluster
0 likes · 14 min read
Tech Freedom Circle
Aug 24, 2025 · Operations

How a Misconfigured Nacos Cluster Cost $170 Million: A Deep P0 Incident Postmortem

A leading financial platform suffered a six‑hour outage and a $170 million loss when its Nacos service‑registry cluster entered a split‑brain state during a network partition, exposing flaws in its AP‑mode deployment, monitoring gaps, and cascading failures; the problems were later resolved through Raft migration, a multi‑active architecture, and client‑side resilience.

High Availability · Microservices · Operations
0 likes · 32 min read
Ops Community
Aug 20, 2025 · Databases

How MySQL Master‑Slave Replication and Read‑Write Splitting Turn a Single Server into a High‑Availability Architecture

This article walks through why a single MySQL instance often fails under load, explains the fundamentals of asynchronous master‑slave replication and read‑write splitting, provides step‑by‑step configuration scripts, highlights common pitfalls with solutions, and shows advanced optimization and monitoring techniques for building a scalable, high‑availability MySQL architecture.

High Availability · ProxySQL · Read‑Write Splitting
0 likes · 16 min read
Alibaba Cloud Infrastructure
Aug 20, 2025 · Cloud Computing

How Alibaba Cloud Achieves Rock‑Solid IaaS Stability: Design Principles, Metrics, and Engineering Practices

This article explains Alibaba Cloud's comprehensive approach to IaaS stability, covering shared responsibility with customers, availability metrics, design principles, compute, storage, and network engineering practices that together deliver rock‑solid reliability for millions of workloads.

High Availability · IaaS · System Design
0 likes · 56 min read
Alibaba Cloud Native
Aug 19, 2025 · Artificial Intelligence

Boost Dify AI App Performance with Higress AI Gateway: A Full-Scale High‑Availability Guide

This guide explains why Dify’s system components and model services become performance bottlenecks at scale, and how integrating the Higress AI gateway can provide protocol standardization, observability, security, and stability features to achieve full‑stack high availability for AI applications.

AI Gateway · Cloud Native · Dify
0 likes · 16 min read
MaGe Linux Operations
Aug 14, 2025 · Backend Development

Designing Enterprise‑Grade RabbitMQ High‑Availability: Architecture & Best Practices

This article explores why high availability is critical for RabbitMQ in micro‑service environments, presents a full HA architecture diagram, compares cluster modes, details mirror‑queue and quorum‑queue configurations, walks through production‑grade setup steps, performance tuning, monitoring, network‑partition handling, failover procedures, and shares practical lessons learned.

High Availability · RabbitMQ · cluster
0 likes · 14 min read
Raymond Ops
Aug 11, 2025 · Operations

Mastering Redis Sentinel: Automatic Failover and High Availability Explained

This article provides a comprehensive guide to Redis Sentinel, covering its purpose, architecture, monitoring functions, discovery mechanisms, failover process, leader election, configuration options, and practical commands for achieving reliable high‑availability in Redis deployments.

High Availability · Operations · Redis
0 likes · 17 min read
DevOps Operations Practice
Aug 7, 2025 · Operations

Mastering Operations: Tools, Processes, and Architecture for Top‑Notch SRE

This article outlines how proactive monitoring, automation, disciplined processes, robust architecture, and chaos engineering empower operations engineers to prevent failures, manage changes, ensure reliable backups, and build self‑healing systems that balance stability, innovation, cost, and human decision‑making.

Automation · Change Management · High Availability
0 likes · 5 min read
StarRocks
Aug 6, 2025 · Databases

How Qunar Migrated to StarRocks: Architecture, Performance Gains & Best Practices

This article details Qunar's transition to StarRocks as a unified OLAP engine, covering the business background, engine evaluation, architecture redesign, observability, high‑availability strategies, query‑performance optimizations, real‑world application cases, community contributions, and future plans.

Data Platform · High Availability · OLAP
0 likes · 21 min read
Tech Freedom Circle
Aug 4, 2025 · Operations

How Do Projects Achieve High Availability Without Multi‑Site Active‑Active? – A Meituan Interview Question

The article analyzes high‑availability concepts, from single‑machine risks to multi‑site active‑active architectures, compares cold and hot backup strategies, discusses network latency challenges, and presents Ele.me’s cell‑based, sharding‑driven multi‑region solution with concrete examples, tables, and code snippets.

Data Replication · Disaster Recovery · High Availability
0 likes · 28 min read
MaGe Linux Operations
Aug 3, 2025 · Operations

Avoid 3 Hidden Nginx+Keepalived HA Pitfalls That 90% of Ops Encounter

This article reveals three hard‑to‑detect pitfalls in Nginx + Keepalived high‑availability setups—split‑brain caused by network partitions, inadequate health‑check scripts, and unsafe configuration‑sync timing—provides real‑world incident examples, and offers complete, battle‑tested solutions with ready‑to‑use scripts.

Configuration Sync · High Availability · Keepalived
0 likes · 16 min read
360 Zhihui Cloud Developer
Jul 30, 2025 · Databases

Seamless Multi-DataCenter Database Migration: Strategies and Domain Scheduling

Learn how to execute a zero‑downtime, risk‑controlled database migration across data centers using pre‑expansion, cross‑data‑center master switchover, intelligent domain scheduling, and step‑by‑step operational guides, including VIP handling, global vs. zone‑specific domains, and post‑migration validation, to ensure continuous service and optimal resource elasticity.

Domain Scheduling · High Availability · Zero Downtime
0 likes · 13 min read
MaGe Linux Operations
Jul 27, 2025 · Databases

Master MySQL Performance Tuning & Troubleshooting on Linux: A Complete Guide

This comprehensive guide walks you through why MySQL performance matters, how to benchmark and establish baselines, apply Linux system and MySQL configuration optimizations, fine‑tune SQL queries, diagnose common failures, set up robust monitoring, and implement high‑availability architectures for production environments.

High Availability · Performance Tuning · database optimization
0 likes · 18 min read
MaGe Linux Operations
Jul 26, 2025 · Operations

How to Build a High‑Availability Prometheus Monitoring System: Pitfalls & Performance Tuning

This article walks you through building a production‑grade, highly available Prometheus monitoring system, covering architecture design with federation and sharding, common pitfalls such as memory bloat, query latency and storage growth, plus practical tuning, deployment, alerting and advanced optimization techniques.

High Availability · Performance Tuning · kubernetes
0 likes · 10 min read
MaGe Linux Operations
Jul 22, 2025 · Operations

Build a Production-Ready Prometheus HA Architecture with Federation & Remote Storage

This guide walks through designing and implementing a robust, enterprise‑grade Prometheus high‑availability solution using federation clusters, remote storage back‑ends, Docker‑Compose deployments, health‑check scripts, and best‑practice recommendations for monitoring, security, and performance.

Docker Compose · Federation · High Availability
0 likes · 17 min read
Programmer XiaoFu
Jul 22, 2025 · Backend Development

Mastering Cache Penetration, Avalanche, and Breakdown: Interview-Ready Answers

The article explains the concepts of cache penetration, avalanche, and breakdown, outlines their typical causes such as invalid requests, synchronized expirations, and hotspot spikes, and presents practical mitigation techniques including request validation, caching null values, Bloom filters, staggered expirations, high‑availability Redis setups, mutex locks, and random TTLs.

Cache · Cache Avalanche · Cache Breakdown
0 likes · 8 min read
Architect's Guide
Jul 21, 2025 · Operations

How to Achieve Five Nines: Practical High‑Availability Strategies for Modern Web Systems

This article explains key high‑availability concepts such as availability metrics, microservice modularization, load balancing, rate limiting, circuit breaking, isolation, retry strategies, rollback plans, stress testing, monitoring, and on‑call processes, providing concrete design guidelines for building resilient internet services.

High Availability · Microservices · Monitoring
0 likes · 12 min read
Su San Talks Tech
Jul 19, 2025 · Operations

Mastering Load Balancing: Architecture, Algorithms, and Real-World Pitfalls

This article explores the four‑layer load‑balancing architecture, five common algorithms (including Round Robin, Weighted RR, Least Connections, Consistent Hashing, and AI‑driven adaptive load), high‑availability design, deep pitfalls, and a self‑built load balancer implementation, providing practical code examples and best‑practice guidelines.

High Availability · Operations · backend-architecture
0 likes · 10 min read
macrozheng
Jul 12, 2025 · Databases

NewSQL vs Middleware Sharding: Which Architecture Truly Wins?

This article objectively compares middleware‑based sharding with NewSQL distributed databases, examining their architectures, transaction support, CAP implications, high‑availability, scaling, storage engines, and ecosystem maturity to help readers decide which solution best fits their workload.

CAP theorem · High Availability · NewSQL
0 likes · 19 min read
Raymond Ops
Jul 11, 2025 · Operations

Mastering Keepalived: Complete Guide to High‑Availability Load Balancing

This tutorial explains Keepalived’s VRRP‑based failover, IPVS rule generation, health‑checking, script integration, installation methods, detailed configuration files, notification handling, logging, split‑brain prevention, and VRRP scripting for building robust high‑availability clusters on Linux.

High Availability · IPVS · Keepalived
0 likes · 26 min read

Demystifying Consistency Models: From Linear to Eventual in Distributed Systems

This article explores the concept of consistency in distributed systems, breaking down the main consistency models (linearizable, sequential, causal, and eventual), explaining their definitions and practical implications, and showing how they guide the design of high‑availability architectures and data replication strategies.

Data Replication · High Availability · consistency
0 likes · 13 min read
MaGe Linux Operations
Jul 6, 2025 · Operations

Master Kafka Production: High‑Availability Cluster Deployment & Ops Best Practices

This comprehensive guide walks operations engineers through designing, deploying, and managing a high‑availability Kafka production cluster, covering automated ZooKeeper and Kafka installation scripts, performance tuning for producers and consumers, monitoring with Prometheus and Grafana, and automated health checks and recovery procedures.

High Availability · Production Deployment
0 likes · 28 min read
Lin is Dream
Jul 4, 2025 · Databases

Master Redis High Availability: Complete Guide to Sentinel and Cluster Deployment

This article explains why single‑node Redis can become a single point of failure, compares Redis Sentinel and Redis Cluster deployment options, provides step‑by‑step Docker deployment scripts, details Sentinel’s inner workings, demonstrates failover verification, and shares best‑practice recommendations for production environments.

High Availability · Redis · Sentinel
0 likes · 31 min read
Efficient Ops
Jun 21, 2025 · Operations

What a Lychee Delivery Tale Teaches About DevOps and Operations

Through a vivid analogy of transporting lychees to ancient Chang’an, the article illustrates how operations teams must negotiate SLAs, automate monitoring, design high‑availability pipelines, document responsibilities, and avoid the endless cycle of blame, offering practical DevOps strategies for managing zero‑budget, zero‑resource projects.

High Availability · Operations Management · SLA
0 likes · 5 min read
Mike Chen's Internet Architecture
Jun 10, 2025 · Operations

Mastering Load Balancing: From Single‑Layer to Billion‑Scale Architectures

This article explains the essential role of load balancing in modern distributed systems and walks through single‑layer, double‑layer, and billion‑scale architectures, highlighting their design principles, benefits, trade‑offs, and typical deployment scenarios for high‑availability and high‑performance applications.

High Availability · LVS · NGINX
0 likes · 6 min read
Mike Chen's Internet Architecture
Jun 9, 2025 · Operations

How Nginx Master‑Slave Architecture Ensures High Availability

This article explains how Nginx's master‑slave (primary‑backup) setup, combined with Keepalived and a virtual IP, provides high‑availability for web and API services by automatically detecting failures, shifting the VIP, and allowing the backup server to take over without service interruption.

High Availability · Keepalived · Master‑Slave
0 likes · 4 min read
Liangxu Linux
Jun 5, 2025 · Databases

Choosing the Right MySQL HA Solution: MHA, Percona XtraDB Cluster, and Galera

An in‑depth comparison of three popular MySQL high‑availability architectures—MHA, Percona XtraDB Cluster (PXC), and Galera Cluster—covers their principles, architectures, strengths, limitations, deployment scenarios, and best‑practice recommendations to help you select the optimal solution for your production environment.

Galera · High Availability · MHA
0 likes · 10 min read
Raymond Ops
Jun 4, 2025 · Operations

Mastering SFTP: Complete Planning, Configuration, and High‑Availability Guide

This guide walks you through SFTP server planning, user naming conventions, directory structures, SSH configuration, account creation, permission setup, client usage, log auditing, rotation, connection limits, monitoring, and high‑availability deployment across multiple servers, providing ready‑to‑run commands and scripts.

ACL · High Availability · Linux
0 likes · 14 min read
Instant Consumer Technology Team
Jun 4, 2025 · Databases

Achieving High Availability for MySQL & Redis on MaShang Cloud with Distributed Sentinel

This article explains MaShang Cloud's RDS high‑availability design, detailing the distributed sentinel monitoring system, proxy layer, multi‑AZ disaster‑recovery strategies, and real‑world case studies that demonstrate how MySQL and Redis services maintain continuous, consistent access with minimal RTO and RPO.

Database Proxy · Distributed Sentinel · High Availability
0 likes · 16 min read
MaGe Linux Operations
Jun 2, 2025 · Operations

How to Deploy a High‑Availability MinIO Distributed Cluster on Rocky 9

This guide walks you through deploying a highly available MinIO distributed object storage cluster on Rocky 9, covering prerequisites, environment preparation, user and directory setup, configuration files, systemd service creation, testing, Nginx load balancing, and verification of cluster health.

Distributed storage · High Availability · minio
0 likes · 20 min read
Amap Tech
May 27, 2025 · Databases

OceanBase Unitization: Building the Next Generation of Online Map Applications

This paper presents the design, implementation, and experimental evaluation of OceanBase's unitization architecture for large‑scale online map services, demonstrating superior disaster‑recovery, high‑throughput OLTP/OLAP performance, and storage efficiency compared with competing distributed databases.

High Availability · OceanBase · Online Maps
0 likes · 24 min read
Xiaokun's Architecture Exploration Notes
May 25, 2025 · Fundamentals

How Consensus, CAP, and BASE Shape High‑Availability Architecture

This article explains the role of consensus algorithms in achieving high‑availability through redundancy and automatic failover, clarifies distributed consistency, explores the CAP theorem and its consistency (C) component, and introduces the BASE theory as a practical complement for eventual consistency in modern distributed systems.

BASE theory · CAP theorem · Consensus
0 likes · 10 min read
IT Xianyu
May 20, 2025 · Operations

Building a Three‑Server High‑Availability MySQL Cluster with HAProxy on AlmaLinux

This guide explains why three servers are needed for high availability, walks through hardware and software preparation, network configuration, MySQL master‑slave replication setup, HAProxy load‑balancing, and firewall/SELinux adjustments, providing complete command‑line examples for each step.

AlmaLinux · HAProxy · High Availability
0 likes · 8 min read
Tencent Technical Engineering
May 19, 2025 · Cloud Native

How Tencent’s TGW Delivers 3× Faster Throughput and Near‑Zero Downtime at Scale

The USENIX‑selected paper on Tencent’s TGW cloud gateway reveals how a modular, multi‑layer architecture achieves up to 2.9‑fold throughput gains, elastic scaling within seconds, lossless hot migration, and sub‑second fault recovery, offering a blueprint for resilient large‑scale cloud networking.

Cloud Gateway · High Availability · Network Architecture
0 likes · 16 min read
Architect
Apr 30, 2025 · Databases

Redis Core Architecture, Data Types, Persistence, High Availability, and Performance Optimization

This comprehensive guide explains Redis's core architecture, the underlying implementation of its various data types, persistence mechanisms (RDB and AOF), high‑availability solutions such as replication, Sentinel and Cluster, as well as performance‑monitoring techniques and common optimization strategies.

Data Structures · High Availability · Persistence
0 likes · 48 min read
Java Captain
Apr 17, 2025 · Databases

Choosing Between Sharding Middleware and NewSQL Distributed Databases: An Objective Comparison

This article objectively compares middleware‑based sharding with NewSQL distributed databases, examining their architectural differences, transaction models, high‑availability mechanisms, scaling, SQL support, storage engines, and maturity to help practitioners decide which approach best fits their workload and operational constraints.

Database Architecture · High Availability · NewSQL
0 likes · 17 min read
Cognitive Technology Team
Apr 13, 2025 · Backend Development

Understanding RocketMQ Master‑Slave Architecture and High‑Availability Mechanisms

This article explains how RocketMQ achieves high availability and data reliability through its master‑slave broker design, covering synchronous and asynchronous replication, flush strategies, transaction messaging, automatic failover with DLedger, and read‑write separation for load balancing in distributed systems.

Data Replication · High Availability · Master‑Slave
0 likes · 7 min read
FunTester
Apr 12, 2025 · Operations

How to Design Effective Fault‑Testing Cases for Resilient Distributed Systems

This article explains why fault testing is essential for modern distributed and cloud environments, outlines core goals, design principles, common fault categories, practical implementation strategies such as chaos engineering and gray releases, and shows how to analyze results to continuously improve system reliability.

High Availability · Monitoring · chaos engineering
0 likes · 18 min read
Liangxu Linux
Apr 8, 2025 · Databases

How to Build a High‑Availability Redis Cluster Without Centralized Configuration

This guide explains why Redis clustering is needed for capacity, concurrency and failover, describes Redis 3.0's decentralized cluster architecture, provides step‑by‑step commands to configure, launch and combine six nodes into a cluster, demonstrates slot calculations, client usage with Jedis, and outlines fault recovery, pros and cons, and cleanup procedures.

High Availability · Jedis · Redis
0 likes · 24 min read
Java Backend Full-Stack
Apr 8, 2025 · Backend Development

Interview Question: Designing a Service Registry

The article walks through the need for a service registry in a micro‑service scenario, explains how services register and discover each other, discusses high‑availability deployment, and compares push, pull, and long‑polling mechanisms for dynamic detection of service instances.

High Availability · Microservices · Service Registry
0 likes · 10 min read
Ma Wei Says
Apr 8, 2025 · Operations

Mastering High Availability: 4 Failover Patterns Explained

Understanding high‑availability architectures involves mastering replication and fail‑over, balancing RTO and RPO, and choosing among four patterns—Active‑Standby, Active‑Active, Cold Standby, and Hot Standby—each with distinct synchronization, load‑balancing, and cost considerations for reliable system design.

Active-Active · High Availability · active standby
0 likes · 9 min read
Alibaba Cloud Native
Apr 6, 2025 · Cloud Native

How ZEEK’s Cloud‑Native Architecture Boosted App Stability and Agility

This article details ZEEK's cloud‑native transformation, covering the strategic shift to open‑source standards, unified microservice architecture, high‑availability practices, upgraded traffic gateways, visual data analysis, car‑network data collection, and AI‑assisted development, illustrating how these steps enhanced system stability, scalability, and development efficiency.

AI · Automotive · Cloud Native
0 likes · 22 min read
The Dominant Programmer
Mar 22, 2025 · Databases

Master Redis Interview Questions: From Basics to Advanced, Ace Your Interview

This article compiles the most frequently asked Redis interview questions, covering fundamentals, data structures, persistence mechanisms, high‑availability features, clustering, performance tuning, and troubleshooting, providing clear explanations and practical guidance to help candidates confidently tackle any Redis interview.

Data Structures · High Availability · Performance Optimization
0 likes · 8 min read
Amap Tech
Mar 21, 2025 · Mobile Development

Gaode Map Terminal Architecture: Achieving Ultra‑Stable, High‑Performance, and Efficient Mobile Mapping

Gaode Map’s new integrated container architecture, combined with on‑demand loading, package slimming, and multi‑system/device/language support, delivers ultra‑stable, high‑availability navigation with startup within seconds and halved binary size and traffic, enabling efficient, cross‑platform mobile mapping for diverse hardware.

Container Architecture · High Availability · Mobile Development
0 likes · 12 min read
Java Architect Essentials
Mar 14, 2025 · Databases

Comparing NewSQL Databases with Middleware‑Based Sharding: Advantages, Trade‑offs, and Selection Guidance

This article objectively compares NewSQL distributed databases with traditional middleware‑based sharding solutions, examining their architectures, distributed transaction handling, high‑availability, scaling, storage engines, and ecosystem maturity, and provides guidance on selecting the appropriate approach based on consistency, growth, operational capacity, and performance requirements.

Database Architecture · High Availability · NewSQL
0 likes · 19 min read
FunTester
Mar 14, 2025 · Operations

Fault Testing: Enhancing System Resilience through Controlled Failure Simulations

The article explains how fault testing—by deliberately injecting failures in a controlled environment—helps identify system weaknesses, validates post‑mortem improvements, and drives architectural optimization, thereby increasing high‑availability and resilience of modern internet services.

High Availability · Operations · chaos engineering
0 likes · 8 min read
Top Architect
Mar 13, 2025 · Databases

Choosing Between NewSQL Databases and Middleware‑Based Sharding: Advantages, Trade‑offs and Practical Guidance

The article objectively compares NewSQL distributed databases with middleware‑plus‑sharding solutions, covering architectural differences, distributed transaction handling, high‑availability, scaling, SQL support, storage engines, maturity, and provides a decision‑making checklist to help engineers select the most suitable approach for their workloads.

High Availability · NewSQL · Transaction Management
0 likes · 23 min read
MaGe Linux Operations
Mar 13, 2025 · Operations

How to Build a Secure High‑Availability Etcd Cluster on Linux

This guide walks through installing etcd, generating TLS certificates with cfssl, configuring static, dynamic, or DNS‑based discovery, setting up systemd service files for three nodes, and verifying cluster health using etcdctl, providing a complete step‑by‑step deployment for a production‑grade, cloud‑native key‑value store.

Etcd · High Availability · systemd
0 likes · 19 min read
Java Web Project
Mar 6, 2025 · Databases

NewSQL vs Middleware Sharding: Which Architecture Truly Wins?

This article objectively compares NewSQL databases with middleware‑based sharding, dissecting their core architectures, distributed transaction handling, high‑availability designs, scaling mechanisms, SQL support, storage engines, and maturity to help engineers decide the most suitable solution for their workloads.

CAP theorem · Database Architecture · High Availability
0 likes · 20 min read
Code Ape Tech Column
Mar 5, 2025 · Backend Development

Design and Evolution of an Enterprise Unified Push Service

The article describes the evolution from modular push modules to a framework‑based and finally a service‑oriented unified push platform, detailing its architecture, functional and non‑functional requirements, component responsibilities, and deployment considerations for high‑performance, scalable enterprise notification systems.

High Availability · Microservices · push notifications
0 likes · 14 min read
Architecture Digest
Mar 3, 2025 · Databases

NewSQL vs Middleware Sharding: A Comparative Analysis of Distributed Databases

This article objectively compares NewSQL distributed databases with traditional middleware‑based sharding solutions, examining their architectures, distributed transaction support, high availability, scaling, SQL capabilities, and maturity to help readers decide which approach best fits their workload and operational constraints.

High Availability · NewSQL · Transaction
0 likes · 18 min read
Cognitive Technology Team
Feb 28, 2025 · Artificial Intelligence

Design and High‑Availability Architecture of Alibaba LangEngine AI Application Framework

This article introduces Alibaba's LangEngine, a pure Java AI application framework, detailing its high‑availability gateway architecture, communication protocols, streaming and non‑streaming output, multi‑level metadata caching, asynchronous and serverless designs, and future open‑source roadmap, offering practical guidance for building robust AI services.

AI Framework · High Availability · LLM
0 likes · 11 min read
Sanyou's Java Diary
Feb 20, 2025 · Databases

How Redis Sentinel Ensures Automatic Failover and High Availability

Redis Sentinel provides a robust high‑availability solution by monitoring master‑slave clusters, automatically detecting failures, electing leaders, and performing failover, while using quorum voting, Pub/Sub communication, and configuration provisioning to ensure seamless master promotion and client redirection without manual intervention.

High Availability · Redis · Sentinel
0 likes · 16 min read
MaGe Linux Operations
Jan 27, 2025 · Operations

Redis Sentinel Deep Dive: High‑Availability Architecture & Automatic Failover

This article explains Redis Sentinel’s role as the official high‑availability solution, detailing its monitoring, notification, automatic failover mechanisms, discovery processes, connection types, down‑state classifications, failover steps, leader election, master selection rules, and data consistency guarantees.

High Availability · Monitoring · Operations
0 likes · 18 min read
Architect
Jan 26, 2025 · Databases

Optimizing Redis Cluster Slot Migration to Reduce Latency and Improve High Availability

This article analyzes the latency and availability problems of native Redis cluster slot migration, proposes a redesign based on master‑slave synchronization that batches slot transfers, reduces ASK/MOVED redirection and topology‑change overhead, and validates the solution with performance tests showing smoother latency and higher reliability.

High Availability · Redis · Slot Migration
0 likes · 16 min read
Architect
Jan 23, 2025 · Operations

Designing High‑Availability Systems: Architecture, Capacity Planning, and Fault‑Tolerance Guide

This article presents a comprehensive guide to building high‑availability systems, covering availability metrics, fault prevention, detection and recovery, capacity evaluation, layered architecture design, service tiering, resilience mechanisms, and operational best practices for reliable service delivery.

High Availability · Operations · capacity planning
0 likes · 34 min read
dbaplus Community
Jan 14, 2025 · Backend Development

Mastering High‑Performance, High‑Concurrency, High‑Availability Backend Systems

This article shares a backend engineer's practical methodology for building systems that simultaneously achieve high performance, high concurrency, and high availability, covering performance optimization, scaling strategies, fault‑tolerance techniques, and real‑world case studies from business‑facing (B‑side) and consumer‑facing (C‑side) logistics platforms.

Caching · DDD · High Availability
0 likes · 27 min read
Raymond Ops
Jan 11, 2025 · Operations

How to Build a Highly Available Load Balancer with LVS and Keepalived

This tutorial explains how to design and deploy a high‑availability web cluster using Linux Virtual Server (LVS) and Keepalived, covering terminology, test environment setup, detailed configuration steps, HA testing procedures, and a concise summary of the solution.

High Availability · Keepalived · LVS
0 likes · 11 min read
IT Architects Alliance
Jan 9, 2025 · Operations

Load Balancing Strategies for High Availability in Distributed Systems

This article explores the challenges and opportunities of distributed architectures and explains how various static and dynamic load‑balancing strategies, hardware and software balancers, redundancy, health checks, and failover mechanisms together ensure high availability, illustrated with real‑world e‑commerce and live‑streaming case studies and future trends.

High Availability · Operations · load balancing
0 likes · 20 min read
JD Tech
Jan 9, 2025 · Databases

Challenges and Practices of Distributed Data Systems: Master‑Slave Replication, Partitioning, and High‑Availability Strategies

This article examines the core challenges of distributed data systems—including consistency, availability, and partition tolerance—then details master‑slave replication mechanisms for MySQL and Redis, various replication modes and binlog formats, and explores data partitioning, sharding, and hot‑key mitigation techniques for scalable, high‑availability deployments.

Databases · High Availability · Sharding
0 likes · 23 min read
IT Architects Alliance
Jan 7, 2025 · Industry Insights

Why Multi-Active Architecture Matters and How to Build It

The article explains why multi‑active (active‑active) architecture is essential for modern enterprises, outlines its evolution from single‑server setups, details core principles like redundancy and data synchronization, compares common deployment patterns, examines industry use cases, and discusses challenges and mitigation strategies.

Cloud Computing · Data Consistency · Disaster Recovery
0 likes · 21 min read
Tencent Cloud Developer
Jan 7, 2025 · Operations

Designing High‑Availability Systems: Principles, Architecture, and Operations

This comprehensive guide explains how to design, build, and operate high‑availability systems by covering availability metrics, fault‑tolerance strategies, capacity planning, code and data layer architecture, automated testing, monitoring, and clear role responsibilities to ensure services stay reliable and resilient under load.

Cloud Native · High Availability · SRE
0 likes · 32 min read
dbaplus Community
Jan 1, 2025 · Backend Development

Mastering Multi-Active Data Architecture: Reducing Write Latency and Ensuring High Availability

This article examines the challenges of building multi‑active distributed systems, focusing on the data layer’s role in high availability, write‑latency, sharding, isolation, replication strategies, and routing decisions, and provides concrete architectural patterns and practical guidelines for robust backend design.

Data Replication · High Availability · Latency
0 likes · 23 min read
IT Architects Alliance
Dec 29, 2024 · Operations

Design Principles and Key Technologies for High‑Availability Systems

The article explains why 24/7 high‑availability systems are essential for modern enterprises and details core design principles, layered architecture, and critical technologies such as redundancy, load balancing, caching, elastic scaling, monitoring, and fault‑tolerance to ensure continuous, reliable service.

Cloud Computing · High Availability · Monitoring
0 likes · 23 min read