Tagged articles

1414 articles


Apr 29, 2022 · Operations

Multi-Active Deployment Modes, Routing Rules, and Unitization in Data Center Architecture

The article explains multi‑active deployment modes—same‑city, cross‑city, and cross‑city multi‑active—along with routing rules (random, user‑ID, region) and unitization, describing how sharding IDs map traffic to central or unit data centers, and detailing middleware solutions for storage, messaging, SOA, and Snowflake ID generation to achieve scalable, highly available architecture.

Active-Active · data center · high availability

0 likes · 10 min read


Big Data Technology & Architecture

Apr 26, 2022 · Big Data

ByteDance's Internal Presto OLAP Engine: Deployment, Performance Boosts, and Operational Practices

The article details ByteDance's large‑scale deployment of the Presto OLAP engine for ad‑hoc, BI, and near‑real‑time analytics, describing its architecture, multi‑coordinator high‑availability design, routing gateway, adaptive cancel, history server, materialized‑view support, Hudi connector integration, and how these innovations improve performance, stability, and operational efficiency.

Big Data · Hudi Connector · Materialized Views

0 likes · 11 min read


Architect

Apr 24, 2022 · Backend Development

Comprehensive Nginx Tutorial: Installation, Configuration, Reverse Proxy, Load Balancing, and High Availability

This guide provides a step‑by‑step tutorial on Nginx, covering its overview, single‑instance installation, reverse proxy setup, load‑balancing configuration, static‑dynamic separation, high‑availability clustering with keepalived, detailed configuration directives, and performance considerations, complete with command‑line examples and code snippets.

Linux · Nginx · high availability

0 likes · 30 min read


TAL Education Technology

Apr 21, 2022 · Databases

Applying Orchestrator for High‑Availability MySQL in TAL Education Group’s Database System

This article describes how TAL Education Group evaluated, selected, and customized the open‑source Orchestrator tool to build a highly available, secure, and extensible MySQL HA solution that meets 99.99% uptime, data‑integrity, cross‑datacenter, and operational automation requirements.

Database Architecture · Operations · Orchestrator

0 likes · 9 min read


Aikesheng Open Source Community

Apr 20, 2022 · Databases

Building and Using MySQL InnoDB Cluster Set (MICS) for Disaster Recovery

This article explains the components of MySQL InnoDB Cluster, introduces the InnoDB Cluster Set (MICS) for disaster‑recovery, outlines its limitations, and provides a step‑by‑step demonstration with code on how to create, monitor, and fail over a MICS deployment.

ClusterSet · InnoDB Cluster · Replication

0 likes · 10 min read


macrozheng

Apr 14, 2022 · Operations

Mastering High Availability: 4 Essential Design Techniques for Scalable Systems

This article outlines the core high‑availability techniques—system splitting, decoupling, asynchronous processing, retry, compensation, backup, multi‑active strategies, isolation, rate limiting, circuit breaking, and degradation—providing practical guidance for designing resilient, scalable backend architectures in large‑scale internet applications.

Distributed Systems · Microservices · System Design

0 likes · 13 min read


Liangxu Linux

Apr 12, 2022 · Operations

Mastering Nginx: Installation, Core Configuration, and High‑Availability Setup

This comprehensive guide explains Nginx's role as a high‑performance web and reverse‑proxy server, details its installation packages and step‑by‑step build process, breaks down the main nginx.conf sections, and demonstrates practical configurations for reverse proxy, load balancing, static‑dynamic separation, worker tuning, and high‑availability clustering.

Linux · Nginx · Web server

0 likes · 15 min read


DataFunTalk

Apr 12, 2022 · Big Data

Kuaishou Big Data Task Scheduling System: Architecture, Challenges, and Key Technologies

This article presents Kuaishou's large‑scale big‑data task scheduling system, describing its evolution from Airflow to the self‑developed Kwaiflow, the performance and reliability challenges of handling hundreds of thousands of tasks, and the design decisions that achieve low latency, high availability, and strong open capabilities.

Distributed Systems · Kuaishou · Kwaiflow

0 likes · 22 min read


HomeTech

Apr 6, 2022 · Databases

MySQL High Availability Architecture and Practices at AutoHome

This article explains MySQL high‑availability concepts, defines HA, RPO and RTO, outlines common HA architectures such as master‑slave+VIP, MHA and MGR+Proxy, and details AutoHome's evolution from simple master‑slave setups to a container‑based MGR solution with automated failover and monitoring platforms.

Kubernetes · MGR · MHA

0 likes · 11 min read


21CTO

Apr 3, 2022 · Backend Development

How We Achieved 20k+ TPS High Availability for a Billion‑User Membership System

This article details the design and implementation of a highly available, high‑performance membership system serving over a billion users, covering Elasticsearch dual‑center clusters, traffic isolation, Redis caching, MySQL migration, and fine‑grained flow‑control and degradation strategies.

Elasticsearch · Scalability · System Architecture

0 likes · 21 min read


Architects' Tech Alliance

Apr 2, 2022 · Industry Insights

How Financial Institutions Secure Database Continuity: Disaster Recovery Strategies & Market Trends

This article examines the critical role of databases in finance, defines disaster recovery and backup concepts, outlines industry requirements and regulations, analyzes market growth, and compares distributed database disaster‑recovery architectures such as single‑center, city‑level mutual backup, active‑active, and two‑site three‑center solutions.

Backup · Distributed Systems · Financial Services

0 likes · 15 min read


dbaplus Community

Apr 1, 2022 · Databases

How iQIYI Built a Scalable OLTP Data Center to Eliminate Data Silos

This article details iQIYI's design and implementation of a unified OLTP data center that consolidates data across business lines, solves data‑island issues, ensures strong consistency between MongoDB and Elasticsearch, and provides high‑availability, massive‑scale storage for billions of records.

Data Architecture · Elasticsearch · MongoDB

0 likes · 12 min read


21CTO

Mar 31, 2022 · Operations

What Caused the Biggest 2021 Outages? Lessons from Bilibili, Facebook, AWS, and More

The article reviews ten major 2021 service outages, from Chinese platforms like Bilibili and Futu to global giants such as Facebook, Roblox, and AWS, analyzing their root causes, redundancy failures, and the operational lessons needed to prevent future black‑swan events.

high availability · incident response · outage analysis

0 likes · 15 min read


Top Architect

Mar 31, 2022 · Operations

Comprehensive Nginx Tutorial: Reverse Proxy, Load Balancing, Static/Dynamic Separation, and High Availability with Keepalived

This article provides a detailed guide on using Nginx for high‑performance HTTP serving, reverse proxying, load balancing, static‑dynamic separation, installation commands, configuration file structure, practical examples with Tomcat back‑ends, and setting up high‑availability using Keepalived, complete with code snippets and diagrams.

Server Configuration · high availability · keepalived

0 likes · 10 min read


Java Interview Crash Guide

Mar 31, 2022 · Backend Development

How We Achieved 20k TPS High‑Availability for a Billion‑User Membership System

This article details the design and implementation of a highly available, high‑performance membership system serving billions of users, covering Elasticsearch dual‑center clusters, traffic‑isolated architectures, deep ES optimizations, Redis caching with distributed locks, dual‑center MySQL partitioning, migration strategies, abnormal account handling, and future fine‑grained flow‑control and degradation policies.

Distributed Systems · Elasticsearch · Scalability

0 likes · 20 min read

Senior Brother's Insights

Mar 29, 2022 · Backend Development

Dual‑Center Elasticsearch & Multi‑Cluster Redis Power 20k+ TPS for Billion‑User Membership

This article explains how a large‑scale membership system serving over a billion users achieved high performance and availability by deploying dual‑center Elasticsearch clusters, traffic‑isolated ES clusters, deep ES optimizations, a Redis caching layer with dual‑center replication, and a seamless migration from SQL Server to sharded MySQL, while also outlining future fine‑grained flow‑control and degradation strategies.

Backend Architecture · high availability · redis

0 likes · 20 min read


Java Interview Crash Guide

Mar 29, 2022 · Cloud Native

How to Build a Scalable Service Registry for Microservices

This article explains how to design a service registry that enables service registration, discovery, high availability, and dynamic handling of service instances in a microservice architecture, covering registration methods, consumer/provider interaction, push/pull mechanisms, long‑polling, and heartbeat health checks.

Microservices · high availability · long polling

0 likes · 8 min read


Alibaba Cloud Developer

Mar 28, 2022 · Operations

How to Implement Robust Rate Limiting with Alibaba Cloud AHAS

This guide explains how to use Alibaba Cloud's Application High Availability Service (AHAS) to monitor QPS, define granular rate‑limiting rules, prevent abuse, isolate upstream failures, and protect both HTTP and non‑HTTP workloads in microservice architectures.

AHAS · Alibaba Cloud · Microservices

0 likes · 10 min read


Beike Product & Technology

Mar 25, 2022 · Backend Development

Audio Stream Gateway Architecture: Design, Evolution, and High Availability

This article introduces the architecture, evolution, and high‑availability design of the audio stream gateway, covering audio basics, system design, and practical applications in real‑time audio processing.

Audio Streaming · Backend Architecture · audio stream gateway

0 likes · 12 min read


Programmer DD

Mar 24, 2022 · Operations

Master Nginx, Keepalived, and LVS: Build a High‑Availability Load‑Balancing Cluster

This guide walks you through installing Nginx from source, configuring reverse‑proxy and various load‑balancing methods, setting up Keepalived for high‑availability, and integrating LVS (DR, NAT, TUN modes) to create a robust, fault‑tolerant web service architecture on Linux.

LVS · Nginx · high availability

0 likes · 24 min read


IT Architects Alliance

Mar 23, 2022 · Operations

Designing High‑Performance, Highly‑Available, Scalable E‑Commerce Architecture

This article provides a comprehensive technical guide on building large‑scale distributed websites, covering characteristics, architectural goals, patterns, performance, high‑availability, scalability, extensibility, security, agility, and a detailed e‑commerce case study with practical diagrams and capacity estimations.

Distributed Systems · Scalability · caching

0 likes · 26 min read


Top Architect

Mar 20, 2022 · Backend Development

High‑Availability Architecture for a Membership System: Elasticsearch Dual‑Center Cluster, Redis Caching, MySQL Migration, and Flow‑Control Strategies

The article details a comprehensive high‑availability solution for a large‑scale membership system, covering Elasticsearch dual‑center master‑slave clusters, traffic‑isolated three‑cluster designs, deep ES optimizations, Redis caching with consistency safeguards, MySQL partitioned migration, and fine‑grained flow‑control and degradation mechanisms.

Elasticsearch · Flow Control · high availability

0 likes · 19 min read


Tencent Cloud Developer

Mar 18, 2022 · Industry Insights

How Tencent Cloud WeDa Low‑Code Platform Enables Secure, Scalable Enterprise Apps

This article provides an in‑depth analysis of Tencent Cloud WeDa low‑code platform, covering its definition, evolution, market adoption, core capabilities, architecture, backend practices, development workflow, high‑availability design, and future trends, while explaining why low‑code boosts efficiency and digital transformation.

Digital Transformation · Industry Analysis · cloud computing

0 likes · 19 min read


Architecture Digest

Mar 18, 2022 · Backend Development

High‑Availability Architecture for a Membership System: Elasticsearch Dual‑Center Cluster, Redis Caching, and MySQL Migration

This article details the design and implementation of a high‑performance, highly available membership system, covering Elasticsearch dual‑center master‑slave clusters, traffic‑isolated three‑cluster ES architecture, Redis cache strategies, MySQL dual‑center partitioning, seamless migration, abnormal member handling, and fine‑grained flow‑control and degradation policies.

Elasticsearch · Flow Control · System Architecture

0 likes · 20 min read


Tencent Architect

Mar 16, 2022 · Cloud Computing

Tencent RegionEIP: High‑Performance Networking with X86 & P4

This article explains how Tencent's RegionEIP combines X86‑based load distributors and P4 programmable switches to deliver high‑performance, highly available public network access for cloud services, detailing zone disaster recovery, traffic‑shaping algorithms, four‑level routing priority and port‑redundancy designs.

P4 · Tencent Cloud · cloud networking

0 likes · 14 min read


IT Architects Alliance

Mar 15, 2022 · Big Data

Understanding Kafka Replication: Mechanism, Roles, ISR, and Unclean Leader Election

This article explains Apache Kafka's replication mechanism, detailing its benefits, replica definitions, leader‑follower roles, in‑sync replica (ISR) criteria, and the trade‑offs of unclean leader election, highlighting how these features affect data redundancy, scalability, and consistency in distributed systems.

ISR · Kafka · Replication

0 likes · 11 min read


Cloud Native Technology Community

Mar 15, 2022 · Databases

How to Build a High‑Availability MySQL PXC Cluster: Installation & Features

This guide explains the Percona XtraDB Cluster (PXC) architecture, its advantages and limitations, and provides step‑by‑step commands for removing MariaDB, opening firewall ports, disabling SELinux, downloading packages, configuring MySQL, bootstrapping the first node, adding additional nodes, and verifying the cluster status.

Cluster · Installation · PXC

0 likes · 8 min read


Alipay Experience Technology

Mar 14, 2022 · Backend Development

Transforming BFF Development: Ant Group’s Needle Node FaaS Platform Explained

This article details Ant Group’s Needle platform, a progressive Node‑based FaaS solution that tackles BFF development challenges through function‑level isolation, dynamic deployment, high‑availability design, and a roadmap for future performance and cost optimizations.

FaaS · Node.js · backend-development

0 likes · 11 min read


21CTO

Mar 13, 2022 · Backend Development

How Meituan Built a Fault‑Tolerant Instant Logistics Platform at Scale

Meituan’s instant logistics platform evolved from vertical services to a micro‑service, distributed architecture that handles massive order‑rider matching, ultra‑low latency, and high availability, leveraging AI for pricing, ETA, scheduling, and employing robust scaling, consistency, and disaster‑recovery techniques.

AI · Distributed Systems · Logistics

0 likes · 10 min read


AntTech

Mar 12, 2022 · Operations

Evolution of Large‑Scale Distributed System Stability at Ant Group

The article outlines Ant Group's multi‑stage journey of building large‑scale distributed system stability, describing architectural evolutions, risk‑inspection mechanisms, high‑availability solutions such as LDC and fine‑grained traffic scheduling, and intelligent risk‑defense products that together enable resilient, cost‑effective operations.

Cloud Native · Distributed Systems · Operations

0 likes · 15 min read


IT Services Circle

Mar 12, 2022 · Databases

Understanding MySQL High Availability, Master‑Slave Replication Delay, and Switch Strategies

This article explains MySQL high availability concepts, how master‑slave replication works, how to measure and mitigate replication lag, and compares reliable‑first and availability‑first failover strategies with practical experiments on binlog formats.

Binlog · Master‑Slave Delay · Replication

0 likes · 8 min read


Efficient Ops

Mar 6, 2022 · Operations

Mastering Redis Sentinel: Build High‑Availability Clusters Step‑by‑Step

This article explains Redis Sentinel’s role in achieving high availability, details its core functions, underlying Raft‑based algorithm, configuration parameters, practical setup steps, fault‑tolerance mechanisms, quorum and majority calculations, and demonstrates failover and recovery scenarios with real command‑line examples.

failover · high availability · redis

0 likes · 20 min read


Kuaishou Big Data

Mar 3, 2022 · Big Data

How Kwai’s OneService Platform Revolutionizes Data Service Development

The article details Kwai’s OneService platform—a low‑code, self‑service data platform that streamlines API creation, deployment, and operation, covering its background, architecture, key technologies such as API matrix and configuration‑as‑code, high‑availability and performance strategies, achieved results, and future roadmap.

API Service · Data Platform · high availability

0 likes · 20 min read


dbaplus Community

Mar 1, 2022 · Databases

MHA Re-Edition: Modern MySQL HA with GTID Failover and Auto Switch

The MHA Re-Edition tool revives the discontinued MHA manager for MySQL, adding GTID‑based failover, password‑only SSH authentication, lightweight binaries, VIP migration, WeChat alerts, remote‑card reboot, and detailed configuration options, with step‑by‑step deployment instructions and sample app1.cnf parameters for high‑availability clusters.

GTID · MHA · database

0 likes · 11 min read


Efficient Ops

Feb 23, 2022 · Operations

Why a Single Kafka Broker Crash Can Halt All Consumers – The HA Explained

An in‑depth look at Kafka’s high‑availability architecture reveals how multi‑replica redundancy, ISR mechanisms, and the configuration of the __consumer_offsets topic interact, explaining why a single broker failure can render the entire cluster unusable and how to properly configure replication and ack settings to prevent it.

ACK · Consumer Offset · ISR

0 likes · 10 min read


Architect's Journey

Feb 23, 2022 · Backend Development

What Kind of Company Is This? Inside Our B2B Vertical E‑Commerce Business & Tech Stack

The article explains the company's B2B vertical e‑commerce model, outlines its core service modules, compares B2B and C2C commerce, and details the Java‑based microservice architecture, high‑availability design, and ongoing recruitment efforts.

B2B e-commerce · Microservices · Spring Cloud Alibaba

0 likes · 13 min read


IT Architects Alliance

Feb 22, 2022 · Backend Development

Designing a Scalable, High‑Performance Distributed E‑Commerce Architecture: A Technical Guide

This article provides a comprehensive technical overview of large‑scale distributed website architecture, covering characteristics, goals, patterns, high‑performance, high‑availability, scalability, extensibility, security, and agility considerations, and walks through the evolution of an e‑commerce system with concrete examples and diagrams.

Distributed Systems · Microservices · Scalability

0 likes · 26 min read


Tencent Cloud Developer

Feb 21, 2022 · Backend Development

Design and Engineering Practices of a Billion‑Scale Node.js Gateway

Wang Weijia’s talk outlines the architecture and engineering of Tencent CloudBase’s billion‑scale Node.js gateway, built with Nest.js. It covers layered controllers and services, async streaming, keep‑alive connections, a two‑level cache with refresh‑ahead, and HA measures such as horizontal scaling, rate limiting, multi‑AZ deployment, and disaster‑recovery caching. The gateway reports a 99.98% cache hit rate and 14 ms median latency, which the talk presents as evidence that Node.js can power latency‑sensitive services, and it encourages front‑end engineers to adopt backend practices.

Cloud Native · Node.js · gateway architecture

0 likes · 33 min read


Ops Development Stories

Feb 17, 2022 · Cloud Native

How to Build a Minimal‑Cost HA Harbor Registry with PostgreSQL Replication on Alibaba Cloud

This guide details a low‑overhead, highly available Harbor deployment on Alibaba Cloud, covering preparation of SLB, ECS, NFS storage, installation of Docker‑Compose, configuration of image mirrors, installation of Harbor 2.3, setup of PostgreSQL 13 master‑slave replication, Redis integration, backup procedures, failover handling, and disaster‑recovery strategies.

Alibaba Cloud · Docker · Harbor

0 likes · 20 min read


MaGe Linux Operations

Feb 11, 2022 · Operations

Master Nginx, Keepalived, and LVS: Build a High‑Availability Load‑Balancing Cluster

This guide walks through installing Nginx on Linux, configuring reverse‑proxy and various load‑balancing methods, setting up SSL, integrating Keepalived for high‑availability, and deploying LVS (DR mode) with ipvsadm to create a robust, fault‑tolerant web‑service cluster.

LVS · Nginx · high availability

0 likes · 25 min read

Alibaba Cloud Native

Feb 10, 2022 · Cloud Native

How Multi-Active Architecture Can Eliminate Downtime: Inside Alibaba Cloud’s AppActive

Despite widespread cloud adoption, large‑scale outages still occur, prompting Alibaba Cloud’s high‑availability team to share the evolution, principles, and open‑source implementation of multi‑active disaster recovery (AppActive) that aims to achieve minute‑level failover and near‑zero downtime.

Alibaba Cloud · AppActive · disaster recovery

0 likes · 11 min read


Laravel Tech Community

Feb 8, 2022 · Operations

Comprehensive Guide to Installing Nginx, Configuring Reverse Proxy, Load Balancing, SSL, Keepalived, and LVS High‑Availability

This article provides a step‑by‑step tutorial on installing Nginx, setting up reverse‑proxy and various load‑balancing methods, configuring SSL, deploying Keepalived for failover, and building an LVS‑DR high‑availability cluster with detailed command examples and configuration snippets.

LVS · Nginx · SSL

0 likes · 20 min read


Code Ape Tech Column

Feb 7, 2022 · Databases

Redis Interview Q&A: Thread Model, Persistence, High Availability, and Cluster Mechanisms

This article presents a simulated interview that explains Redis's evolution from single‑threaded to optional multithreading, its in‑memory performance advantages, persistence options (AOF, RDB, hybrid), high‑availability architectures, and the hash‑slot based key allocation used by Redis Cluster.

AOF · Cluster · Persistence

0 likes · 9 min read


Selected Java Interview Questions

Feb 5, 2022 · Backend Development

Message Queue Fundamentals: Use Cases, Product Comparison, High Availability, and Reliability Strategies

This article explains why message queues are used, outlines common scenarios such as decoupling, asynchronous processing and traffic shaping, compares major MQ products, and provides practical guidance on high availability, preventing loss, duplicate consumption, ordering, backlog handling, and expiration.

Kafka · Message Queue · RabbitMQ

0 likes · 8 min read


IT Architects Alliance

Feb 2, 2022 · Fundamentals

Strategic and Tactical Design Principles for Technical Architecture

This article explains how technical architecture transforms product requirements into implementation by addressing layering, language choices, and non‑functional concerns, and introduces strategic principles of suitability, simplicity, and evolution along with tactical guidelines for high concurrency, high availability, and business design.

System Design · design principles · high availability

0 likes · 14 min read


IT Architects Alliance

Jan 27, 2022 · Operations

How to Build a Highly Available Redis Service with Sentinel – Step‑by‑Step Guide

This article explains why Redis needs high availability, defines failure scenarios, compares common HA solutions, and walks through four deployment patterns—from a single instance to a three‑Sentinel architecture—highlighting their trade‑offs and practical implementation details.

Service Architecture · backend operations · high availability

0 likes · 13 min read


Architect

Jan 25, 2022 · Databases

Designing a High‑Availability Redis Service with Sentinel

This article explains why Redis needs high availability, defines failure scenarios, compares several HA architectures—including single‑instance, master‑slave with one or multiple Sentinel processes, and a three‑node solution with a virtual IP—and provides practical guidance for building a reliable Redis service.

Operations · high availability · redis

0 likes · 12 min read


Top Architect

Jan 25, 2022 · Big Data

Elasticsearch Cluster Deployment and Management Guide (Mac/Windows)

This article explains why Elasticsearch should run in a cluster, describes the cluster concept, provides step‑by‑step configuration for three nodes on macOS/Windows, demonstrates health checks, failover, horizontal scaling, routing calculations, shard control, and the read/write workflow, all illustrated with code snippets and screenshots.

Cluster · Elasticsearch · high availability

0 likes · 10 min read


Architecture Digest

Jan 23, 2022 · Operations

Comprehensive Guide to Installing Nginx, Configuring Reverse Proxy, Load Balancing, SSL, Keepalived, and LVS High‑Availability

This tutorial walks through installing Nginx, setting up upstream reverse‑proxy rules, configuring various load‑balancing algorithms, enabling SSL, deploying Keepalived for failover, and building an LVS‑DR high‑availability cluster with detailed commands and configuration examples.

LVS · SSL · high availability

0 likes · 24 min read

IT Architects Alliance

Jan 20, 2022 · Cloud Native

How to Build a High‑Availability Microservices System on Kubernetes – A Complete Guide

This guide walks you through designing a simple microservice architecture with separated front end and back end, implementing it with Java Spring Boot, deploying multiple instances with Eureka, adding Prometheus‑Grafana monitoring, logging, tracing, flow control, and finally installing Kubernetes using K8seasy and verifying high‑availability across the cluster.

Cloud NativeKubernetesMicroservices

0 likes · 19 min read


Alibaba Cloud Native

Jan 19, 2022 · Cloud Computing

How to Achieve Service Discovery High Availability with Push‑Empty Protection in MSE

This article walks through a real‑world Kubernetes outage caused by DNS and Nacos client bugs, explains the chain of failures, and presents a failure‑oriented design that adds push‑empty protection and outlier removal using Alibaba Cloud MSE to keep microservices highly available.

KubernetesMSEMicroservices

0 likes · 17 min read


Tencent Database Technology

Jan 19, 2022 · Databases

Deep Dive into Tencent's Self‑Developed MySQL Kernel TXSQL and Its Architecture

This article provides a comprehensive overview of Tencent's self‑developed MySQL kernel TXSQL, covering its evolution, overall architecture, columnar storage engine, instant DDL capabilities, enterprise‑grade features, high‑availability mechanisms, performance optimizations, and the rigorous development and testing processes behind the product.

Columnar StorageTXSQLcloud database

0 likes · 11 min read


DeWu Technology

Jan 19, 2022 · Operations

Common High‑Availability Architecture Patterns and Multi‑Active Deployment Strategies

Covering essential high‑availability techniques, the article examines disaster‑recovery architectures from same‑city dual‑center to cross‑country active‑passive deployments, compares five patterns, details three multi‑active models, outlines required traffic‑scheduling, replication, and database layers, and provides design methodology, practical safeguards, and key HA metrics.

Distributed Systems · data replication · disaster recovery

0 likes · 23 min read


Top Architect

Jan 15, 2022 · Backend Development

Technical Architecture Design Principles: Strategy and Tactics for Backend Systems

This article explains how to design robust backend technical architectures by addressing strategic principles such as suitability, simplicity, and evolution, and tactical guidelines covering high concurrency, high availability, and business design, while illustrating logical and physical architecture diagrams and practical implementation tips.

Software Architecture · System Design · Technical architecture

0 likes · 14 min read


IT Architects Alliance

Jan 15, 2022 · Backend Development

How to Build a Million‑Message‑Per‑Second RabbitMQ Cluster: Lessons from Google and Real‑World Experiments

This article explains the fundamentals of RabbitMQ, compares normal and mirrored cluster modes, details Google’s large‑scale test setup, and walks through advanced plugins such as sharding, consistent‑hash exchange, federation, and high‑availability strategies for achieving million‑level message throughput.

Backend · Message Queue · RabbitMQ

0 likes · 24 min read


Architecture Digest

Jan 13, 2022 · Backend Development

Scaling RabbitMQ to Million‑Message Throughput: Architecture, Sharding, Federation, and High‑Availability Practices

This article explains how to horizontally scale RabbitMQ clusters to handle millions of messages per second by leveraging cluster modes, mirrored queues, sharding plugins, consistent‑hash exchanges, federation, and high‑availability configurations, while also covering practical scenarios such as retries, delayed tasks, and Spring AMQP integration.

Federation · Message Queue · RabbitMQ

0 likes · 22 min read


IT Xianyu

Jan 9, 2022 · Operations

Comprehensive Guide to Installing Nginx, Configuring Reverse Proxy, Load Balancing, SSL, Keepalived, LVS, and High‑Availability Clusters

This tutorial walks through installing Nginx from source, setting up upstream reverse‑proxy groups, configuring various load‑balancing methods (weight, IP hash, URL hash, least connections), enabling SSL, deploying Keepalived for failover, and building an LVS‑DR high‑availability cluster with detailed command‑line examples.

LVS · Nginx · SSL

0 likes · 23 min read


Top Architect

Jan 8, 2022 · Operations

High‑Availability Architecture Practices from Bilibili: Load Balancing, Rate Limiting, Retries, and Timeout Strategies

This article presents Bilibili’s high‑availability design, covering load‑balancing decisions, subset selection, multi‑cluster deployment, adaptive rate limiting, retry policies, timeout propagation, and chain‑failure mitigation, all illustrated with diagrams and practical SRE insights.

Backend · Retry · SRE

0 likes · 15 min read


Architects Research Society

Jan 7, 2022 · Databases

High‑Availability Clustering Solutions for PostgreSQL

This article explains the concepts of high availability, continuous recovery, and standby databases, then reviews various PostgreSQL clustering options such as DRBD, ClusterControl, Rubyrep, Pgpool‑II, Bucardo, Postgres‑XC, Citus, and Postgres‑XL, highlighting their features, advantages, and drawbacks.

ClusterControl · DRBD · Database Replication

0 likes · 16 min read


Alibaba Cloud Native

Jan 6, 2022 · Cloud Native

How to Secure High‑Availability Traffic with AHAS Ingress on Kubernetes

This guide explains the AHAS Application High Availability Service, its traffic‑funnel protection principles, and step‑by‑step configuration of Ingress/Nginx traffic control in an Alibaba Cloud ACK cluster, including request grouping, flow‑control rules, and performance testing.

high availability · traffic control

0 likes · 7 min read


Ctrip Technology

Jan 6, 2022 · Cloud Native

High‑Availability Architecture and Performance Optimizations for Service Mesh at Ctrip

This article describes Ctrip's cloud‑native Service Mesh deployment, detailing its multi‑IDC high‑availability design, fault‑scenario analysis, xDS push metrics, event‑handling optimizations, cold‑start improvements, and progressive canary release strategies to ensure reliable, scalable service traffic management.

Cloud Native · Service Mesh · canary release

0 likes · 16 min read


Open Source Linux

Jan 6, 2022 · Operations

Disaster Recovery Explained: Definitions, Strategies, and Implementation

This article provides a comprehensive guide to disaster recovery, covering its definition, the distinction between backup and DR, various protection strategies, measurement metrics such as RPO and RTO, and practical implementation methods across storage, cloud, and network layers.

Backup · Data Protection · RPO

0 likes · 16 min read


Su San Talks Tech

Jan 5, 2022 · Operations

Why Simple Load Balancing Fails and How to Build a Scalable Multi‑Layer Architecture

This article walks through the evolution from a single‑server Tomcat setup to a multi‑layer architecture using Nginx, a gateway, and LVS, explaining dynamic/static request separation, high‑availability strategies, and the performance trade‑offs that guide scalable backend design.

Backend Architecture · LVS · Nginx

0 likes · 11 min read


Architects' Tech Alliance

Jan 1, 2022 · Operations

Disaster Recovery (DR) Fundamentals: Definitions, Roles, Metrics, and Implementation

This article provides a comprehensive overview of disaster recovery, covering its definition, the distinction between backup and DR, their respective roles, key metrics such as RPO and RTO, various replication technologies, and practical implementation methods across storage, network, and host layers.

Backup · RPO · RTO

0 likes · 20 min read


Architects Research Society

Jan 1, 2022 · Cloud Native

Running Kubernetes Across Multiple Failure Zones

This article explains how Kubernetes clusters can be deployed across multiple failure zones and regions, detailing control plane replication, node labeling, pod topology constraints, storage zone awareness, network considerations, and disaster recovery strategies to achieve high availability in cloud‑native environments.

Cloud Native · Cluster Design · Kubernetes

0 likes · 8 min read


Tencent Architect

Dec 30, 2021 · Databases

Practices and Exploration of Disaster Recovery in Tencent Cloud‑Native Database TDSQL‑C (formerly CynosDB)

This article examines the architecture differences between cloud‑native TDSQL‑C and traditional MySQL, outlines TDSQL‑C’s elastic, serverless, low‑latency features, compares MySQL disaster‑recovery models, and details the multi‑dimensional disaster‑recovery system and its cross‑AZ/Region challenges and solutions.

TDSQL-C · cloud-native database · disaster recovery

0 likes · 9 min read


Architecture Digest

Dec 28, 2021 · Big Data

HDFS Overview: Architecture, Features, Data Management and Storage Policies

This article provides a comprehensive overview of HDFS, covering basic file system concepts, HDFS architecture, high availability, federation, replica placement, storage policies, colocation, data integrity, and key design considerations for large‑scale distributed storage.

Big Data · Colocation · Distributed File System

0 likes · 23 min read


IT Architects Alliance

Dec 24, 2021 · Industry Insights

Why Open Source Is the Future of DevOps Platforms: Key Strategies for Cloud‑Native Success

The article outlines strategic thinking for DevOps product planning, emphasizing open‑source logic, automated installation, high‑availability design, hybrid‑cloud bridge gateways, and the shift from raw resources to declarative service consumption in cloud‑native environments.

Cloud Native · automation · high availability

0 likes · 10 min read


Alibaba Cloud Native

Dec 23, 2021 · Cloud Native

Designing High‑Availability for Microservices: Service Discovery & Config Management Best Practices

This article walks through a real‑world microservice outage, analyzes the risk chain, presents four high‑availability strategies, details service‑discovery and configuration‑management HA designs, and provides a step‑by‑step Kubernetes demo with code, monitoring, fault injection and results.

Configuration Management · Microservices · high availability

0 likes · 20 min read


High Availability Architecture

Dec 23, 2021 · Fundamentals

Master Data Management Architecture and Practices for Baidu Smart Mini Programs

This article presents a comprehensive overview of master data management concepts, maturity levels, and the challenges faced by Baidu smart mini‑programs, followed by a detailed practical architecture design—including domain modeling, high‑availability microservice implementation, performance optimization, and data synchronization—while also discussing future extensions and team capability building.

Baidu Mini Programs · Data Architecture · Master Data Management

0 likes · 14 min read


Top Architect

Dec 22, 2021 · Operations

Load Balancing: Principles, Types, and Algorithms

This article explains the fundamentals of load balancing, covering its purpose, vertical and horizontal scaling, various classifications such as DNS, IP, link‑layer and hybrid methods, common algorithms like round‑robin and weighted, as well as hardware solutions, providing a comprehensive guide for building scalable, high‑availability systems.

Algorithms · Distributed Systems · high availability

0 likes · 13 min read


IT Architects Alliance

Dec 22, 2021 · Industry Insights

Mastering Technical Architecture: Strategic & Tactical Design Principles for Scalable Systems

This article explains how to transform product requirements into robust technical architectures by addressing uncertainty through strategic principles—suitability, simplicity, evolution—and tactical guidelines covering high concurrency, high availability, and business design, illustrated with logical and physical diagrams.

Scalability · Software Architecture · design principles

0 likes · 14 min read


21CTO

Dec 20, 2021 · Fundamentals

Mastering Software Architecture: Strategic & Tactical Design Principles

This article explores how to transform product requirements into robust technical architectures by addressing uncertainty, outlining strategic principles—appropriateness, simplicity, evolution—and tactical guidelines for high concurrency, high availability, and business design, while illustrating logical and physical architecture diagrams.

Software Architecture · System Design · design principles

0 likes · 14 min read


Python Programming Learning Circle

Dec 20, 2021 · Operations

Guide to Installing and Configuring Keepalived for High Availability Using VRRP

This tutorial explains how to achieve high availability with keepalived by installing the software, configuring VRRP virtual IPs, setting up master and backup nodes, starting the service, and verifying failover through VIP testing on Linux systems.

Linux · System Administration · VRRP

0 likes · 6 min read


HomeTech

Dec 14, 2021 · Databases

TiDB Cross-Data-Center High Availability Using Binlog Bidirectional Replication

This article summarizes the design, working principle, deployment steps, testing results, and future outlook of a TiDB cross-data-center high‑availability solution based on Binlog bidirectional replication, aiming to ensure rapid failover and continuous service between two data‑center clusters.

Bidirectional Replication · Binlog · Cross‑Data‑Center

0 likes · 5 min read


NetEase Smart Enterprise Tech+

Dec 14, 2021 · Backend Development

How NetEase Cloud’s Distributed Recording Cluster Ensures High‑Availability and Scalability

This article explains the architecture and key features of NetEase Cloud's local server‑side recording cluster, detailing how dynamic scaling, multi‑backup high availability, load‑balancing strategies, monitoring, and an embedded registration center enable secure, reliable, and scalable recording for data‑sensitive applications.

Distributed Systems · Java SDK · REST API

0 likes · 11 min read


IT Architects Alliance

Dec 13, 2021 · Operations

Mastering HAProxy: Installation, L7/L4 Load Balancing, and High‑Availability Setup

This comprehensive guide explains what HAProxy is, its core capabilities and performance characteristics, walks through installing and configuring it on CentOS 7 for both L7 and L4 load‑balancing scenarios, and shows how to achieve high availability using Keepalived, complete with practical code snippets and sysctl tuning.

HAProxy · L4 · L7

0 likes · 29 min read


IT Architects Alliance

Dec 11, 2021 · Databases

Mastering Redis Replication and Sentinel: Solving Failover Challenges

This article examines the limitations of Redis master‑slave replication, explains how Redis Sentinel addresses those issues with monitoring, notification, and automatic failover, and provides detailed configuration commands, discovery mechanisms, and step‑by‑step failover procedures for building a highly available Redis deployment.

Configuration · Replication · database

0 likes · 12 min read


Top Architect

Dec 10, 2021 · Operations

Comprehensive Guide to Load Balancing: Principles, Types, Algorithms, and Hardware

This article explains the fundamentals of load balancing, covering why it is needed for high‑traffic services, the difference between vertical and horizontal scaling, various load‑balancing techniques (DNS, HTTP, IP, link‑layer, hybrid), common algorithms, and the trade‑offs of software versus hardware solutions.

Distributed Systems · Networking · Operations

0 likes · 13 min read


Qingyun Technology Community

Dec 7, 2021 · Databases

Master PostgreSQL Replication with repmgr: A Complete Guide

This article introduces repmgr, an open‑source PostgreSQL replication manager, covering its architecture, election mechanism, core tools, metadata tables, installation steps, command syntax, configuration options, and common operations for building high‑availability database clusters.

Replication · database clustering · high availability

0 likes · 8 min read


NiuNiu MaTe

Dec 7, 2021 · Databases

Master‑Slave Replication in Redis: How It Works and How to Prevent Data Loss

This article explains why a single‑instance Redis can cause outages, introduces the master‑slave architecture, details the full and incremental synchronization processes, shows how to configure replication, addresses multi‑slave scaling, network interruptions, and automatic failover with Sentinel.

Master‑Slave · Replication · database

0 likes · 11 min read


Practical DevOps Architecture

Dec 5, 2021 · Databases

Deploying MHA for MySQL High Availability – Part 1

This guide walks through the step‑by‑step deployment of MHA on a MySQL cluster, covering package installation on all nodes, copying and installing the MHA RPMs, creating the required MySQL user, configuring MHA, testing SSH connectivity, and reviewing the failover script.

Linux · MHA · database

0 likes · 6 min read


MaGe Linux Operations

Dec 1, 2021 · Operations

Scalable High‑Availability Prometheus: Small‑Scale to Massive Deployments

This article explains how Prometheus’s local storage limits scalability and how Remote Storage, federation, and high‑availability setups—using dual instances, keepalived, and adapters with PostgreSQL + TimescaleDB—can overcome data persistence and performance challenges for both small‑scale and large‑scale monitoring environments.

Federation · Prometheus · Remote Storage

0 likes · 5 min read


Qunar Tech Salon

Dec 1, 2021 · Databases

PostgreSQL High Availability (PGHA) at Qunar: Architecture, Customization, Testing, and Metrics

This article details Qunar's implementation of PostgreSQL high‑availability using Patroni, covering solution selection, custom DCS and failover mechanisms, operational impact, comprehensive testing procedures, performance metrics, and future directions for cross‑region HA deployment.

Database operations · HA Testing · Metrics

0 likes · 11 min read


Software Development Quality

Nov 29, 2021 · Backend Development

Designing Scalable, High‑Performance Architecture for Large‑Scale Websites

Large‑scale website architecture must balance massive user traffic, data volume, security threats, and rapid feature changes by adopting layered, distributed designs that emphasize high performance, high availability, scalability, extensibility, and agility, employing techniques such as caching, load balancing, clustering, sharding, and service‑oriented components.

Microservices · Scalability · caching

0 likes · 22 min read


IT Architects Alliance

Nov 26, 2021 · Operations

Large-Scale Distributed Website Architecture: Principles, Patterns, and Practices

This article provides a comprehensive technical summary of large‑scale distributed website architecture, covering characteristics, goals, architectural patterns, performance, high‑availability, scalability, extensibility, security, agility, and a detailed evolution roadmap with practical examples and recommendations.

Distributed Systems · Scalability · architecture

0 likes · 22 min read


IT Architects Alliance

Nov 24, 2021 · Operations

Designing High‑Availability, High‑Performance, Scalable and Secure Architecture for Large Web Applications

This article explains how to evolve a large‑scale website architecture through stages such as initial single‑server setups, application‑data separation, caching, server clustering, read‑write separation, CDN/reverse proxy, distributed storage, micro‑services, and automation to achieve high availability, scalability, performance and security.

Distributed Systems · Scalability · architecture

0 likes · 21 min read


Java Architect Essentials

Nov 22, 2021 · Databases

10 Best Practices for Using Redis Effectively

This article outlines ten essential Redis best‑practice tips, covering why to avoid the KEYS * command, using SCAN, interpreting INFO stats, leveraging hashes, setting key expirations, choosing eviction policies, handling errors, scaling with clusters, CPU considerations, and ensuring high availability with Sentinel.

best practices · databases · high availability

0 likes · 8 min read


Architects' Tech Alliance

Nov 22, 2021 · Operations

How to Build a High‑Availability, High‑Performance, Scalable Web Architecture

This article analyzes the evolution of large‑scale website architecture, covering stages from single‑server setups to layered, distributed, and clustered designs, and explains how caching, read‑write separation, CDN, asynchronous messaging, redundancy, automation, and security collectively achieve high performance, availability, scalability, and extensibility.

Distributed Systems · Scalability · architecture

0 likes · 21 min read


Full-Stack DevOps & Kubernetes

Nov 21, 2021 · Cloud Native

Boost Kubernetes Ingress Performance: Tuning Nginx Keep‑Alive for Double QPS

A Kubernetes‑deployed business app sees its QPS drop from over 100k with a NodePort service to about 50k when exposed via Ingress, but adjusting Nginx keep‑alive parameters in the ingress‑controller can restore and even exceed the original performance while also enabling high availability.

Ingress · Kubernetes · Nginx

0 likes · 4 min read


IT Architects Alliance

Nov 19, 2021 · Backend Development

Technical Summary of Large‑Scale Distributed Website Architecture

This article provides a comprehensive overview of the design principles, architectural patterns, performance, availability, scalability, security, and operational considerations for building large distributed web sites, illustrated with a step‑by‑step evolution from a single‑server setup to a multi‑layer, cloud‑native architecture.

Distributed Systems · Microservices · Scalability

0 likes · 22 min read


NiuNiu MaTe

Nov 17, 2021 · Databases

Mastering MySQL Disaster Recovery: Replication Modes and Strategies

This article explains MySQL disaster‑recovery techniques, covering cold and hot backups, same‑city versus remote setups, master‑slave topologies, async, semi‑sync and full‑sync replication, the MAR strong‑sync approach, and practical recommendations for building resilient two‑city three‑center architectures.

Replication · database · disaster recovery

0 likes · 10 min read


Open Source Linux

Nov 13, 2021 · Operations

How to Build High‑Availability Load Balancing with Keepalived and HAProxy

This guide explains how to configure Keepalived and HAProxy on Linux to achieve software load balancing and high availability, covering installation, core features, VRRP-based failover, health checks, session persistence, SSL offloading, and traffic routing with practical configuration examples.

HAProxy · Linux · high availability

0 likes · 25 min read


Full-Stack Internet Architecture

Nov 12, 2021 · Databases

Implementing High‑Availability PostgreSQL with Keepalived: Architecture, Setup, and Failover Procedures

This article explains how to use Keepalived together with PostgreSQL to build a two‑node high‑availability cluster, covering Keepalived's VRRP mechanism, host planning, installation steps, asynchronous master‑slave replication configuration, monitoring scripts, and detailed failover drills.

Database Replication · VRRP · failover

0 likes · 20 min read


Full-Stack Internet Architecture

Nov 11, 2021 · Databases

Understanding Redis Sentinel: Architecture, Configuration, and Automatic Failover

This article explains Redis Sentinel’s role in high‑availability deployments, covering its architecture, monitoring and notification mechanisms, automatic failover process, configuration steps for master‑slave and sentinel nodes, and practical guidelines for building a reliable Redis cluster.

Operations · database · failover

0 likes · 21 min read


Qingyun Technology Community

Nov 10, 2021 · Cloud Computing

How to Build High‑Concurrency, High‑Performance, High‑Availability Apps on the Cloud

This article explains the concepts of high concurrency, high performance, and high availability, and demonstrates how to design cloud‑native IaaS, PaaS, and SaaS layers to keep internet services resilient during massive traffic spikes such as Double‑11.

Distributed Systems · cloud computing · high availability

0 likes · 14 min read


Baidu Geek Talk

Nov 10, 2021 · Operations

How etcd Powers Scalable Service Governance: Raft, BoltDB, and Real‑World Practices

This article explores service governance fundamentals, examines why etcd’s Raft‑based consensus and BoltDB storage make it ideal for large‑scale systems, compares it with ZooKeeper and Consul, and shares Baidu’s practical architecture, performance tricks, and operational metrics for high‑availability, high‑performance service management.

BoltDB · Distributed Systems · Raft consensus

0 likes · 23 min read


Java Architect Essentials

Nov 8, 2021 · Operations

Scaling RabbitMQ to Million‑Message Throughput: Architecture, Sharding, Federation, and High Availability

This article explains how to horizontally scale RabbitMQ clusters to achieve million‑message per second throughput, covering Google’s large‑scale test setup, sharding and consistent‑hash plugins, federation, high‑availability mirroring, reliability mechanisms, and practical deployment tips for production environments.

RabbitMQ · high availability · sharding

0 likes · 23 min read

Architect

Nov 8, 2021 · Operations

Designing High Availability for Canal Using Zookeeper: Distributed Locks and Watch Mechanism

This article explains how to achieve high availability for Canal by designing a Zookeeper‑based distributed lock and watch mechanism, covering primary‑backup role election, failure detection, thundering‑herd mitigation, fair locking, node types, watcher events, and practical Zookeeper applications such as service registration and configuration management.

CanalZooKeeperdistributed-lock

0 likes · 13 min read


Alibaba Terminal Technology

Nov 5, 2021 · Mobile Development

How Alipay’s Mobile Client Uses Fuzz Testing to Prevent Crashes

This article describes Alipay’s client‑side high‑availability strategy that combines offline risk mining, function‑interface “minesweeping”, RPC/config/jsapi checks, and automated fuzz testing on Android and iOS to detect and eliminate crash‑inducing bugs before release.

automation · client stability · function interface

0 likes · 7 min read
