Tagged articles
1414 articles
HelloTech
Apr 29, 2022 · Operations

Multi-Active Deployment Modes, Routing Rules, and Unitization in Data Center Architecture

The article explains multi‑active deployment modes—same‑city, cross‑city, and cross‑city multi‑active—along with routing rules (random, user‑ID, region) and unitization, describing how sharding IDs map traffic to central or unit data centers, and detailing middleware solutions for storage, messaging, SOA, and Snowflake ID generation to achieve scalable, highly available architecture.

Active-Active · data center · high availability
0 likes · 10 min read
Big Data Technology & Architecture
Apr 26, 2022 · Big Data

ByteDance's Internal Presto OLAP Engine: Deployment, Performance Boosts, and Operational Practices

The article details ByteDance's large‑scale deployment of the Presto OLAP engine for ad‑hoc, BI, and near‑real‑time analytics, describing its architecture, multi‑coordinator high‑availability design, routing gateway, adaptive cancel, history server, materialized‑view support, Hudi connector integration, and how these innovations improve performance, stability, and operational efficiency.

Big Data · Hudi Connector · Materialized Views
0 likes · 11 min read
Architect
Apr 24, 2022 · Backend Development

Comprehensive Nginx Tutorial: Installation, Configuration, Reverse Proxy, Load Balancing, and High Availability

This guide provides a step‑by‑step tutorial on Nginx, covering its overview, single‑instance installation, reverse proxy setup, load‑balancing configuration, static‑dynamic separation, high‑availability clustering with keepalived, detailed configuration directives, and performance considerations, complete with command‑line examples and code snippets.

Linux · Nginx · high availability
0 likes · 30 min read
macrozheng
Apr 14, 2022 · Operations

Mastering High Availability: 4 Essential Design Techniques for Scalable Systems

This article outlines the core high‑availability techniques—system splitting, decoupling, asynchronous processing, retry, compensation, backup, multi‑active strategies, isolation, rate limiting, circuit breaking, and degradation—providing practical guidance for designing resilient, scalable backend architectures in large‑scale internet applications.

Distributed Systems · Microservices · System Design
0 likes · 13 min read
Liangxu Linux
Apr 12, 2022 · Operations

Mastering Nginx: Installation, Core Configuration, and High‑Availability Setup

This comprehensive guide explains Nginx's role as a high‑performance web and reverse‑proxy server, details its installation packages and step‑by‑step build process, breaks down the main nginx.conf sections, and demonstrates practical configurations for reverse proxy, load balancing, static‑dynamic separation, worker tuning, and high‑availability clustering.

Linux · Nginx · Web server
0 likes · 15 min read
DataFunTalk
Apr 12, 2022 · Big Data

Kuaishou Big Data Task Scheduling System: Architecture, Challenges, and Key Technologies

This article presents Kuaishou's large‑scale big‑data task scheduling system, describing its evolution from Airflow to the self‑developed Kwaiflow, the performance and reliability challenges of handling hundreds of thousands of tasks, and the design decisions that achieve low latency, high availability, and strong open capabilities.

Distributed Systems · Kuaishou · Kwaiflow
0 likes · 22 min read
HomeTech
Apr 6, 2022 · Databases

MySQL High Availability Architecture and Practices at AutoHome

This article explains MySQL high‑availability concepts, defines HA, RPO and RTO, outlines common HA architectures such as master‑slave+VIP, MHA and MGR+Proxy, and details AutoHome's evolution from simple master‑slave setups to a container‑based MGR solution with automated failover and monitoring platforms.

Kubernetes · MGR · MHA
0 likes · 11 min read
21CTO
Apr 3, 2022 · Backend Development

How We Achieved 20k+ TPS High Availability for a Billion‑User Membership System

This article details the design and implementation of a highly available, high‑performance membership system serving over a billion users, covering Elasticsearch dual‑center clusters, traffic isolation, Redis caching, MySQL migration, and fine‑grained flow‑control and degradation strategies.

Elasticsearch · Scalability · System Architecture
0 likes · 21 min read
Architects' Tech Alliance
Apr 2, 2022 · Industry Insights

How Financial Institutions Secure Database Continuity: Disaster Recovery Strategies & Market Trends

This article examines the critical role of databases in finance, defines disaster recovery and backup concepts, outlines industry requirements and regulations, analyzes market growth, and compares distributed database disaster‑recovery architectures such as single‑center, city‑level mutual backup, active‑active, and two‑site three‑center solutions.

Backup · Distributed Systems · Financial Services
0 likes · 15 min read
dbaplus Community
Apr 1, 2022 · Databases

How iQIYI Built a Scalable OLTP Data Center to Eliminate Data Silos

This article details iQIYI's design and implementation of a unified OLTP data center that consolidates data across business lines, solves data‑island issues, ensures strong consistency between MongoDB and Elasticsearch, and provides high‑availability, massive‑scale storage for billions of records.

Data Architecture · Elasticsearch · MongoDB
0 likes · 12 min read
21CTO
Mar 31, 2022 · Operations

What Caused the Biggest 2021 Outages? Lessons from Bilibili, Facebook, AWS, and More

The article reviews ten major 2021 service outages—from Chinese platforms like Bilibili and Futu to global giants such as Facebook, Roblox, and AWS—analyzing their root causes, redundancy failures, and the operational lessons needed to prevent future black‑swans.

high availability · incident response · outage analysis
0 likes · 15 min read
Top Architect
Mar 31, 2022 · Operations

Comprehensive Nginx Tutorial: Reverse Proxy, Load Balancing, Static/Dynamic Separation, and High Availability with Keepalived

This article provides a detailed guide on using Nginx for high‑performance HTTP serving, reverse proxying, load balancing, static‑dynamic separation, installation commands, configuration file structure, practical examples with Tomcat back‑ends, and setting up high‑availability using Keepalived, complete with code snippets and diagrams.

Server Configuration · high availability · keepalived
0 likes · 10 min read
Java Interview Crash Guide
Mar 31, 2022 · Backend Development

How We Achieved 20k TPS High‑Availability for a Billion‑User Membership System

This article details the design and implementation of a highly available, high‑performance membership system serving billions of users, covering Elasticsearch dual‑center clusters, traffic‑isolated architectures, deep ES optimizations, Redis caching with distributed locks, dual‑center MySQL partitioning, migration strategies, abnormal account handling, and future fine‑grained flow‑control and degradation policies.

Distributed Systems · Elasticsearch · Scalability
0 likes · 20 min read
Senior Brother's Insights
Mar 29, 2022 · Backend Development

Dual‑Center Elasticsearch & Multi‑Cluster Redis Power 20k+ TPS for Billion‑User Membership

This article explains how a large‑scale membership system serving over a billion users achieved high performance and availability by deploying dual‑center Elasticsearch clusters, traffic‑isolated ES clusters, deep ES optimizations, a Redis caching layer with dual‑center replication, and a seamless migration from SQL Server to sharded MySQL, while also outlining future fine‑grained flow‑control and degradation strategies.

Backend Architecture · high availability · redis
0 likes · 20 min read
Java Interview Crash Guide
Mar 29, 2022 · Cloud Native

How to Build a Scalable Service Registry for Microservices

This article explains how to design a service registry that enables service registration, discovery, high availability, and dynamic handling of service instances in a microservice architecture, covering registration methods, consumer/provider interaction, push/pull mechanisms, long‑polling, and heartbeat health checks.

Microservices · high availability · long polling
0 likes · 8 min read
Alibaba Cloud Developer
Mar 28, 2022 · Operations

How to Implement Robust Rate Limiting with Alibaba Cloud AHAS

This guide explains how to use Alibaba Cloud's Application High Availability Service (AHAS) to monitor QPS, define granular rate‑limiting rules, prevent abuse, isolate upstream failures, and protect both HTTP and non‑HTTP workloads in microservice architectures.

AHAS · Alibaba Cloud · Microservices
0 likes · 10 min read
IT Architects Alliance
Mar 23, 2022 · Operations

Designing High‑Performance, Highly‑Available, Scalable E‑Commerce Architecture

This article provides a comprehensive technical guide on building large‑scale distributed websites, covering characteristics, architectural goals, patterns, performance, high‑availability, scalability, extensibility, security, agility, and a detailed e‑commerce case study with practical diagrams and capacity estimations.

Distributed Systems · Scalability · caching
0 likes · 26 min read
Top Architect
Mar 20, 2022 · Backend Development

High‑Availability Architecture for a Membership System: Elasticsearch Dual‑Center Cluster, Redis Caching, MySQL Migration, and Flow‑Control Strategies

The article details a comprehensive high‑availability solution for a large‑scale membership system, covering Elasticsearch dual‑center master‑slave clusters, traffic‑isolated three‑cluster designs, deep ES optimizations, Redis caching with consistency safeguards, MySQL partitioned migration, and fine‑grained flow‑control and degradation mechanisms.

Elasticsearch · Flow Control · high availability
0 likes · 19 min read
Tencent Cloud Developer
Mar 18, 2022 · Industry Insights

How Tencent Cloud WeDa Low‑Code Platform Enables Secure, Scalable Enterprise Apps

This article provides an in‑depth analysis of Tencent Cloud WeDa low‑code platform, covering its definition, evolution, market adoption, core capabilities, architecture, backend practices, development workflow, high‑availability design, and future trends, while explaining why low‑code boosts efficiency and digital transformation.

Digital Transformation · Industry Analysis · cloud computing
0 likes · 19 min read
Architecture Digest
Mar 18, 2022 · Backend Development

High‑Availability Architecture for a Membership System: Elasticsearch Dual‑Center Cluster, Redis Caching, and MySQL Migration

This article details the design and implementation of a high‑performance, highly available membership system, covering Elasticsearch dual‑center master‑slave clusters, traffic‑isolated three‑cluster ES architecture, Redis cache strategies, MySQL dual‑center partitioning, seamless migration, abnormal member handling, and fine‑grained flow‑control and degradation policies.

Elasticsearch · Flow Control · System Architecture
0 likes · 20 min read
Tencent Architect
Mar 16, 2022 · Cloud Computing

Tencent RegionEIP: High‑Performance Networking with X86 & P4

This article explains how Tencent's RegionEIP combines X86‑based load distributors and P4 programmable switches to deliver high‑performance, highly available public network access for cloud services, detailing zone disaster recovery, traffic‑shaping algorithms, four‑level routing priority and port‑redundancy designs.

P4 · Tencent Cloud · cloud networking
0 likes · 14 min read
Cloud Native Technology Community
Mar 15, 2022 · Databases

How to Build a High‑Availability MySQL PXC Cluster: Installation & Features

This guide explains the Percona XtraDB Cluster (PXC) architecture, its advantages and limitations, and provides step‑by‑step commands for removing MariaDB, opening firewall ports, disabling SELinux, downloading packages, configuring MySQL, bootstrapping the first node, adding additional nodes, and verifying the cluster status.

Cluster · Installation · PXC
0 likes · 8 min read
21CTO
Mar 13, 2022 · Backend Development

How Meituan Built a Fault‑Tolerant Instant Logistics Platform at Scale

Meituan’s instant logistics platform evolved from vertical services to a micro‑service, distributed architecture that handles massive order‑rider matching, ultra‑low latency, and high availability, leveraging AI for pricing, ETA, scheduling, and employing robust scaling, consistency, and disaster‑recovery techniques.

AI · Distributed Systems · Logistics
0 likes · 10 min read
AntTech
Mar 12, 2022 · Operations

Evolution of Large‑Scale Distributed System Stability at Ant Group

The article outlines Ant Group's multi‑stage journey of building large‑scale distributed system stability, describing architectural evolutions, risk‑inspection mechanisms, high‑availability solutions such as LDC and fine‑grained traffic scheduling, and intelligent risk‑defense products that together enable resilient, cost‑effective operations.

Cloud Native · Distributed Systems · Operations
0 likes · 15 min read
Efficient Ops
Mar 6, 2022 · Operations

Mastering Redis Sentinel: Build High‑Availability Clusters Step‑by‑Step

This article explains Redis Sentinel’s role in achieving high availability, details its core functions, underlying Raft‑based algorithm, configuration parameters, practical setup steps, fault‑tolerance mechanisms, quorum and majority calculations, and demonstrates failover and recovery scenarios with real command‑line examples.

failover · high availability · redis
0 likes · 20 min read
Kuaishou Big Data
Mar 3, 2022 · Big Data

How Kwai’s OneService Platform Revolutionizes Data Service Development

The article details Kwai’s OneService platform—a low‑code, self‑service data platform that streamlines API creation, deployment, and operation, covering its background, architecture, key technologies such as API matrix and configuration‑as‑code, high‑availability and performance strategies, achieved results, and future roadmap.

API Service · Data Platform · high availability
0 likes · 20 min read
dbaplus Community
Mar 1, 2022 · Databases

MHA Re-Edition: Modern MySQL HA with GTID Failover and Auto Switch

The MHA Re-Edition tool revives the discontinued MHA manager for MySQL, adding GTID‑based failover, password‑only SSH authentication, lightweight binaries, VIP migration, WeChat alerts, remote‑card reboot, and detailed configuration options, with step‑by‑step deployment instructions and sample app1.cnf parameters for high‑availability clusters.

GTID · MHA · database
0 likes · 11 min read
Efficient Ops
Feb 23, 2022 · Operations

Why a Single Kafka Broker Crash Can Halt All Consumers – The HA Explained

An in‑depth look at Kafka’s high‑availability architecture reveals how multi‑replica redundancy, the ISR mechanism, and the configuration of the __consumer_offsets topic interact, explaining why a single broker failure can render the entire cluster unusable and how to configure replication and acks settings to prevent it.

ACK · Consumer Offset · ISR
0 likes · 10 min read
IT Architects Alliance
Feb 22, 2022 · Backend Development

Designing a Scalable, High‑Performance Distributed E‑Commerce Architecture: A Technical Guide

This article provides a comprehensive technical overview of large‑scale distributed website architecture, covering characteristics, goals, patterns, high‑performance, high‑availability, scalability, extensibility, security, and agility considerations, and walks through the evolution of an e‑commerce system with concrete examples and diagrams.

Distributed Systems · Microservices · Scalability
0 likes · 26 min read
Tencent Cloud Developer
Feb 21, 2022 · Backend Development

Design and Engineering Practices of a Billion‑Scale Node.js Gateway

Wang Weijia’s talk outlines the architecture and engineering of Tencent CloudBase’s billion‑scale Node.js gateway—built with Nest.js, layered controllers and services, async streaming, keep‑alive connections, a two‑level cache with refresh‑ahead, and HA measures like horizontal scaling, rate limiting, multi‑AZ deployment, and disaster‑recovery caching—delivering 99.98% cache hits, 14 ms median latency, and proving Node.js can power latency‑sensitive services while encouraging front‑end engineers to adopt backend practices.

Cloud Native · Node.js · gateway architecture
0 likes · 33 min read
Ops Development Stories
Feb 17, 2022 · Cloud Native

How to Build a Minimal‑Cost HA Harbor Registry with PostgreSQL Replication on Alibaba Cloud

This guide details a low‑overhead, highly available Harbor deployment on Alibaba Cloud, covering preparation of SLB, ECS, NFS storage, installation of Docker‑Compose, configuration of image mirrors, installation of Harbor 2.3, setup of PostgreSQL 13 master‑slave replication, Redis integration, backup procedures, failover handling, and disaster‑recovery strategies.

Alibaba Cloud · Docker · Harbor
0 likes · 20 min read
Alibaba Cloud Native
Feb 10, 2022 · Cloud Native

How Multi-Active Architecture Can Eliminate Downtime: Inside Alibaba Cloud’s AppActive

Despite widespread cloud adoption, large‑scale outages still occur, prompting Alibaba Cloud’s high‑availability team to share the evolution, principles, and open‑source implementation of multi‑active disaster recovery (AppActive) that aims to achieve minute‑level failover and near‑zero downtime.

Alibaba Cloud · AppActive · disaster recovery
0 likes · 11 min read
Laravel Tech Community
Feb 8, 2022 · Operations

Comprehensive Guide to Installing Nginx, Configuring Reverse Proxy, Load Balancing, SSL, Keepalived, and LVS High‑Availability

This article provides a step‑by‑step tutorial on installing Nginx, setting up reverse‑proxy and various load‑balancing methods, configuring SSL, deploying Keepalived for failover, and building an LVS‑DR high‑availability cluster with detailed command examples and configuration snippets.

LVS · Nginx · SSL
0 likes · 20 min read
Selected Java Interview Questions
Feb 5, 2022 · Backend Development

Message Queue Fundamentals: Use Cases, Product Comparison, High Availability, and Reliability Strategies

This article explains why message queues are used, outlines common scenarios such as decoupling, asynchronous processing and traffic shaping, compares major MQ products, and provides practical guidance on high availability, preventing loss, duplicate consumption, ordering, backlog handling, and expiration.

Kafka · Message Queue · RabbitMQ
0 likes · 8 min read
IT Architects Alliance
Feb 2, 2022 · Fundamentals

Strategic and Tactical Design Principles for Technical Architecture

This article explains how technical architecture transforms product requirements into implementation by addressing layering, language choices, and non‑functional concerns, and introduces strategic principles of suitability, simplicity, and evolution along with tactical guidelines for high concurrency, high availability, and business design.

System Design · design principles · high availability
0 likes · 14 min read
Architect
Jan 25, 2022 · Databases

Designing a High‑Availability Redis Service with Sentinel

This article explains why Redis needs high availability, defines failure scenarios, compares several HA architectures—including single‑instance, master‑slave with one or multiple Sentinel processes, and a three‑node solution with a virtual IP—and provides practical guidance for building a reliable Redis service.

Operations · high availability · redis
0 likes · 12 min read
Top Architect
Jan 25, 2022 · Big Data

Elasticsearch Cluster Deployment and Management Guide (Mac/Windows)

This article explains why Elasticsearch should run in a cluster, describes the cluster concept, provides step‑by‑step configuration for three nodes on macOS/Windows, demonstrates health checks, failover, horizontal scaling, routing calculations, shard control, and the read/write workflow, all illustrated with code snippets and screenshots.

Cluster · Elasticsearch · high availability
0 likes · 10 min read
Architecture Digest
Jan 23, 2022 · Operations

Comprehensive Guide to Installing Nginx, Configuring Reverse Proxy, Load Balancing, SSL, Keepalived, and LVS High‑Availability

This tutorial walks through installing Nginx, setting up upstream reverse‑proxy rules, configuring various load‑balancing algorithms, enabling SSL, deploying Keepalived for failover, and building an LVS‑DR high‑availability cluster with detailed commands and configuration examples.

LVS · SSL · high availability
0 likes · 24 min read
IT Architects Alliance
Jan 20, 2022 · Cloud Native

How to Build a High‑Availability Microservices System on Kubernetes – A Complete Guide

This guide walks you through designing a simple microservice architecture with separate front end and back end, implementing it with Java Spring Boot, deploying multiple instances with Eureka, adding Prometheus‑Grafana monitoring, logging, tracing, and flow control, and finally installing Kubernetes using K8seasy and verifying high availability across the cluster.

Cloud Native · Kubernetes · Microservices
0 likes · 19 min read
Tencent Database Technology
Jan 19, 2022 · Databases

Deep Dive into Tencent's Self‑Developed MySQL Kernel TXSQL and Its Architecture

This article provides a comprehensive overview of Tencent's self‑developed MySQL kernel TXSQL, covering its evolution, overall architecture, columnar storage engine, instant DDL capabilities, enterprise‑grade features, high‑availability mechanisms, performance optimizations, and the rigorous development and testing processes behind the product.

Columnar Storage · TXSQL · cloud database
0 likes · 11 min read
DeWu Technology
Jan 19, 2022 · Operations

Common High‑Availability Architecture Patterns and Multi‑Active Deployment Strategies

Covering essential high‑availability techniques, the article examines disaster‑recovery architectures from same‑city dual‑center to cross‑country active‑passive deployments, compares five patterns, details three multi‑active models, outlines required traffic‑scheduling, replication, and database layers, and provides design methodology, practical safeguards, and key HA metrics.

Distributed Systems · data replication · disaster recovery
0 likes · 23 min read
Top Architect
Jan 15, 2022 · Backend Development

Technical Architecture Design Principles: Strategy and Tactics for Backend Systems

This article explains how to design robust backend technical architectures by addressing strategic principles such as suitability, simplicity, and evolution, and tactical guidelines covering high concurrency, high availability, and business design, while illustrating logical and physical architecture diagrams and practical implementation tips.

Software Architecture · System Design · Technical architecture
0 likes · 14 min read
IT Architects Alliance
Jan 15, 2022 · Backend Development

How to Build a Million‑Message‑Per‑Second RabbitMQ Cluster: Lessons from Google and Real‑World Experiments

This article explains the fundamentals of RabbitMQ, compares normal and mirrored cluster modes, details Google’s large‑scale test setup, and walks through advanced plugins such as sharding, consistent‑hash exchange, federation, and high‑availability strategies for achieving million‑level message throughput.

Backend · Message Queue · RabbitMQ
0 likes · 24 min read
Architecture Digest
Jan 13, 2022 · Backend Development

Scaling RabbitMQ to Million‑Message Throughput: Architecture, Sharding, Federation, and High‑Availability Practices

This article explains how to horizontally scale RabbitMQ clusters to handle millions of messages per second by leveraging cluster modes, mirror queues, sharding plugins, consistent‑hash exchanges, federation, and high‑availability configurations, while also covering practical scenarios such as retries, delayed tasks, and Spring AMQP integration.

Federation · Message Queue · RabbitMQ
0 likes · 22 min read
IT Xianyu
Jan 9, 2022 · Operations

Comprehensive Guide to Installing Nginx, Configuring Reverse Proxy, Load Balancing, SSL, Keepalived, LVS, and High‑Availability Clusters

This tutorial walks through installing Nginx from source, setting up upstream reverse‑proxy groups, configuring various load‑balancing methods (weight, IP hash, URL hash, least connections), enabling SSL, deploying Keepalived for failover, and building an LVS‑DR high‑availability cluster with detailed command‑line examples.

LVS · Nginx · SSL
0 likes · 23 min read
Architects Research Society
Jan 7, 2022 · Databases

High‑Availability Clustering Solutions for PostgreSQL

This article explains the concepts of high availability, continuous recovery, and standby databases, then reviews various PostgreSQL clustering options such as DRBD, ClusterControl, Rubyrep, Pgpool‑II, Bucardo, Postgres‑XC, Citus, and Postgres‑XL, highlighting their features, advantages, and drawbacks.

ClusterControl · DRBD · Database Replication
0 likes · 16 min read
Ctrip Technology
Jan 6, 2022 · Cloud Native

High‑Availability Architecture and Performance Optimizations for Service Mesh at Ctrip

This article describes Ctrip's cloud‑native Service Mesh deployment, detailing its multi‑IDC high‑availability design, fault‑scenario analysis, xDS push metrics, event‑handling optimizations, cold‑start improvements, and progressive canary release strategies to ensure reliable, scalable service traffic management.

Cloud Native · Service Mesh · canary release
0 likes · 16 min read
Architects Research Society
Jan 1, 2022 · Cloud Native

Running Kubernetes Across Multiple Failure Zones

This article explains how Kubernetes clusters can be deployed across multiple failure zones and regions, detailing control plane replication, node labeling, pod topology constraints, storage zone awareness, network considerations, and disaster recovery strategies to achieve high availability in cloud‑native environments.

Cloud Native · Cluster Design · Kubernetes
0 likes · 8 min read
Tencent Architect
Dec 30, 2021 · Databases

Practices and Exploration of Disaster Recovery in Tencent Cloud‑Native Database TDSQL‑C (formerly CynosDB)

This article examines the architecture differences between cloud‑native TDSQL‑C and traditional MySQL, outlines TDSQL‑C’s elastic, serverless, low‑latency features, compares MySQL disaster‑recovery models, and details the multi‑dimensional disaster‑recovery system and its cross‑AZ/Region challenges and solutions.

TDSQL-C · cloud-native database · disaster recovery
0 likes · 9 min read
Alibaba Cloud Native
Dec 23, 2021 · Cloud Native

Designing High‑Availability for Microservices: Service Discovery & Config Management Best Practices

This article walks through a real‑world microservice outage, analyzes the risk chain, presents four high‑availability strategies, details service‑discovery and configuration‑management HA designs, and provides a step‑by‑step Kubernetes demo with code, monitoring, fault injection and results.

Configuration Management · Microservices · high availability
0 likes · 20 min read
High Availability Architecture
Dec 23, 2021 · Fundamentals

Master Data Management Architecture and Practices for Baidu Smart Mini Programs

This article presents a comprehensive overview of master data management concepts, maturity levels, and the challenges faced by Baidu smart mini‑programs, followed by a detailed practical architecture design—including domain modeling, high‑availability microservice implementation, performance optimization, and data synchronization—while also discussing future extensions and team capability building.

Baidu Mini Programs · Data Architecture · Master Data Management
0 likes · 14 min read
Top Architect
Dec 22, 2021 · Operations

Load Balancing: Principles, Types, and Algorithms

This article explains the fundamentals of load balancing, covering its purpose, vertical and horizontal scaling, various classifications such as DNS, IP, link‑layer and hybrid methods, common algorithms like round‑robin and weighted, as well as hardware solutions, providing a comprehensive guide for building scalable, high‑availability systems.

Algorithms · Distributed Systems · high availability
0 likes · 13 min read
IT Architects Alliance
Dec 22, 2021 · Industry Insights

Mastering Technical Architecture: Strategic & Tactical Design Principles for Scalable Systems

This article explains how to transform product requirements into robust technical architectures by addressing uncertainty through strategic principles—suitability, simplicity, evolution—and tactical guidelines covering high concurrency, high availability, and business design, illustrated with logical and physical diagrams.

Scalability · Software Architecture · design principles
0 likes · 14 min read
21CTO
Dec 20, 2021 · Fundamentals

Mastering Software Architecture: Strategic & Tactical Design Principles

This article explores how to transform product requirements into robust technical architectures by addressing uncertainty, outlining strategic principles—appropriateness, simplicity, evolution—and tactical guidelines for high concurrency, high availability, and business design, while illustrating logical and physical architecture diagrams.

Software Architecture · System Design · design principles
0 likes · 14 min read
HomeTech
Dec 14, 2021 · Databases

TiDB Cross-Data-Center High Availability Using Binlog Bidirectional Replication

This article summarizes the design, working principle, deployment steps, testing results, and future outlook of a TiDB cross-data-center high‑availability solution based on Binlog bidirectional replication, aiming to ensure rapid failover and continuous service between two data‑center clusters.

Bidirectional Replication · Binlog · Cross‑Data‑Center
0 likes · 5 min read
NetEase Smart Enterprise Tech+
Dec 14, 2021 · Backend Development

How NetEase Cloud’s Distributed Recording Cluster Ensures High‑Availability and Scalability

This article explains the architecture and key features of NetEase Cloud's local server‑side recording cluster, detailing how dynamic scaling, multi‑backup high availability, load‑balancing strategies, monitoring, and an embedded registration center enable secure, reliable, and scalable recording for data‑sensitive applications.

Distributed Systems · Java SDK · REST API
0 likes · 11 min read
IT Architects Alliance
Dec 11, 2021 · Databases

Mastering Redis Replication and Sentinel: Solving Failover Challenges

This article examines the limitations of Redis master‑slave replication, explains how Redis Sentinel addresses those issues with monitoring, notification, and automatic failover, and provides detailed configuration commands, discovery mechanisms, and step‑by‑step failover procedures for building a highly available Redis deployment.

Configuration · Replication · database
0 likes · 12 min read
Top Architect
Dec 10, 2021 · Operations

Comprehensive Guide to Load Balancing: Principles, Types, Algorithms, and Hardware

This article explains the fundamentals of load balancing, covering why it is needed for high‑traffic services, the difference between vertical and horizontal scaling, various load‑balancing techniques (DNS, HTTP, IP, link‑layer, hybrid), common algorithms, and the trade‑offs of software versus hardware solutions.

Distributed Systems · Networking · Operations
0 likes · 13 min read
Qingyun Technology Community
Dec 7, 2021 · Databases

Master PostgreSQL Replication with repmgr: A Complete Guide

This article introduces repmgr, an open‑source PostgreSQL replication manager, covering its architecture, election mechanism, core tools, metadata tables, installation steps, command syntax, configuration options, and common operations for building high‑availability database clusters.

Replication · database clustering · high availability
0 likes · 8 min read
NiuNiu MaTe
Dec 7, 2021 · Databases

Master‑Slave Replication in Redis: How It Works and How to Prevent Data Loss

This article explains why a single‑instance Redis can cause outages, introduces the master‑slave architecture, details the full and incremental synchronization processes, shows how to configure replication, addresses multi‑slave scaling, network interruptions, and automatic failover with Sentinel.

Master‑Slave · Replication · database
0 likes · 11 min read
Practical DevOps Architecture
Dec 5, 2021 · Databases

Deploying MHA for MySQL High Availability – Part 1

This guide walks through the step‑by‑step deployment of MHA on a MySQL cluster, covering package installation on all nodes, copying and installing the MHA RPMs, creating the required MySQL user, configuring MHA, testing SSH connectivity, and reviewing the failover script.

Linux · MHA · database
0 likes · 6 min read
MaGe Linux Operations
Dec 1, 2021 · Operations

Scalable High‑Availability Prometheus: Small‑Scale to Massive Deployments

This article explains how Prometheus’s local storage limits scalability and how Remote Storage, federation, and high‑availability setups—using dual instances, keepalived, and adapters with PostgreSQL + TimescaleDB—can overcome data persistence and performance challenges for both small‑scale and large‑scale monitoring environments.

Federation · Prometheus · Remote Storage
0 likes · 5 min read
Software Development Quality
Nov 29, 2021 · Backend Development

Designing Scalable, High‑Performance Architecture for Large‑Scale Websites

Large‑scale website architecture must balance massive user traffic, data volume, security threats, and rapid feature changes by adopting layered, distributed designs that emphasize high performance, high availability, scalability, extensibility, and agility, employing techniques such as caching, load balancing, clustering, sharding, and service‑oriented components.

Microservices · Scalability · caching
0 likes · 22 min read
IT Architects Alliance
Nov 26, 2021 · Operations

Large-Scale Distributed Website Architecture: Principles, Patterns, and Practices

This article provides a comprehensive technical summary of large‑scale distributed website architecture, covering characteristics, goals, architectural patterns, performance, high‑availability, scalability, extensibility, security, agility, and a detailed evolution roadmap with practical examples and recommendations.

Distributed Systems · Scalability · architecture
0 likes · 22 min read
IT Architects Alliance
Nov 24, 2021 · Operations

Designing High‑Availability, High‑Performance, Scalable and Secure Architecture for Large Web Applications

This article explains how to evolve a large‑scale website architecture through stages such as initial single‑server setups, application‑data separation, caching, server clustering, read‑write separation, CDN/reverse proxy, distributed storage, micro‑services, and automation to achieve high availability, scalability, performance and security.

Distributed Systems · Scalability · architecture
0 likes · 21 min read
Java Architect Essentials
Nov 22, 2021 · Databases

10 Best Practices for Using Redis Effectively

This article outlines ten essential Redis best‑practice tips, covering why to avoid the KEYS * command, using SCAN, interpreting INFO stats, leveraging hashes, setting key expirations, choosing eviction policies, handling errors, scaling with clusters, CPU considerations, and ensuring high availability with Sentinel.

best practices · databases · high availability
0 likes · 8 min read
Architects' Tech Alliance
Nov 22, 2021 · Operations

How to Build a High‑Availability, High‑Performance, Scalable Web Architecture

This article analyzes the evolution of large‑scale website architecture, covering stages from single‑server setups to layered, distributed, and clustered designs, and explains how caching, read‑write separation, CDN, asynchronous messaging, redundancy, automation, and security collectively achieve high performance, availability, scalability, and extensibility.

Distributed Systems · Scalability · architecture
0 likes · 21 min read
IT Architects Alliance
Nov 19, 2021 · Backend Development

Technical Summary of Large‑Scale Distributed Website Architecture

This article provides a comprehensive overview of the design principles, architectural patterns, performance, availability, scalability, security, and operational considerations for building large distributed web sites, illustrated with a step‑by‑step evolution from a single‑server setup to a multi‑layer, cloud‑native architecture.

Distributed Systems · Microservices · Scalability
0 likes · 22 min read
NiuNiu MaTe
Nov 17, 2021 · Databases

Mastering MySQL Disaster Recovery: Replication Modes and Strategies

This article explains MySQL disaster‑recovery techniques, covering cold and hot backups, same‑city versus remote setups, master‑slave topologies, async, semi‑sync and full‑sync replication, the MAR strong‑sync approach, and practical recommendations for building resilient two‑city three‑center architectures.

Replication · database · disaster recovery
0 likes · 10 min read
Open Source Linux
Nov 13, 2021 · Operations

How to Build High‑Availability Load Balancing with Keepalived and HAProxy

This guide explains how to configure Keepalived and HAProxy on Linux to achieve software load balancing and high availability, covering installation, core features, VRRP-based failover, health checks, session persistence, SSL offloading, and traffic routing with practical configuration examples.

HAProxy · Linux · high availability
0 likes · 25 min read
Full-Stack Internet Architecture
Nov 12, 2021 · Databases

Implementing High‑Availability PostgreSQL with Keepalived: Architecture, Setup, and Failover Procedures

This article explains how to use Keepalived together with PostgreSQL to build a two‑node high‑availability cluster, covering Keepalived's VRRP mechanism, host planning, installation steps, asynchronous master‑slave replication configuration, monitoring scripts, and detailed failover drills.

Database Replication · VRRP · failover
0 likes · 20 min read
Baidu Geek Talk
Nov 10, 2021 · Operations

How etcd Powers Scalable Service Governance: Raft, BoltDB, and Real‑World Practices

This article explores service governance fundamentals, examines why etcd’s Raft‑based consensus and BoltDB storage make it ideal for large‑scale systems, compares it with ZooKeeper and Consul, and shares Baidu’s practical architecture, performance tricks, and operational metrics for high‑availability, high‑performance service management.

BoltDB · Distributed Systems · Raft consensus
0 likes · 23 min read
Java Architect Essentials
Nov 8, 2021 · Operations

Scaling RabbitMQ to Million‑Message Throughput: Architecture, Sharding, Federation, and High Availability

This article explains how to horizontally scale RabbitMQ clusters to achieve million‑message per second throughput, covering Google’s large‑scale test setup, sharding and consistent‑hash plugins, federation, high‑availability mirroring, reliability mechanisms, and practical deployment tips for production environments.

RabbitMQ · high availability · sharding
0 likes · 23 min read
Architect
Nov 8, 2021 · Operations

Designing High Availability for Canal Using Zookeeper: Distributed Locks and Watch Mechanism

This article explains how to achieve high availability for Canal by designing a Zookeeper‑based distributed lock and watch mechanism, covering primary‑backup role election, failure detection, thundering‑herd mitigation, fair locking, node types, watcher events, and practical Zookeeper applications such as service registration and configuration management.

Canal · ZooKeeper · distributed-lock
0 likes · 13 min read
Alibaba Terminal Technology
Nov 5, 2021 · Mobile Development

How Alipay’s Mobile Client Uses Fuzz Testing to Prevent Crashes

This article describes Alipay’s client‑side high‑availability strategy that combines offline risk mining, function‑interface “minesweeping”, RPC/config/jsapi checks, and automated fuzz testing on Android and iOS to detect and eliminate crash‑inducing bugs before release.

automation · client stability · function interface
0 likes · 7 min read