Tagged articles

redundancy

43 articles

Mar 27, 2026 · Operations

How to Build a Rock‑Solid High‑Availability Architecture: Redundancy, Defense, and Smooth Deployments

This article breaks down high‑availability architecture into redundancy, defensive degradation, and release mechanisms, offering concrete techniques, real‑world failure case studies, and step‑by‑step configurations to ensure continuous service even under heavy load or component failures.

CI/CD · High Availability · Kubernetes

0 likes · 16 min read

Su San Talks Tech

Jul 7, 2025 · Operations

Mastering High Availability: Redundancy & Automatic Failover in Modern Internet Architecture

This article explains how to achieve high availability in internet systems by designing redundant components and automatic failover mechanisms across layers such as load balancers, reverse proxies, microservices, middleware, databases, and messaging, illustrating concepts with diagrams of architectures, clustering, leader election, and practical tools like keepalived, ZooKeeper, Redis Sentinel, and Kafka.

Microservices · Operations · failover

0 likes · 19 min read

IT Architects Alliance

Jan 6, 2025 · Operations

Ensuring High Reliability in Distributed Systems: Redundancy, Fault Detection, Replication, and Resilience Strategies

The article explores how distributed systems achieve high reliability through redundant design, precise fault detection and recovery, data replication and synchronization, coordinated fault tolerance and load balancing, distributed transaction handling, comprehensive monitoring, elastic scaling, security safeguards, and robust disaster‑recovery planning.

Reliability · fault tolerance · monitoring

0 likes · 18 min read

IT Architects Alliance

Dec 29, 2024 · Operations

Design Principles and Key Technologies for High‑Availability Systems

The article explains why 24/7 high‑availability systems are essential for modern enterprises and details core design principles, layered architecture, and critical technologies such as redundancy, load balancing, caching, elastic scaling, monitoring, and fault‑tolerance to ensure continuous, reliable service.

Cloud Computing · High Availability · System Design

0 likes · 23 min read

Cognitive Technology Team

Nov 15, 2024 · Operations

Building Redundancy in Applications to Avoid Single Points of Failure

The article explains how to design resilient applications by identifying critical paths, adding redundant components, using formulas for overall availability, and applying best‑practice recommendations such as multi‑zone/region deployment, load‑balanced VMs, database replication, and thorough testing of failover mechanisms.

High Availability · cloud architecture · load balancing

0 likes · 6 min read

Mike Chen's Internet Architecture

May 16, 2024 · Operations

High Availability Architecture: Eight Common Solutions for Large‑Scale Websites

This article explains the concept of high‑availability architecture and details eight practical solutions—including redundant servers, load balancers, data backup strategies, security measures, redundancy patterns, automated operations, and monitoring/alert systems—to help large‑scale sites achieve continuous, fault‑tolerant service.

load balancing · redundancy

0 likes · 7 min read

Top Architecture Tech Stack

Nov 26, 2023 · Operations

Understanding High Availability and High Performance: Complexity, Redundancy, and Decision Strategies

This article examines the inherent complexity of achieving high availability and high performance in distributed systems, explaining redundancy techniques, storage consistency challenges, various state‑decision models, and the trade‑offs involved in scaling single‑machine and cluster architectures.

High Availability · System Design · distributed systems

0 likes · 27 min read

Open Source Linux

Oct 25, 2023 · Operations

Scaling Servers for Millions: Load Balancing, Sharding, CDN Strategies

This guide explains how to design and expand server infrastructure to handle millions of concurrent users by using load balancers, database sharding, caching, CDNs, hardware selection criteria, and redundancy techniques, ensuring high availability and performance 24/7.

CDN · database sharding · load balancing

0 likes · 11 min read

Architects' Tech Alliance

Jul 5, 2023 · Fundamentals

Comprehensive Overview of RAID: Concepts, Implementations, and Practical Applications

This article provides a thorough introduction to RAID technology, explaining its definition, various software and hardware implementations, their advantages and disadvantages, and practical guidance for selecting the most suitable RAID solution for different storage scenarios.

Data Protection · RAID · Software RAID

0 likes · 11 min read

Open Source Linux

Apr 11, 2023 · Operations

Boost Network Reliability: Link Aggregation, Switch Stacking, and HSRP Explained

This article introduces the concept of link aggregation for combining multiple data channels into a higher‑bandwidth logical link, demonstrates configuration steps on Cisco‑style switches, explains switch stacking for increased port capacity and redundancy, and outlines HSRP hot‑standby routing to ensure continuous network availability.

EtherChannel · HSRP · link aggregation

0 likes · 7 min read

Laravel Tech Community

Oct 13, 2022 · Backend Development

Designing a Scalable Backend for a Nationwide ID Query Service

The article outlines a simple yet scalable backend architecture that can handle 20 million daily ID queries by partitioning a billion‑record dataset across multiple 16 GB virtual machines, using direct‑index lookups, modest bandwidth, and basic redundancy mechanisms to achieve ample performance headroom.

distributed systems · redundancy · scalability

0 likes · 6 min read

dbaplus Community

Oct 8, 2022 · Operations

Designing High‑Availability Internet Architecture: Redundancy and Automatic Failover

This article explains how to achieve high availability in internet systems by layering architecture, using redundancy and automatic failover across access, proxy, microservice, middleware, and storage components, and discusses practical techniques, common pitfalls, and operational safeguards for resilient services.

Automatic Failover · Microservices · Operations

0 likes · 19 min read

Programmer DD

Oct 8, 2022 · Fundamentals

Eight Timeless Computer Architecture Principles Every Designer Should Know

This article outlines eight enduring ideas—from designing for Moore's Law and using abstraction to speeding up common cases, leveraging parallelism, pipelining, prediction, memory hierarchy, and redundancy—that have shaped computer architecture over the past six decades.

Caching · Computer Architecture · Moore's Law

0 likes · 11 min read

Liangxu Linux

Sep 22, 2022 · Operations

How to Choose the Right Server: Key Specs and Bandwidth Calculations

This article explains how to select a server by reviewing popular brands and detailing essential parameters such as bandwidth capacity, CPU characteristics, chipset architecture, memory requirements, storage options, network interfaces, redundancy features, and scalability considerations, including a practical bandwidth‑to‑online‑users calculation.

Bandwidth Calculation · CPU specs · Network Card

0 likes · 10 min read

Open Source Linux

Aug 15, 2022 · Operations

How to Choose the Right Server: Key Brands and Essential Specs Explained

This guide explains what a server is, lists the most common server brands, and walks through the crucial hardware parameters—bandwidth, CPU, chipset, memory, storage, network cards, redundancy, hot‑swap and scalability—helping you make an informed server purchase decision.

Bandwidth Calculation · CPU specs · Server Hardware

0 likes · 9 min read

Architect's Alchemy Furnace

May 10, 2022 · Operations

How to Build Truly High‑Availability Systems: Redundancy, Failover, and Layered Architecture

High availability (HA) is essential for distributed systems, requiring redundancy and automatic failover across each architectural layer—from client to proxy, gateway, business logic, cache, and storage—to minimize downtime, achieve desired “nines” of uptime, and prevent cascading failures such as service snowballing.

distributed systems · failover · redundancy

0 likes · 14 min read

IT Services Circle

Feb 19, 2022 · Operations

High Availability Design in Internet Architecture: Redundancy and Automatic Failover

This article explains the principles of high availability in internet systems, covering redundancy, automatic failover, availability metrics, and detailed HA designs for each architectural layer such as load balancers, microservices, middleware, and databases.

distributed systems · load balancing · redundancy

0 likes · 20 min read

Architects' Tech Alliance

Dec 29, 2021 · Fundamentals

Core Switch vs. Regular Switch: Differences, Advantages, and Key Technologies

This article explains how core switches differ from ordinary switches in port count, network layer placement, performance features such as large cache, high capacity, virtualization, TRILL, FCoE, and how technologies like link aggregation, redundancy, stacking, and HSRP enhance data‑center reliability and scalability.

HSRP · core switch · link aggregation

0 likes · 10 min read

MaGe Linux Operations

Oct 8, 2021 · Operations

What Triggered Facebook’s 7‑Hour Global Outage? Inside the DNS and Backbone Failure

On October 4, 2021, a faulty maintenance command disabled Facebook’s DNS servers and severed its global backbone network, causing a seven‑hour outage that affected billions of users, highlighted architectural flaws, and underscored the need for redundant DNS and better operational safeguards.

DNS failure · Facebook outage · network backbone

0 likes · 8 min read

Alibaba Cloud Developer

Jul 2, 2021 · Fundamentals

Why Data Loss Happens: Hidden CPU Silent Errors and How to Prevent Them

This article explains the concepts of data loss and corruption, outlines common bit‑flip sources in disks, memory, network and CPUs, describes how silent CPU data errors are discovered and verified, and presents multi‑layer design strategies—including redundancy, checksums, logging and recovery—to ensure data is neither lost nor corrupted.

CPU SDE · Storage Reliability · data integrity

0 likes · 17 min read

Architects' Tech Alliance

Feb 8, 2021 · Fundamentals

Overview of UPS Technologies and Redundant Power System Architectures

This article explains the fundamentals of UPS systems, describing standby, line‑interactive, and double‑conversion topologies, their operational principles, economic mode based on ITIC curves, and various redundancy configurations such as parallel N+1, centralized and distributed bypass, and dual‑bus architectures for data‑center reliability.

Data Center · Electrical Engineering · UPS

0 likes · 10 min read

Huawei Cloud Developer Alliance

Jan 19, 2021 · Operations

How to Achieve High Availability: Metrics, Redundancy, and Circuit‑Breaker Strategies

This article explains system availability metrics, the inevitability of faults in distributed systems, and practical high‑availability designs such as redundancy, ZooKeeper and Eureka clustering, and circuit‑breaker patterns to keep services reliably operational.

High Availability · circuit breaker · distributed systems

0 likes · 9 min read

FunTester

Dec 12, 2020 · Operations

Why Redundancy Is the Key to Effective Disaster Recovery in IT Systems

The article explains that disaster recovery for information systems relies on redundancy across hardware, energy, and data, classifies natural, human, and technical disasters, defines critical metrics such as RTO and RPO, and outlines the technologies, architectures, and maturity levels needed to ensure business continuity.

Disaster Recovery · RPO · RTO

0 likes · 29 min read

IT Architects Alliance

Dec 2, 2020 · Operations

Understanding High Availability: Sources of Complexity and Decision Strategies

The article explains high availability as a source of system complexity, describing how redundancy, hardware and software failures, external disasters, and state‑decision mechanisms such as dictatorial, negotiated, and democratic approaches affect both compute and storage layers, and discusses trade‑offs like the CAP theorem.

CAP theorem · High Availability · System Design

0 likes · 12 min read

Ctrip Technology

Nov 26, 2020 · Backend Development

Improving Stability of Ctrip's Shark Multilingual Platform: Caching and File Redundancy Practices

This article details the evolution of Ctrip's Shark multilingual platform, describing how caching strategies and file redundancy were introduced to reduce database load, improve response times, and enhance overall system stability for a large-scale internationalized service.

Caching · Multilingual · backend

0 likes · 10 min read

Architects' Tech Alliance

Jun 6, 2020 · Fundamentals

Core Switch vs. Regular Switch: Key Differences, Advantages, and Deployment Practices

The article explains what distinguishes core switches from ordinary switches, outlines their architectural roles, port and performance differences, and describes advanced features such as large buffers, high capacity, virtualization, TRILL, FCoE, link aggregation, redundancy, stacking, and HSRP for reliable data‑center networking.

HSRP · core switch · link aggregation

0 likes · 12 min read

Efficient Ops

Mar 31, 2020 · Information Security

Can You Really Destroy Alipay’s Storage? Inside Financial Data Center Redundancy

This article explores the layered redundancy of financial data centers, explaining hot and cold backups, multi‑site architectures, power supply safeguards, fire‑suppression systems, and why simply attacking a single component is unlikely to cripple services like Alipay.

Backup Strategies · Data Center Security · financial systems

0 likes · 9 min read

Efficient Ops

Mar 18, 2020 · Fundamentals

Master RAID in One Minute: Quick Guide to Disk Array Types & Benefits

This concise guide explains RAID fundamentals, covering hardware vs. software implementations, various RAID levels, deployment methods, and the strengths and weaknesses of each configuration, all illustrated with clear diagrams for rapid comprehension.

RAID · Software RAID · disk array

0 likes · 5 min read

Programmer DD

Dec 8, 2019 · Operations

Can Your Money Survive a Bombed Alipay Server? Inside Data Center Redundancy

The article explores how Alipay’s financial data is protected through multi‑site data centers, hot and cold backups, and disaster‑recovery mechanisms, explaining why destroying a single server—or even multiple facilities—won’t instantly erase users’ funds, and outlining the lengths required to truly cripple the system.

Data Center · Disaster Recovery · backup

0 likes · 10 min read

IT Architects Alliance

Nov 20, 2019 · Information Security

How Vulnerable Is Alipay’s Data Center? A Deep Dive into Redundancy and Attack Vectors

The article examines Alipay’s data‑center architecture, redundancy schemes, backup strategies, power‑supply design, fire‑suppression systems and physical security measures, illustrating why destroying its storage is far more complex than simply “blowing up” a server.

Alipay · Data Center · Fire Suppression

0 likes · 9 min read

Architecture Digest

Nov 16, 2019 · Operations

What Happens If Alipay’s Data Centers Are Physically Destroyed? A Deep Dive into Redundancy and Disaster Recovery

The article examines how Alipay’s financial data would survive a physical destruction of its servers by explaining multi‑site data center architectures, hot and cold backups, power redundancy, fire‑suppression systems, and the role of partner banks in data recovery, highlighting the extensive resilience measures in modern financial infrastructures.

Alipay · Data Center · Disaster Recovery

0 likes · 8 min read

Python Programming Learning Circle

Nov 10, 2019 · Operations

What Happens If Alipay’s Servers Are Bombed? Inside Data Center Redundancy

The article explains how financial platforms like Alipay protect user funds through multi‑site data centers, hot and cold backups, power redundancy, fire‑suppression systems, and strict location standards, showing why destroying a single server would not erase all stored money.

Data Center · Disaster Recovery · Operations

0 likes · 9 min read

Java Backend Technology

Nov 10, 2019 · Information Security

What Happens If Alipay’s Servers Are Destroyed? Inside Data‑Center Resilience

The article explains how Alipay’s financial system uses multi‑site, multi‑center architectures, hot‑standby, active‑active, and cold‑backup strategies, along with stringent A‑class data‑center standards, to ensure that even catastrophic physical attacks cannot erase users' money.

Alipay · Data Center · Disaster Recovery

0 likes · 9 min read

ITFLY8 Architecture Home

Mar 31, 2019 · Operations

Beyond Redundancy: The Real Secrets to Building Truly High‑Availability Systems

This article explains that high‑availability systems involve more than just redundant hardware or software; they require careful design, data consistency strategies, realistic SLA calculations, and disciplined engineering practices to achieve the coveted “nines” of uptime.

High Availability · Reliability · SLA

0 likes · 13 min read

ITFLY8 Architecture Home

Jan 10, 2019 · Fundamentals

15 Universal Architecture Principles Every Software Engineer Should Follow

Discover 15 timeless architecture principles—from redundancy and rollback design to monitoring, horizontal scaling, and non‑intrusive components—that guide engineers in creating robust, scalable, and maintainable systems while balancing cost, risk, and future growth.

Software Architecture · design principles · redundancy

0 likes · 12 min read

Java Backend Technology

Dec 27, 2018 · Operations

How to Calculate System Availability and Reach More ‘9’s in Your SLA

This article explains how to model system availability using serial and parallel components, calculate component and overall reliability with MTBF/MTTR formulas, and apply practical steps to monitor, add redundancy, and achieve higher SLA "nines" for improved service reliability.

MTBF · MTTR · SLA

0 likes · 10 min read

21CTO

Sep 26, 2017 · Operations

Why You Should Never Trust Any Component in Your System—and How to Protect It

In programming and operations, every element—from services and dependencies to requests, machines, data centers, power, networks, and humans—can fail unexpectedly, so you must assume distrust and implement defensive measures such as monitoring, redundancy, rate limiting, fallback strategies, backups, and automated deployment.

Operations · Reliability · fault tolerance

0 likes · 9 min read

Architecture Digest

Jan 26, 2017 · Backend Development

Design and Operational Practices for Game Platform Backend Systems

The article outlines the architecture, distributed design, redundancy, monitoring, automation, and fault‑handling strategies employed in a game company's platform backend to ensure high availability and efficient daily operations.

Automation · backend · distributed systems

0 likes · 15 min read

Qunar Tech Salon

Jan 16, 2017 · Backend Development

Scalable Web Architecture and Distributed Systems

This article explains the key design principles, components, and techniques—such as availability, performance, reliability, scalability, cost, redundancy, partitioning, caching, proxies, indexing, load balancing, and queuing—required to build large‑scale, high‑performance, and fault‑tolerant web and distributed systems, illustrated with an image‑hosting example.

Caching · Scalable Architecture · Web Performance

0 likes · 37 min read

Architecture Digest

Nov 28, 2016 · Databases

Fundamentals of Data Storage: Engines, Models, Transactions, Distributed Design, and Redundancy

This article explains the importance of data storage, describes single‑node storage engines and data models, outlines transaction and concurrency control, and covers distributed storage principles, CAP and FLP theorems, 2PC and Paxos protocols, as well as redundancy, backup, and failover mechanisms.

2PC · CAP theorem · Databases

0 likes · 8 min read

Efficient Ops

Oct 8, 2016 · Operations

How to Boost Server Resource Utilization: Strategies, Trade‑offs, and Metrics

This article explains why servers often run far below their theoretical capacity, defines the concept of highest usable resource utilization, and offers practical and advanced techniques—such as multithreading, workload consolidation, resource layering, and overselling—to improve utilization while weighing performance, cost, and reliability impacts.

Operations · Performance Optimization · Resource Efficiency

0 likes · 9 min read

Architecture Digest

Aug 22, 2016 · Operations

Understanding High‑Availability Systems: Design Principles, Technical Solutions, and SLA Measurement

This article explains the comprehensive concept of high‑availability systems, covering redundancy, failover, consistency challenges, various technical solutions, SLA definitions, and the organizational and engineering practices required to achieve multiple “9s” of availability.

High Availability · Operations · SLA

0 likes · 14 min read

21CTO

Sep 12, 2015 · Fundamentals

Unlocking RAID: How Different Levels Balance Speed, Redundancy, and Cost

This article provides a comprehensive overview of RAID technology, explaining its purpose, the various standard levels from RAID 0 to RAID 6, hybrid configurations, non‑standard implementations like DRFS, and both software and firmware/driver based deployment methods.

Filesystem · Performance · RAID

0 likes · 13 min read
