
Understanding High Availability: Compute and Storage Strategies Explained

This article defines high availability, explains why achieving four nines is a common goal, and categorizes HA into compute and storage solutions, detailing common architectures such as active‑passive, master‑slave, symmetric and asymmetric clusters, as well as various storage replication patterns.

IT Architects Alliance

High availability (HA) describes how reliably a system remains operational, usually expressed as the percentage of time it is available. True 100% uptime is impossible, so most services aim for "four nines" (99.99% availability), which allows roughly 53 minutes of downtime per year. The original article uses a global power‑outage analogy to illustrate the concept.
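The downtime that an availability target permits is simple arithmetic. The following is a minimal sketch; the function name `downtime_budget_minutes` and the choice of a 365-day year are assumptions made for illustration, not something the original article specifies.

```python
def downtime_budget_minutes(availability: float, period_hours: float = 365 * 24) -> float:
    """Return the downtime (in minutes) allowed by an availability target.

    availability is a fraction between 0 and 1, e.g. 0.9999 for "four nines".
    period_hours defaults to one 365-day year (an assumption for this sketch).
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_hours * 60


targets = [
    ("two nines", 0.99),
    ("three nines", 0.999),
    ("four nines", 0.9999),
    ("five nines", 0.99999),
]

for label, availability in targets:
    minutes = downtime_budget_minutes(availability)
    print(f"{label} ({availability:.3%}): {minutes:.1f} minutes per year")
```

Each additional nine cuts the yearly downtime budget by a factor of ten, which is why moving from three nines to four nines usually requires automated failover rather than manual recovery.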

Compute High Availability

Typical compute HA architectures are grouped into four patterns:

Active‑Passive (主备) – a primary node handles traffic while a standby node takes over on failure.

Master‑Slave (主从) – the master processes requests and replicates state to one or more slaves, which can serve read traffic or be promoted to master if it fails.

Symmetric Cluster (对称集群) – all nodes are peers, sharing load and state, allowing any node to replace another seamlessly.

Asymmetric Cluster (非对称集群) – a mix of specialized roles (e.g., front‑end load balancers and back‑end workers) that together provide redundancy.
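To make the active‑passive pattern above concrete, here is a minimal sketch of heartbeat-based failover. The class names `Node` and `ActivePassivePair`, the 3-second timeout, and the use of explicit timestamps instead of a real clock are all assumptions for illustration; production systems typically delegate this decision to a coordination service or cluster manager.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float = 0.0  # timestamp (seconds) of the most recent heartbeat

    def beat(self, now: float) -> None:
        """Record a heartbeat received at time `now`."""
        self.last_heartbeat = now


class ActivePassivePair:
    """Hypothetical failover controller for one primary and one standby node."""

    def __init__(self, primary: Node, standby: Node, timeout_s: float = 3.0) -> None:
        self.primary = primary
        self.standby = standby
        self.timeout_s = timeout_s

    def check(self, now: float) -> str:
        """Promote the standby if the primary looks dead; return the active node's name."""
        primary_stale = now - self.primary.last_heartbeat > self.timeout_s
        standby_fresh = now - self.standby.last_heartbeat <= self.timeout_s
        # Only fail over when the standby is itself healthy, to avoid flapping
        # between two nodes that have both stopped sending heartbeats.
        if primary_stale and standby_fresh:
            self.primary, self.standby = self.standby, self.primary
        return self.primary.name


node_a = Node("node-a")
node_b = Node("node-b")
pair = ActivePassivePair(node_a, node_b, timeout_s=3.0)

node_a.beat(1.0)
node_b.beat(1.0)
print(pair.check(now=2.0))  # both nodes healthy: node-a stays active

node_b.beat(5.0)            # node-a has stopped sending heartbeats
print(pair.check(now=5.5))  # node-a is stale, node-b is fresh: node-b takes over
```

A master‑slave setup would use the same promotion logic; the difference is that the slave also does useful work (such as serving reads) before it is promoted.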

Storage High Availability

Storage HA focuses on protecting data and includes several patterns:

Active‑Passive (主备) – a primary storage node writes data while a backup synchronizes and can take over.

Master‑Slave (主从) – as in the compute case, one node handles writes while the others replicate its data for read‑only access or failover.

Active‑Passive/Master‑Slave Switch (主备/主从切换) – the same two topologies with dynamic role switching: nodes exchange roles based on their health, so a standby or slave is promoted when the primary fails.

Master‑Master (主主) – two or more nodes accept writes simultaneously, requiring conflict resolution.

Clustered Storage – can be data‑centralized (all data stored in a single logical pool) or data‑dispersed (data spread across nodes, e.g., HDFS architecture).

Geographic Partitioning – data is partitioned by region (continent, country, city, or sites within the same city) to improve latency and resilience.
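Of the storage patterns above, master‑master is the one that needs explicit conflict resolution, because two nodes can accept different writes to the same key. The sketch below shows one simple strategy, last-write-wins (LWW). The article does not prescribe a strategy, so the choice of LWW, the names `Versioned` and `lww_merge`, and the assumption of reasonably synchronized clocks are all illustrative.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float  # write time; assumes node clocks are reasonably synchronized
    node_id: str      # deterministic tie-breaker when timestamps are equal


def lww_merge(a: Dict[str, Versioned], b: Dict[str, Versioned]) -> Dict[str, Versioned]:
    """Merge two replicas, keeping the most recent write for each key."""
    merged = dict(a)
    for key, candidate in b.items():
        current = merged.get(key)
        # Compare (timestamp, node_id) so both nodes pick the same winner.
        if current is None or (candidate.timestamp, candidate.node_id) > (
            current.timestamp,
            current.node_id,
        ):
            merged[key] = candidate
    return merged


# Two masters accepted conflicting writes for "user:1".
node_x = {
    "user:1": Versioned("alice@old.example", 10.0, "x"),
    "user:2": Versioned("bob@example.com", 12.0, "x"),
}
node_y = {
    "user:1": Versioned("alice@new.example", 15.0, "y"),
}

state = lww_merge(node_x, node_y)
print(state["user:1"].value)  # the later write, from node y, is kept
print(state["user:2"].value)  # the key written on only one node is preserved
```

Last-write-wins is easy to reason about and converges regardless of merge order, but it silently discards the losing write. Systems that cannot tolerate lost updates usually need version vectors or application-level merging instead.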

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
