
How to Build High‑Availability and Scalable Architecture

This article explains practical techniques for designing high‑availability and scalable systems—covering entry, business, cache, and database layers—so developers can handle rapid user growth without service interruptions or degraded performance.


Mobile Internet, cloud computing, and big data have made it possible to turn ideas into products quickly. When a product captures user demand accurately, its user base can explode without years of careful cultivation, but such rapid growth brings technical challenges such as single-machine failures and performance degradation.

This article shows how to build high availability (HA) and scalability into early architecture design at modest cost.

How to Achieve High Availability

Entry Layer

The entry layer (e.g., Nginx, Apache) is the service gateway. Using a single IP creates a single point of failure; keepalived can provide HA by assigning a virtual “heartbeat” IP that moves between two machines when one fails, and DNS points to this virtual IP.

This approach may cause a 1‑2 second interruption and requires two machines even though only one is active, which can be wasteful for long‑connection services; clients may need to reconnect.
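As an illustration, a minimal keepalived VRRP configuration for the active machine might look like the sketch below. The interface name, router ID, password, and virtual IP are placeholders, not values from this article:

```
# /etc/keepalived/keepalived.conf on the MASTER machine.
# The BACKUP machine uses "state BACKUP" and a lower priority.
vrrp_instance VI_1 {
    state MASTER          # standby machine: BACKUP
    interface eth0        # NIC carrying the heartbeat traffic
    virtual_router_id 51  # must match on both machines
    priority 100          # standby uses a lower value, e.g. 90
    advert_int 1          # heartbeat interval in seconds
    authentication {
        auth_type PASS
        auth_pass secret
    }
    virtual_ipaddress {
        192.0.2.10        # the "heartbeat" IP that DNS points to
    }
}
```

When the master stops sending heartbeats, the backup claims the virtual IP, which is what produces the brief interruption described above.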

Keepalived has several limitations:

Both machines must be in the same network segment.

Internal services must listen on all IPs (or use iptables to block external access); otherwise they won’t start when the virtual IP moves.

Server utilization declines because the standby machine sits idle; mixed deployment can mitigate this.

A common mistake is binding two public IPs to the same domain name; if one machine fails, half of the users lose access, which is not true HA.

Besides keepalived, LVS can also provide entry‑layer HA, though it is more complex.

Business Layer

The business layer (PHP, Java, Python, Go, etc.) should be stateless, moving session and cache data to external stores. Sessions belong in a database or stable cache; caching results improves performance but must be shared across servers to avoid inconsistency.

When the business layer is stateless, a failed server is invisible to users because the load balancer routes traffic to remaining servers.

Cookie‑based sessions are possible but introduce security risks (key leakage, replay attacks) and are generally discouraged in favor of database or cache storage.
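The stateless principle can be sketched in a few lines of Python. This is illustrative only: the shared dict stands in for an external store such as Redis or a database, so the snippet is self-contained; the `SessionStore` class and its methods are hypothetical, not from the article.

```python
import secrets

# Sketch: sessions live in a shared external store, never in a web
# server's process memory, so any server can handle any request.
# The dict passed in stands in for a shared Redis or database client.
class SessionStore:
    def __init__(self, backend=None):
        self.backend = backend if backend is not None else {}

    def create(self, user_id):
        token = secrets.token_hex(16)           # opaque token for the cookie
        self.backend[token] = {"user_id": user_id}
        return token

    def get(self, token):
        return self.backend.get(token)          # any server sees the same data

# Two "servers" sharing one store: a session created on one is visible
# on the other, so a failed server is invisible to logged-in users.
shared = {}
server_a = SessionStore(shared)
server_b = SessionStore(shared)

token = server_a.create(user_id=42)
assert server_b.get(token) == {"user_id": 42}
```

Because no server holds private session state, the load balancer is free to route any request anywhere, which is exactly what makes a server failure invisible.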

Cache Layer

Without a cache, high traffic quickly overwhelms databases like MySQL. Adding a cache layer offloads most requests, improving capacity.

Distribute the cache across multiple machines; if one cache node fails, only a fraction of traffic falls back to the database, preserving stability. In small deployments, cache and business layers can be mixed to save machines.
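Partitioning the cache can be as simple as hashing each key to one of N nodes. A minimal sketch, where the node names are hypothetical placeholders:

```python
import hashlib

CACHE_NODES = ["cache-0", "cache-1", "cache-2", "cache-3"]  # placeholder names

def node_for(key):
    # Use a stable hash (not Python's randomized hash()) so every
    # application server agrees on which node owns a key.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return CACHE_NODES[digest % len(CACHE_NODES)]

# If cache-2 dies, only the keys that map to it fall back to the
# database; roughly three quarters of traffic is unaffected.
keys = [f"user:{i}" for i in range(1000)]
hit_node_2 = sum(1 for k in keys if node_for(k) == "cache-2")
print(f"{hit_node_2 / len(keys):.0%} of keys fall back if cache-2 fails")
```

With four nodes, a single node failure exposes only about a quarter of the keys to the database, which is the stability property described above.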

Database Layer

Database HA is typically achieved with software solutions such as MySQL master‑slave or master‑master replication, or MongoDB replica sets.
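With master-slave replication, application code typically sends writes to the master and spreads reads over replicas. A hypothetical routing sketch; the connection objects are stand-ins for real DB-API connections (e.g. via PyMySQL):

```python
import random

# Sketch of read/write splitting over MySQL master-slave replication.
class ReplicatedDatabase:
    def __init__(self, master, replicas):
        self.master = master
        self.replicas = replicas

    def execute(self, sql):
        # Writes (and anything transactional) must go to the master.
        if sql.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
            return self.master.run(sql)
        # Reads can be spread over replicas; replication lag means a
        # read may briefly miss the latest write.
        return random.choice(self.replicas).run(sql)

class FakeConn:
    """Stand-in for a real database connection."""
    def __init__(self, name):
        self.name = name
    def run(self, sql):
        return self.name

db = ReplicatedDatabase(FakeConn("master"),
                        [FakeConn("replica-1"), FakeConn("replica-2")])
assert db.execute("INSERT INTO t VALUES (1)") == "master"
assert db.execute("SELECT * FROM t").startswith("replica")
```

If the master fails, a replica is promoted (manually or by tooling), so the layer stays available.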

In summary, HA can be realized by using heartbeat IPs at the entry layer, keeping the business layer stateless, partitioning the cache layer, and employing master‑slave replication for databases; these can be deployed on just two servers for early‑stage projects.

How to Achieve Scalability

Entry Layer

Scalability at the entry layer is achieved by horizontally adding machines and updating DNS with additional IPs, though some browsers only use the first few IPs returned.

A recommended pattern is a small set of Nginx front‑ends exposing a virtual IP, with backend servers hidden on the internal network; clients can also perform their own load‑balancing for non‑HTTP services.

Business Layer

Scalable business layers follow the same stateless principle; adding more machines horizontally expands capacity.

Cache Layer

Cache scalability is trickier. One simple method is to shut down the entire cache cluster during low‑traffic periods, bring up a new cluster, and let it warm up while the database handles the interim load.

Strong‑consistency cache: cannot tolerate stale data (e.g., account balances).

Weak‑consistency cache: tolerates temporary inaccuracies (e.g., social media counters).

Immutable cache: keys map to values that never change (e.g., hashed passwords).

Weak‑consistency and immutable caches scale well using consistent hashing; strong‑consistency caches require careful handling of node changes to avoid stale or dirty reads.

To mitigate strong‑consistency issues, either never remove nodes or ensure node‑change intervals exceed data TTL, and update client hash configurations in a staged manner.
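Consistent hashing keeps most keys on the same node when a node joins or leaves, which is why it suits weak-consistency and immutable caches. A minimal ring sketch; the virtual-node count and node names are illustrative:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node gets many virtual points on the ring so
        # keys spread evenly across nodes.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self.points = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first virtual point at or after the key.
        i = bisect.bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

# Removing one node remaps only the keys that lived on it (~1/N),
# unlike modulo hashing, which remaps almost everything.
keys = [f"user:{i}" for i in range(1000)]
before = ConsistentHashRing(["cache-0", "cache-1", "cache-2", "cache-3"])
after = ConsistentHashRing(["cache-0", "cache-1", "cache-3"])  # cache-2 removed
moved = sum(1 for k in keys if before.node_for(k) != after.node_for(k))
print(f"{moved / len(keys):.0%} of keys moved")
```

For a strong-consistency cache even this small fraction of remapped keys is dangerous, which is why node changes there must be coordinated with TTLs and staged client rollouts as described above.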

Redis (and projects like Codis) provide better scalability and HA features compared to older Memcached designs.

Database

Database scalability can be achieved through horizontal sharding, vertical sharding, and periodic rolling upgrades.
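Horizontal sharding routes each row to a shard by its key. A hypothetical user-ID sketch; the shard DSNs and function name are placeholders:

```python
# Sketch of horizontal sharding: rows are split across databases by
# user_id. The DSN strings are hypothetical placeholders.
SHARDS = [
    "mysql://db-shard-0/app",
    "mysql://db-shard-1/app",
    "mysql://db-shard-2/app",
    "mysql://db-shard-3/app",
]

def shard_for(user_id: int) -> str:
    # Modulo routing is simple, but doubling the shard count later
    # remaps most rows; range- or directory-based routing eases resharding.
    return SHARDS[user_id % len(SHARDS)]

assert shard_for(7) == "mysql://db-shard-3/app"
assert shard_for(8) == "mysql://db-shard-0/app"
```

Vertical sharding, by contrast, splits different tables or columns onto different databases, and rolling upgrades grow capacity by replacing machines one at a time.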

Overall, HA and scalability are addressed at four layers: entry (heartbeat for HA, parallel deployment for scaling), business (stateless services), cache (fine‑grained partitioning and consistent hashing), and database (replication for HA, sharding/rolling for scaling).

Tags: System Architecture, Scalability, High Availability, Load Balancing, Caching, Database Replication, Stateless Services
Written by Art of Distributed System Architecture Design

Introductions to large-scale distributed system architectures; insights and knowledge sharing on large-scale internet system architecture; front-end web architecture overviews; practical tips and experiences with PHP, JavaScript, Erlang, C/C++ and other languages in large-scale internet system development.
