
Designing Scalable Systems: From a Single Server to Multi‑Node Architecture

This article explains how to design and evolve a system from a single‑server deployment to a highly scalable architecture using vertical and horizontal scaling, load balancing, database replication, sharding, caching, CDNs, and stateless design to support hundreds of millions of users.

Code Ape Tech Column

The author introduces a technical community and promises practical guidance for building systems that can support hundreds of millions of users, focusing on backend architecture and scalability.

1. Starting from Scratch

Initially, the application runs on a single server hosting both the web server (e.g., Apache or Tomcat) and a relational database (e.g., Oracle or MySQL). This simple setup has a major drawback: both components are single points of failure. If either the database or the web server fails, the whole system goes down, because there is no redundancy or fail‑over.

Using DNS to Resolve Hostnames

Clients first query a DNS service to obtain the IP address of the server. DNS is usually provided as a paid service by a hosting provider rather than being self‑hosted.

2. The Art of Scalability

As traffic, data volume, and user count grow, the system must be expanded either vertically or horizontally.

Vertical scaling (scale‑up) adds more CPU, memory, or storage to an existing machine, while horizontal scaling (scale‑out) adds more machines to a pool.

Vertical Scaling

Vertical scaling increases the resources of a single server: more RAM, more or faster CPUs, SSDs, more disks in a RAID array, or faster network interfaces. It is cost‑effective for small systems but has limits: a hard ceiling on hardware capacity, downtime required for upgrades, and the steep price of high‑end machines.

Vertical scaling also includes software optimizations like query tuning and code improvements. The opposite, vertical down‑scaling, removes resources.

When Multiple Servers Are Needed

When a single server can no longer handle the load, the application should be split into separate web and database servers, allowing independent tuning (e.g., more CPU for the web tier, more memory for the database tier) and independent scaling.

Horizontal Scaling

Horizontal scaling adds any number of additional servers or services to a resource pool. It requires architectural planning from the start because the system must distribute work across multiple nodes, often involving code changes to support parallelism and load distribution.

Horizontal down‑scaling (scale‑in) removes servers from the pool.

3. Load Balancers

A load balancer (hardware or software) distributes incoming traffic across multiple backend servers, improving response time and availability. Popular open‑source options include HAProxy and Nginx.

Load‑balancing algorithms include round‑robin, least connections, fastest response, weighted distribution, and IP hash. Load balancers can operate at Layer 4 (TCP) or Layer 7 (HTTP) of the OSI model.
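Three of the algorithms above can be sketched in a few lines. This is a minimal illustration, not how HAProxy or Nginx implement them; the class names, the `ip_hash` helper, and the backend names are hypothetical.

```python
import hashlib
import itertools


class RoundRobin:
    """Cycle through the backends in a fixed order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the backend with the fewest in-flight requests."""

    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1  # the request is now in flight
        return backend

    def release(self, backend):
        self.active[backend] -= 1  # call when the request finishes


def ip_hash(client_ip, backends):
    """Map a client IP to the same backend on every request."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


BACKENDS = ["web-1", "web-2", "web-3"]  # hypothetical server names
```

Round‑robin and least connections need no knowledge of the client, while IP hash keeps a given client on one backend as long as the backend list does not change.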

4. Scaling Relational Databases

Relational databases (e.g., Oracle, MySQL) face challenges when scaling. Techniques include replication (master‑slave, master‑master), sharding, denormalization, and SQL tuning.

Master‑Slave Replication

Writes go to the master; changes are propagated to one or more slaves. If the master fails, slaves can serve reads but cannot accept writes until a new master is promoted.
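The routing rule described above might look like the following sketch. The `ReplicatedDatabase` class and node names are hypothetical, and classifying statements by a leading SELECT is a simplification; real drivers and proxies use more careful rules.

```python
import itertools


class ReplicatedDatabase:
    """Route writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._reads = itertools.cycle(self.slaves)

    def node_for(self, sql):
        # Simplification: anything that is not a SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.slaves:
            return next(self._reads)
        if self.master is None:
            raise RuntimeError("no master available")
        return self.master

    def master_failed(self):
        """Mark the master as down; reads continue on the slaves."""
        self.master = None

    def promote(self, slave):
        """Promote one slave to master; the rest keep serving reads."""
        self.slaves.remove(slave)
        self.master = slave
        self._reads = itertools.cycle(self.slaves)
```

Between `master_failed()` and `promote()`, reads still succeed but every write raises, which mirrors the window in which no node can accept writes.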

Master‑Master Replication

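All nodes act as both master and slave: each accepts writes and replicates its changes to the others to keep data consistent. This provides higher availability, but write throughput does not grow with the number of nodes, because every node must eventually apply every write.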

Federation (Functional Partitioning)

Separate databases handle different functional domains (e.g., forum, users, products), reducing cross‑traffic and allowing parallel writes.
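One way to picture federation is a lookup from functional domain to database. The host names and the `<domain>_<name>` table naming convention below are assumptions made for this sketch.

```python
# Hypothetical mapping from functional domain to its own database host.
FEDERATION = {
    "forum": "forum-db.internal",
    "users": "users-db.internal",
    "products": "products-db.internal",
}


def database_for(table):
    """Return the database host for a table named '<domain>_<name>'."""
    domain = table.split("_", 1)[0]
    if domain not in FEDERATION:
        raise ValueError(f"no database registered for domain {domain!r}")
    return FEDERATION[domain]
```

Because each domain writes to its own host, a write to `forum_posts` and a write to `products_items` do not compete for the same server.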

Sharding

Data is split into multiple shards, each stored on a different server, improving manageability, performance, and load distribution.
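A common way to choose the shard is to hash a sharding key. The sketch below assumes the key is a user ID and that there are four shards; both are illustrative choices, and the shard names are hypothetical.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical names


def shard_for(user_id, shards=SHARDS):
    """Pick a shard from a stable hash of the sharding key."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]
```

The same key always lands on the same shard. Note that with plain modulo hashing, changing the number of shards remaps most keys, which is why resharding needs planning.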

Denormalization

Redundant data is stored in multiple tables to speed up reads at the cost of write complexity. Materialized views in PostgreSQL or Oracle can help keep redundant data consistent.
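The trade‑off can be seen in a small example using Python's built‑in sqlite3 module. The schema is hypothetical: `posts.author_name` duplicates `users.name` so that listing posts needs no join, and the price is that a rename must update both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- author_name is a redundant copy of users.name.
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES users(id),
        author_name TEXT NOT NULL,
        title TEXT NOT NULL
    );
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO posts VALUES (10, 1, 'alice', 'Hello');
""")


def rename_user(conn, user_id, new_name):
    """Update the name and every redundant copy in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE users SET name = ? WHERE id = ?", (new_name, user_id)
        )
        conn.execute(
            "UPDATE posts SET author_name = ? WHERE author_id = ?",
            (new_name, user_id),
        )


rename_user(conn, 1, "alice2")
# The read path needs no join to show the author's name.
rows = conn.execute("SELECT title, author_name FROM posts").fetchall()
```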

5. Choosing a Database

Two major families exist: SQL (relational) and NoSQL (non‑relational). SQL databases include MySQL, Oracle, PostgreSQL, etc. NoSQL databases are categorized as key‑value, document, column‑family, graph, and blob stores.

Key‑Value Stores

Data is stored as key‑value pairs. Examples: Redis, DynamoDB.

Document Stores

Data is stored as documents (e.g., JSON) within collections. Examples: MongoDB, CouchDB.

Wide‑Column Stores

Data is organized into column families rather than fixed tables. Examples: Cassandra, HBase.

Graph Databases

Data is modeled as nodes and edges, ideal for relationship‑heavy data. Examples: Neo4j, InfiniteGraph.

Blob Stores

Object storage accessed via APIs (e.g., Amazon S3, Azure Blob Storage).

6. Horizontally Scaling the Web Layer

Stateless architecture is recommended: move session state to a shared store (relational or NoSQL) so any web server can handle any request, allowing the load balancer to distribute traffic efficiently.
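The idea can be sketched as follows. `SessionStore` is an in‑memory stand‑in for the shared store, and `WebServer` is a hypothetical request handler; in a real deployment the store would be a separate database or key‑value service reachable by every web server.

```python
import copy
import secrets


class SessionStore:
    """Stand-in for a shared store such as a database or key-value service."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, state):
        self._data[session_id] = copy.deepcopy(state)

    def load(self, session_id):
        # Raises KeyError for an unknown session ID.
        return copy.deepcopy(self._data[session_id])


class WebServer:
    """Keeps no per-user state in memory; everything lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, username):
        session_id = secrets.token_hex(16)
        self.store.save(session_id, {"user": username, "cart": []})
        return session_id

    def add_to_cart(self, session_id, item):
        state = self.store.load(session_id)
        state["cart"].append(item)
        self.store.save(session_id, state)
        return state["cart"]
```

A session created on one server can be continued on another, so the load balancer is free to send each request to any server in the pool.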

7. Advanced Concepts

Caching

Caches (in‑memory or distributed) reduce database load and latency by storing frequently accessed data.
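A minimal sketch of one common pattern, cache‑aside with a time‑to‑live, is shown below. The article does not prescribe a specific strategy, so the `TTLCache` class, its `loader` callback, and the injectable `clock` are assumptions made for illustration.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Cache-aside read: return the cached value or call loader(key)."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is not touched
        value = loader(key)  # cache miss: fetch from the database
        self._entries[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example right after the database row changes."""
        self._entries.pop(key, None)
```

Here `loader` stands for whatever function queries the database. Only misses and expired entries reach it, which is where the reduction in database load comes from.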

Content Delivery Networks (CDN)

CDNs cache static assets (images, JS, CSS) at edge locations, delivering content from the nearest node to improve page load times.

Global Deployment

GeoDNS routes users to the nearest data center, enabling worldwide availability.
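The routing decision can be approximated by picking the data center with the smallest great‑circle distance to the client. This is only a sketch: the data center names and coordinates are hypothetical, and a real GeoDNS service derives the client's location from its IP address and may also weigh latency or health.

```python
import math

# Hypothetical data centers with approximate (latitude, longitude).
DATA_CENTERS = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-southeast": (1.35, 103.8),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_data_center(client_location, data_centers=DATA_CENTERS):
    """Return the name of the data center closest to the client."""
    return min(
        data_centers,
        key=lambda name: haversine_km(client_location, data_centers[name]),
    )
```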

Putting It All Together

By iteratively applying these techniques—stateless design, load balancing, caching, multi‑region deployment, CDN, and database sharding—a system can scale to over a hundred million users.

8. Topics for Future Discussion

Combining sharding and replication.

Long polling vs. WebSockets vs. Server‑Sent Events.

Indexing and proxying.

SQL performance tuning.

Elastic computing.

Written by

Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
