
Evolution and Key Concepts of Distributed Architecture for Large-Scale Systems

This article traces the historical development of distributed architecture from early computers to modern large‑scale web systems, explains core concepts such as clustering, replication, and middleware, and outlines eight evolutionary stages that address scalability, reliability, and performance challenges.

Architecture Digest

ENIAC, generally regarded as the first general‑purpose electronic computer, was completed in 1946 at the University of Pennsylvania. Despite its large size and slow speed, it marked the beginning of the computer era. Early computers followed the von Neumann model, which has five components (input device, output device, memory, arithmetic logic unit, and control unit) and moves three kinds of flow between them: data, instructions, and control signals.

After ENIAC, IBM dominated the mainframe era. In 1964 IBM introduced the System/360, which came to dominate the mainframe market of the 1960s. Computer architectures later diverged into inexpensive CISC‑based PCs and costly RISC‑based UNIX servers.

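Mainframes offered high performance, stability, and security, but they also had drawbacks: high cost, operational complexity, and exposure to single points of failure. As affordable and increasingly powerful PCs appeared, building systems out of many cheaper machines became a practical alternative.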

In 2009 Alibaba launched its "remove IOE" initiative, which aimed to replace IBM servers, Oracle databases, and EMC storage (collectively abbreviated IOE) with more scalable, cost‑effective alternatives as the company’s traffic grew.

Before looking at how architectures evolve, a few key distributed‑system concepts are worth defining:

Cluster : multiple identical servers handling the same workload, analogous to hiring several chefs to serve more customers.

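Distributed : splitting a workload into different responsibilities handled by different servers, analogous to dividing a kitchen into chefs and prep cooks so that each can specialize and work more efficiently. Each part of a distributed system may itself be a cluster, but the two concepts are not the same.

Node : a unit that independently carries out its part of the work according to a distributed protocol; in practice, usually a process running on an operating system.

Replica mechanism : providing redundancy for data or services. Data replicas are copies of the same data stored on multiple nodes; service replicas are multiple nodes offering the same service, typically in a master‑slave relationship so that another node can take over when one fails.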

Middleware : software positioned between the OS and applications to simplify communication, I/O, and other services for developers.

The architecture of a mature large‑scale website evolves gradually as user volume and business functions increase. The typical evolution includes eight stages:

Stage 1 – Single‑application architecture : both application and database run on one server.

Stage 2 – Separation of application and database servers : as traffic grows, the two are deployed on separate machines to improve performance and fault tolerance.

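Stage 3 – Application server cluster : when a single application server becomes a bottleneck, a cluster of application servers shares the requests. This raises two new issues: how to route each request to a server, and how to manage user sessions when consecutive requests may land on different servers.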
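The two issues raised in Stage 3 can be illustrated with a minimal sketch. The class and server names below are hypothetical and not taken from any particular load balancer: round-robin rotation spreads requests evenly, while hashing the session ID is one simple way to keep a session on the same server (sharing session state in an external store is another common option).

```python
import itertools
import zlib


class RoundRobinBalancer:
    """Hypothetical request router for a cluster of application servers."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._cycle = itertools.cycle(self._servers)

    def next_server(self):
        # Each call returns the next server in the rotation.
        return next(self._cycle)

    def sticky_server(self, session_id):
        # Hash the session ID so the same session always maps to the same server.
        index = zlib.crc32(session_id.encode("utf-8")) % len(self._servers)
        return self._servers[index]


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.next_server() for _ in range(4)]
print(picks)  # ['app-1', 'app-2', 'app-3', 'app-1']
```

Sticky routing of this kind breaks down when the server list changes, which is one reason many deployments move session data into a shared store instead.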

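Stage 4 – Database read/write separation : read queries are directed to replica (slave) databases while writes go to the master. This requires data synchronization between master and replicas (e.g., MySQL master‑slave replication) and a way to route each statement to the right database, which middleware such as MyCat can provide.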
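The routing half of read/write separation can be sketched in a few lines. This is a simplified illustration rather than how MyCat works internally: the `ReadWriteRouter` class and the database names are assumptions, and a real router would also have to handle transactions, statements such as `SELECT ... FOR UPDATE`, and replication lag.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: reads go to replicas, everything else to the master."""

    def __init__(self, master, replicas):
        self._master = master
        self._replicas = itertools.cycle(list(replicas))

    def route(self, sql):
        # Treat statements that start with SELECT as reads (a simplification).
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        if is_read:
            # Rotate across replicas to spread the read load.
            return next(self._replicas)
        return self._master


router = ReadWriteRouter("master-db", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM users"))            # replica-1
print(router.route("UPDATE users SET name = 'a'"))    # master-db
print(router.route("select id FROM orders"))          # replica-2
```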

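Stage 5 – Introducing a search engine : fuzzy queries are slow on a relational database even with read/write separation, so a search engine is added to serve them. This improves query performance but brings new challenges, such as building and maintaining the index.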

Stage 6 – Caching layer : hot data is cached using Redis or Memcached; in some cases NoSQL databases like MongoDB replace relational stores.
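One common way to use such a caching layer is the cache-aside pattern, sketched below. The article does not prescribe a specific pattern, so treat this as an assumption; plain dictionaries stand in for both the cache (Redis or Memcached) and the database so that the example runs without any external service, and the `CacheAsideStore` name is hypothetical.

```python
class CacheAsideStore:
    """Hypothetical cache-aside wrapper around a slower backing store."""

    def __init__(self, database):
        self._database = database  # dict standing in for the relational store
        self._cache = {}           # dict standing in for Redis or Memcached
        self.db_reads = 0          # counts how often the database is hit

    def get(self, key):
        # Serve hot data from the cache when possible.
        if key in self._cache:
            return self._cache[key]
        # Cache miss: read from the database and populate the cache.
        self.db_reads += 1
        value = self._database.get(key)
        if value is not None:
            self._cache[key] = value
        return value

    def put(self, key, value):
        # Write to the database, then invalidate the stale cache entry.
        self._database[key] = value
        self._cache.pop(key, None)


store = CacheAsideStore({"item:1": "keyboard"})
store.get("item:1")    # miss, reads the database
store.get("item:1")    # hit, served from the cache
print(store.db_reads)  # 1
```

Invalidating on write rather than updating the cache keeps the sketch simple; the next read repopulates the entry from the database.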

Stage 7 – Database sharding : vertical sharding splits different business data into separate databases, while horizontal sharding splits a single table across multiple databases to overcome size limits.
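The difference between the two kinds of sharding can be shown with a small routing sketch. All database names, the shard count, and the choice of user ID modulo the shard count as the sharding rule are assumptions made for illustration; real systems often use other keys or hashing schemes.

```python
# Vertical sharding: each business domain gets its own database (hypothetical names).
VERTICAL_MAP = {
    "user": "user_db",
    "product": "product_db",
    "order": "order_db",
}

# Horizontal sharding: the order table is split across this many databases (assumed).
ORDER_SHARDS = 4


def database_for(domain):
    # Pick the database that owns a given business domain.
    return VERTICAL_MAP[domain]


def order_shard_for(user_id):
    # Rows for the same user always land in the same shard.
    return f"order_db_{user_id % ORDER_SHARDS}"


print(database_for("product"))  # product_db
print(order_shard_for(10))      # order_db_2
print(order_shard_for(8))       # order_db_0
```

A modulo rule is easy to reason about, but changing the shard count later moves most rows, so the shard count is usually chosen with future growth in mind.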

Stage 8 – Application splitting : as business domains expand, the monolith is divided into subsystems (e.g., user, product, transaction). Shared code is abstracted into common modules to improve maintainability. The subsystems communicate through remote calls, using technologies such as web services, Hessian, HTTP, or RMI.

Source: http://www.cnblogs.com/zj-blog/p/9082291.html . The content is reproduced with attribution; copyright belongs to the original author.
