Evolution of Project Architecture and Key Distributed System Concepts Explained
This article outlines the evolution of project architecture from single‑server MVC to RPC, SOA, and micro‑services, and provides clear explanations of core distributed‑system terms such as clusters, load balancing, caching, and flow control for developers.
In the era of high concurrency, massive data, NoSQL, and cloud computing, many have heard terms like cluster and load balancing, but not everyone has practical experience; this article briefly explains project architecture evolution and clarifies common high‑availability concepts.
1: Evolution of Project Architecture
ORM and MVC: Early architectures ran on a single server, which sufficed for small workloads. As business grew, the MVC pattern split the system into presentation, business, and data‑access layers, making development and maintenance easier.
RPC architecture: When a single server can no longer handle the load, services are decomposed and deployed on multiple machines that communicate via remote procedure calls.
Service Provider: runs on the server side, offering service interfaces and implementations.
Service Registry: runs on the server side, publishing local services as remote ones and managing them for consumers.
Service Consumer: runs on the client side, invoking remote services through proxy objects.
Common Java RPC frameworks include:
Dubbo
Spring Cloud
Thrift
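The three roles above can be sketched in-process. This is a minimal illustration, not the API of Dubbo or any real framework: the class and method names (`ServiceRegistry`, `OrderServiceProvider`, `get_order`) are hypothetical, and real frameworks add networking, serialization, and service discovery.

```python
class ServiceRegistry:
    """Service Registry: maps service names to provider implementations."""
    def __init__(self):
        self._services = {}

    def register(self, name, provider):
        self._services[name] = provider

    def lookup(self, name):
        return self._services[name]


class OrderServiceProvider:
    """Service Provider: offers the service interface and its implementation."""
    def get_order(self, order_id):
        return {"id": order_id, "status": "PAID"}


class ServiceProxy:
    """Service Consumer side: a proxy object that forwards calls to the service."""
    def __init__(self, registry, name):
        self._target = registry.lookup(name)

    def __getattr__(self, method):
        # A real framework would marshal the call over the network here.
        return getattr(self._target, method)


registry = ServiceRegistry()
registry.register("OrderService", OrderServiceProvider())

consumer = ServiceProxy(registry, "OrderService")
print(consumer.get_order(42))  # the consumer never touches the provider class directly
```

The key idea is that the consumer programs against a proxy, so moving the provider to another machine changes only the proxy's transport, not the calling code.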
SOA architecture: As the number of services grows, communication between them becomes complex. Service‑Oriented Architecture centralizes service management with a governance center, where services register and consumers discover them.
Micro‑services: A newer trend that splits business functionality into even finer‑grained, independently deployable services.
2: Terminology Explanation
1: Cluster
Cluster: A loosely coupled multiprocessor system composed of independent computers that communicate over a network via message passing, supporting distributed computing.
Typical characteristics of large‑scale clusters include:
High Availability (HA): When the primary node fails, a backup automatically takes over to ensure uninterrupted service.
High‑Performance Computing (HPC): Leveraging all nodes to perform parallel computation for tasks such as genome analysis or chemical simulations.
Load Balancing (LB): Distributing workload across nodes according to an algorithm to reduce pressure on any single server.
Common cluster types:
Load‑balance cluster: Analogous to four brothers sharing orders; the leader distributes new tasks based on each brother’s current load.
High‑availability cluster: Two brothers run a breakfast shop; one stands by to take over if the other cannot work (Active/Standby).
Active/Active (dual active) cluster: Both brothers serve simultaneously; if one fails, the other continues handling both roles.
High‑computing cluster: Ten brothers collaboratively build a complex piece of furniture to meet a tight deadline, representing parallel computation.
2: Load Balancing
HTTP redirect load balancing: The web server returns a 302 redirect with a new URL; the client follows the redirect, achieving load distribution but adding an extra request and potential SEO issues.
DNS load balancing: A domain name can resolve to multiple IP addresses; DNS distributes client requests among those IPs, often with geographic routing, though changes propagate slowly and control resides with the DNS provider.
Reverse‑proxy load balancing: A reverse‑proxy sits in front of web servers, caching resources and forwarding requests to backend servers based on load‑balancing algorithms, simplifying deployment but potentially becoming a performance bottleneck.
Load‑balancing strategies:
Round Robin
Weighted Round Robin
Least Connections
Fastest Response
Hash
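Three of the strategies above can be sketched in a few lines; the server names are hypothetical placeholders.

```python
import itertools
import hashlib

servers = ["web1", "web2", "web3"]

# Round Robin: cycle through servers in order.
rr = itertools.cycle(servers)
assert [next(rr) for _ in range(4)] == ["web1", "web2", "web3", "web1"]

# Weighted Round Robin: repeat each server according to its weight,
# so web1 receives 3 of every 4 requests.
weights = {"web1": 3, "web2": 1}
weighted = itertools.cycle([s for s, w in weights.items() for _ in range(w)])

# Hash: the same client IP always maps to the same server,
# which gives session stickiness.
def pick_by_hash(client_ip):
    h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

assert pick_by_hash("10.0.0.7") == pick_by_hash("10.0.0.7")
```

Least Connections and Fastest Response differ in that they use live feedback (current connection counts, measured latency) rather than a fixed rotation.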
3: Caching
Caching stores data close to the compute resource to accelerate access; it is a primary technique for improving software performance.
CDN caching: Content Delivery Networks place static resources at edge servers near end users, reducing latency for high‑traffic assets such as video or portal content.
Reverse‑proxy caching: Deployed at the front of a website, a reverse proxy caches static resources, allowing it to serve requests without involving the application servers.
Local caching: Application servers keep hot data in memory, enabling direct access without database queries.
Distributed caching: For massive data sets, a dedicated cache cluster stores frequently accessed data, and applications retrieve it over the network.
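Local caching can be sketched as an in-memory store with LRU eviction. This is an illustrative sketch: `load_from_db` is a hypothetical stand-in for the real database query, and production code would use an existing cache library.

```python
from collections import OrderedDict

class LocalCache:
    """In-memory cache that evicts the least recently used entry when full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, loader):
        if key in self._data:
            self._data.move_to_end(key)      # mark as recently used
            return self._data[key]
        value = loader(key)                  # cache miss: query the database
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict least recently used
        return value


db_hits = []
def load_from_db(key):
    db_hits.append(key)
    return f"row-{key}"

cache = LocalCache(capacity=2)
cache.get("a", load_from_db)   # miss -> database
cache.get("a", load_from_db)   # hit  -> served from memory
cache.get("b", load_from_db)   # miss
cache.get("c", load_from_db)   # miss; evicts "a"
print(db_hits)  # ['a', 'b', 'c']
```

A distributed cache follows the same get-or-load pattern, but the store lives in a separate cluster reached over the network instead of in the application's own memory.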
4: Flow Control (Traffic Control)
Traffic discard: Simple queues drop excess requests when resources are saturated; while straightforward, this can degrade user experience.
Message‑queue buffering: Distributed message queues absorb bursts and process requests asynchronously, relieving pressure on the main service.
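The discard strategy is often implemented with a token bucket: requests beyond the allowed rate are rejected instead of overwhelming the backend. A minimal sketch, with hypothetical rate and capacity values:

```python
import time

class TokenBucket:
    """Allows a burst of `capacity` requests, refilled at `rate` tokens/second."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                # excess request is discarded


bucket = TokenBucket(rate=5, capacity=2)
results = [bucket.allow() for _ in range(4)]
print(results)  # the initial burst of two passes; the rest are throttled
```

The message-queue approach instead accepts every request, enqueues it, and lets workers drain the queue at their own pace, trading latency for fewer rejections.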
---
Copyright notice: Content originates from the internet; all rights belong to the original author.
Top Architect