Understanding Distributed Systems: Scaling, Partitioning, and CAP Theory
This article explains the fundamentals of distributed systems, covering why they arise, vertical and horizontal scaling, system splitting, sharding and partitioning strategies, load balancing, replication, the CAP theorem, and the BASE model, all illustrated with practical examples and code snippets.
Programming is an art, and the author uses a conversational style with a character called "65哥" to introduce the concept of distributed systems, explaining that a distributed system consists of hardware or software components spread across multiple networked computers that communicate solely via message passing.
Web Application Scaling
Initially, a single server could handle a web application, but as traffic grows, vertical scaling (upgrading CPU, memory, bandwidth) reaches limits due to hardware bottlenecks and rising marginal costs. When vertical scaling becomes unsustainable, horizontal scaling—adding more servers—is introduced.
Horizontal scaling increases concurrency by deploying multiple identical nodes that collectively handle requests.
System Splitting
Horizontal scaling alone is insufficient; the system must be split across nodes. Two types of splitting are discussed:
垂直拆分 (vertical splitting): dividing the system by modules or roles, each handling different responsibilities.
水平拆分 (horizontal splitting): deploying multiple identical instances of the whole system, each processing a complete request.
Horizontal splitting creates a 集群 (cluster), while vertical splitting results in a truly 分布式 (distributed) architecture.
Distributed Goals
The article lists the design goals of distributed systems, including transparency (access, location, concurrency, replication, fault, mobility, performance, scalability), openness, scalability, performance, and reliability.
Distributed Challenges
Key challenges stem from node failures and unreliable networks, requiring mechanisms for fault detection, task migration, and handling network partitions, latency, packet loss, and ordering issues.
Divide and Conquer
Distributed systems achieve higher performance by dividing work and data across many resources, following the "divide and conquer" principle. Examples include MapReduce's map phase, Elasticsearch sharding, and Kafka partitions.
Sharding
Sharding ( 数据分片 ) distributes data across multiple nodes, as seen in MongoDB and Elasticsearch, enabling horizontal scaling.
Partition
In Kafka, a topic consists of multiple partition s, each processed by a consumer group, providing parallelism.
Load Balancing
Load balancers such as Nginx, Dubbo, and Spring Cloud's Ribbon distribute requests across servers to improve performance and reliability.
Partition Strategies
Two main strategies are described:
可复刻 (deterministic): the same input always maps to the same node, useful for stateful data.
不可复刻 (non‑deterministic): random or hash‑based distribution, providing better load distribution but less predictability.
Dubbo Load Balancing
Random
RoundRobin
LeastActive
ConsistentHash
Kafka Partition Assignors
RangeAssignor – evenly distributes partitions based on consumer order.
RoundRobinAssignor – balances partitions across all consumers.
StickyAssignor – minimizes movement between rebalances while keeping partitions balanced.
Replication
Replication improves high availability and read/write throughput. Common patterns include Master‑Slave, Leader‑Follower, and Primary‑Replica. Examples are MySQL master‑slave setups and Elasticsearch's primary and replica shards.
CAP Theory
The CAP theorem states that a distributed system can satisfy at most two of the three properties: Consistency, Availability, and Partition tolerance. The article explains each property and shows the three possible trade‑off combinations (CA, CP, AP).
BASE Theory
BASE (Basically Available, Soft state, Eventually consistent) relaxes the strict guarantees of ACID to achieve high availability and scalability in large distributed systems. The article describes each component and contrasts it with strong consistency.
Overall, the article encourages readers to study these theories and apply them in real projects, promising future articles on distributed consistency.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.