Fundamentals 19 min read

Understanding Distributed Systems: Scaling, Partitioning, and CAP Theory

This article explains the fundamentals of distributed systems, covering why they arise, vertical and horizontal scaling, system splitting, sharding and partitioning strategies, load balancing, replication, the CAP theorem, and the BASE model, all illustrated with practical examples and code snippets.

Full-Stack Internet Architecture

Mar 18, 2021

Understanding Distributed Systems: Scaling, Partitioning, and CAP Theory

Programming is an art, and the author uses a conversational style with a character called "65哥" to introduce the concept of distributed systems, explaining that a distributed system consists of hardware or software components spread across multiple networked computers that communicate solely via message passing.

Web Application Scaling

Initially, a single server could handle a web application, but as traffic grows, vertical scaling (upgrading CPU, memory, bandwidth) reaches limits due to hardware bottlenecks and rising marginal costs. When vertical scaling becomes unsustainable, horizontal scaling—adding more servers—is introduced.

Horizontal scaling increases concurrency by deploying multiple identical nodes that collectively handle requests.

System Splitting

Horizontal scaling alone is insufficient; the system must be split across nodes. Two types of splitting are discussed: 垂直拆分 (vertical splitting): dividing the system by modules or roles, each handling different responsibilities. 水平拆分 (horizontal splitting): deploying multiple identical instances of the whole system, each processing a complete request.

Horizontal splitting creates a 集群 (cluster), while vertical splitting results in a truly 分布式 (distributed) architecture.

Distributed Goals

The article lists the design goals of distributed systems, including transparency (access, location, concurrency, replication, fault, mobility, performance, scalability), openness, scalability, performance, and reliability.

Distributed Challenges

Key challenges stem from node failures and unreliable networks, requiring mechanisms for fault detection, task migration, and handling network partitions, latency, packet loss, and ordering issues.

Divide and Conquer

Distributed systems achieve higher performance by dividing work and data across many resources, following the "divide and conquer" principle. Examples include MapReduce's map phase, Elasticsearch sharding, and Kafka partitions.

Sharding

Sharding ( 数据分片) distributes data across multiple nodes, as seen in MongoDB and Elasticsearch, enabling horizontal scaling.

Partition

In Kafka, a topic consists of multiple partition s, each processed by a consumer group, providing parallelism.

Load Balancing

Load balancers such as Nginx, Dubbo, and Spring Cloud's Ribbon distribute requests across servers to improve performance and reliability.

Partition Strategies

Two main strategies are described: 可复刻 (deterministic): the same input always maps to the same node, useful for stateful data. 不可复刻 (non‑deterministic): random or hash‑based distribution, providing better load distribution but less predictability.

Dubbo Load Balancing

Random

RoundRobin

LeastActive

ConsistentHash

Kafka Partition Assignors

RangeAssignor – evenly distributes partitions based on consumer order.

RoundRobinAssignor – balances partitions across all consumers.

StickyAssignor – minimizes movement between rebalances while keeping partitions balanced.

Replication

Replication improves high availability and read/write throughput. Common patterns include Master‑Slave, Leader‑Follower, and Primary‑Replica. Examples are MySQL master‑slave setups and Elasticsearch's primary and replica shards.

CAP Theory

The CAP theorem states that a distributed system can satisfy at most two of the three properties: Consistency, Availability, and Partition tolerance. The article explains each property and shows the three possible trade‑off combinations (CA, CP, AP).

BASE Theory

BASE (Basically Available, Soft state, Eventually consistent) relaxes the strict guarantees of ACID to achieve high availability and scalability in large distributed systems. The article describes each component and contrasts it with strong consistency.

Overall, the article encourages readers to study these theories and apply them in real projects, promising future articles on distributed consistency.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Scalability CAP theorem load balancing Partitioning

Written by

Full-Stack Internet Architecture

Introducing full-stack Internet architecture technologies centered on Java

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.