Master Distributed Systems: CAP, BASE, Locks, Transactions, Paxos & Raft
This comprehensive guide explores core distributed system concepts—including the CAP theorem and its trade‑offs, BASE consistency, various distributed lock strategies, multiple transaction patterns such as 2PC, 3PC, TCC and Seata, as well as consensus algorithms Paxos and Raft, while also covering idempotency and rate‑limiting techniques.
CAP Principle
The CAP theorem states that a distributed system can satisfy at most two of the three guarantees: Consistency (all nodes see the same data), Availability (every request receives a response), and Partition tolerance (the system continues to operate despite network partitions).
Consistency : strict data uniformity across replicas.
Availability : service always responds, though data may be stale.
Partition tolerance : system remains functional when network partitions occur.
Why CAP Cannot Be Fully Achieved
Network partitions are inevitable in distributed environments. To tolerate partitions, a system must sacrifice either consistency or availability. If a partition occurs, guaranteeing both strong consistency and immediate availability is impossible because nodes cannot simultaneously agree on the latest state while responding to every request.
CAP Models and Real‑World Applications
Depending on which two properties are chosen, different system designs emerge:
CA (Consistency + Availability, without Partition tolerance) : feasible only in environments without partitions; not realistic for true distributed systems.
CP (Consistency + Partition tolerance, without Availability) : used by traditional relational databases and distributed transaction systems where strong consistency is paramount.
AP (Availability + Partition tolerance, without Consistency) : adopted by many NoSQL stores, web caches, and DNS where high availability is critical.
Examples of CP systems include ZooKeeper (CP) and Eureka (AP). Nacos supports both CP and AP modes.
BASE Theory
BASE (Basically Available, Soft state, Eventual consistency) extends CAP by relaxing strict consistency. It accepts that data may be temporarily inconsistent (soft state) but will eventually converge to a consistent state.
Distributed Lock Implementations
Three common approaches:
MySQL lock : create a lock table with a unique constraint; insert a row to acquire, delete to release. Simple but incurs DB I/O overhead.
ZooKeeper lock : create sequential child nodes under a lock node; the smallest sequence number holds the lock. Clients watch preceding nodes to know when the lock is released.
Redis lock : use SETNX (or SET resourceName value EX 5 NX in Redis 2.8+) to set a key with an expiration, ensuring automatic release on failure. Redisson provides a high‑level API and supports the RedLock algorithm.
Distributed Transaction Patterns
Key techniques to achieve atomicity across multiple services:
Two‑Phase Commit (2PC) : coordinator asks participants to prepare, then commits or rolls back based on all votes. Guarantees strong consistency but suffers from single‑point failure and blocking.
Three‑Phase Commit (3PC) : adds a pre‑commit phase to avoid coordinator blocking, introducing timeout handling for better fault tolerance.
TCC (Try‑Confirm‑Cancel) : business‑level two‑phase commit where each operation defines explicit try, confirm, and cancel actions, reducing database‑level lock contention.
Local Message Table : writes a message record in the same transaction as the business update; a background job polls the table and publishes to MQ.
MQ Transaction : sends a “prepare” message, executes local transaction, then commits or rolls back the message based on transaction outcome (e.g., RocketMQ half‑message).
Maximum Effort Notification : used for low‑latency eventual consistency scenarios such as payment notifications, with retry limits and fallback queries.
Seata Overview
Seata provides a non‑intrusive global transaction framework built on an enhanced two‑phase commit. Core roles:
TC (Transaction Coordinator) : manages global transaction state.
TM (Transaction Manager) : starts, commits, or rolls back transactions.
RM (Resource Manager) : registers branch transactions and executes commit/rollback commands from the TC.
The workflow: TM requests a global transaction ID (XID) from TC, RM registers branch transactions under that XID, services execute their local logic, and finally TM tells TC to commit or rollback the whole transaction.
Consensus Algorithms
Paxos
Paxos achieves distributed consensus via two phases: Prepare and Accept. Roles include Proposer, Acceptor, and Learner. A proposer sends a numbered prepare request; acceptors promise not to accept lower numbers and reply with the highest accepted proposal. If a majority of acceptors respond, the proposer sends an accept request with a value. Once a majority accepts, the value is learned.
Raft
Raft simplifies consensus by electing a single leader. Roles are Leader, Follower, and Candidate. Leaders send periodic heartbeats; followers start an election if they miss heartbeats. A candidate increments its term, votes for itself, and requests votes. Once a majority is obtained, it becomes leader and handles log replication.
Idempotency in Distributed Systems
Idempotency ensures that repeated identical requests produce the same effect. Common techniques include:
Pre‑check with SELECT before INSERT using a unique request identifier.
Unique constraints with exception handling (e.g., DuplicateKeyException).
Pessimistic locking ( SELECT ... FOR UPDATE).
Optimistic locking using version or timestamp columns.
Dedicated deduplication tables for one‑time identifiers.
State machines that restrict illegal state transitions.
Distributed locks (e.g., Redis‑based Redisson).
Token‑based request validation.
Rate‑Limiting Algorithms
Common algorithms to protect services from overload:
Counter : simple fixed‑window counting; suffers from burst spikes.
Leaky Bucket : enforces a constant outflow rate, buffering excess requests.
Token Bucket : generates tokens at a steady rate; each request consumes a token. Implemented in Java via Guava’s RateLimiter.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
