Why Distributed Transactions Matter: From CAP to Saga and Beyond
This article explains why transactions are essential, traces their origin from early XA specifications, examines the CAP and BASE theories that expose challenges in distributed systems, and reviews practical solutions such as two‑phase commit, three‑phase commit, TCC, asynchronous messaging, Saga and Gossip protocols, highlighting their trade‑offs and when to apply each.
1. Why Transactions Are Needed
Transactions address the "vertical" consistency problem in distributed systems: they ensure that a series of operations behaves as a single atomic unit, much like a group tied to one rope, whose members either all move together or do not move at all.
2. Origin of Transactions
The concept originates from database systems. Distributed transaction processing was later standardized by the X/Open XA specification, which defines a set of function prototypes (xa_ and ax_) through which a transaction manager (TM) coordinates resource managers (RM). Two‑phase commit (2PC) is the classic protocol built on this coordination.
3. Transaction Issues in Distributed Systems
When systems are split into many micro‑services, transaction problems are amplified. Two foundational theories describe the trade‑offs:
CAP Theory
Proposed by Eric Brewer in 2000 and formally proved by Seth Gilbert and Nancy Lynch in 2002, the CAP theorem states that a distributed system cannot simultaneously guarantee Consistency, Availability, and Partition‑tolerance.
Consistency: linearizability (strong consistency), meaning every read observes the most recent completed write.
Availability: every request receives a response within a bounded time.
Partition‑tolerance: the system continues to operate despite arbitrary message loss between nodes.
An illustrative example compares the three guarantees to trying to achieve bug‑free code, rapid delivery, and team collaboration simultaneously—only two can be fully satisfied.
BASE Theory
Introduced by Dan Pritchett, BASE (Basically Available, Soft state, Eventually consistent) relaxes strong consistency to improve scalability. It does not replace CAP; rather, it helps decide which data needs strong consistency (core data) and which can tolerate temporary inconsistency (non‑core data).
Basically Available: the system remains functional even when some components fail.
Soft state: state may become temporarily inconsistent without affecting overall availability.
Eventually consistent: data will converge to a consistent state over time.
4. Distributed Transaction Solutions
Based on the two theories, several practical approaches exist:
Two‑Phase Commit (2PC)
A coordinator asks participants to prepare, then to commit. Participants lock resources during the prepare phase, which can cause blocking if the coordinator fails.
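The prepare and commit phases described above can be sketched as follows. This is a minimal in-memory illustration, not an XA implementation: the `Participant` interface, `InMemoryParticipant`, and `TwoPhaseCoordinator` are hypothetical names, and the sketch ignores timeouts, durable logging, and coordinator recovery, which are exactly the parts that make real 2PC hard.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant interface; real XA resource managers expose
// a richer API than this.
interface Participant {
    boolean prepare();   // phase 1: lock resources and vote yes/no
    void commit();       // phase 2: make the changes permanent
    void rollback();     // phase 2: undo the changes and release locks
}

// In-memory participant used only for illustration.
class InMemoryParticipant implements Participant {
    private final boolean voteYes;
    String state = "INIT";

    InMemoryParticipant(boolean voteYes) { this.voteYes = voteYes; }

    public boolean prepare() {
        state = voteYes ? "PREPARED" : "ABORTED";
        return voteYes;
    }
    public void commit()   { state = "COMMITTED"; }
    public void rollback() { state = "ABORTED"; }
}

class TwoPhaseCoordinator {
    // Returns true if the global transaction committed.
    static boolean run(List<? extends Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        for (Participant p : participants) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // A single "no" vote aborts everyone that already prepared.
                for (Participant q : prepared) q.rollback();
                return false;
            }
        }
        // All voted yes: phase 2 commits every participant.
        for (Participant p : participants) p.commit();
        return true;
    }
}
```

If the coordinator crashed between the two loops, every participant would stay in the PREPARED state holding its locks, which is the blocking problem noted above.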
Three‑Phase Commit (3PC)
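Introduces a "prepare‑to‑commit" (pre‑commit) step between voting and the final commit, together with participant timeouts, so that participants can make progress when the coordinator fails. A participant that times out before receiving pre‑commit aborts; one that times out after pre‑commit goes ahead and commits. This reduces blocking but does not eliminate every failure case: under a network partition, nodes can time out into different decisions, and the extra round trip adds latency.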
TCC (Try‑Confirm‑Cancel)
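Moves the two phases up to the business layer: Try reserves the resources in a local transaction, Confirm consumes the reservation, and Cancel releases it. Because each step is an ordinary local transaction, no database‑level locks are held across services, and state is logged so that unfinished transactions can be recovered. It was popularized in China by Alibaba and is similar in effect to a commit protocol implemented by the application itself.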
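A minimal sketch of the three TCC operations, assuming a hypothetical account service that reserves funds in a `frozen` field. `TccAccount` and `TccTransfer` are illustrative names only; a production implementation would also need idempotent Confirm and Cancel, persistent transaction logs, and handling of a Cancel that arrives before its Try.

```java
// Hypothetical account service illustrating Try, Confirm, and Cancel.
class TccAccount {
    long balance;   // funds available for new reservations
    long frozen;    // funds reserved by Try, awaiting Confirm or Cancel

    TccAccount(long balance) { this.balance = balance; }

    // Try: check business rules and reserve the resource.
    boolean tryDebit(long amount) {
        if (amount <= 0 || balance < amount) return false;
        balance -= amount;
        frozen += amount;
        return true;
    }

    // Confirm: consume the reservation made by Try.
    void confirmDebit(long amount) { frozen -= amount; }

    // Cancel: release the reservation and restore the balance.
    void cancelDebit(long amount) {
        frozen -= amount;
        balance += amount;
    }
}

class TccTransfer {
    // Debits both accounts, or neither, from the caller's point of view.
    static boolean debitBoth(TccAccount a, TccAccount b, long amount) {
        boolean triedA = a.tryDebit(amount);
        boolean triedB = triedA && b.tryDebit(amount);
        if (triedB) {
            a.confirmDebit(amount);
            b.confirmDebit(amount);
            return true;
        }
        if (triedA) a.cancelDebit(amount);  // only undo what was reserved
        return false;
    }
}
```

The reservation in `frozen` plays the role that a database lock plays in 2PC, but it is ordinary business data, so other requests against the same account are not blocked while the global outcome is pending.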
Asynchronous Messaging – Local Message Table
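Inspired by eBay, this pattern splits a distributed transaction into a series of local transactions. Each service writes its business data and an outgoing message to a message table in the same local transaction; a background task then delivers the recorded messages and retries until the downstream service has accepted them.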
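The pattern can be sketched with an in-memory stand-in for the database. All names here (`OutboxStore`, `Publisher`, `OutboxRelay`) are hypothetical, and the `synchronized` method only imitates the atomicity that a real local database transaction would provide.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for a database holding a business table and a
// message table.
class OutboxStore {
    final List<String> orders = new ArrayList<>();
    final List<String> pendingMessages = new ArrayList<>();

    // Imitates one local transaction: both rows are written together.
    synchronized void createOrder(String orderId) {
        orders.add(orderId);
        pendingMessages.add("ORDER_CREATED:" + orderId);
    }
}

// Stand-in for a message broker client.
interface Publisher {
    boolean publish(String message);  // false means "try again later"
}

class OutboxRelay {
    // Delivers pending messages in order. A message is removed only after
    // the broker accepted it, so a crash between publish and remove leads
    // to a redelivery rather than a lost message.
    static int drain(OutboxStore store, Publisher publisher) {
        int sent = 0;
        synchronized (store) {
            while (!store.pendingMessages.isEmpty()) {
                String next = store.pendingMessages.get(0);
                if (!publisher.publish(next)) break;  // keep for the next poll
                store.pendingMessages.remove(0);
                sent++;
            }
        }
        return sent;
    }
}
```

Because delivery is at-least-once, the consumer of these messages has to be idempotent.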
Asynchronous Messaging – Non‑Transactional MQ
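When the message broker does not support transactions, the application can send the message as the last step inside the local transaction, just before the commit, so that a failed send rolls back the local changes. Example pseudocode:
    try {
        beginTrans();
        modifyLocalData1();
        modifyLocalData2();
        // Send last: if delivery throws, nothing local has been committed yet.
        deliverMessageToMQ();
        // If commit fails after the send succeeded, the message is already out,
        // so consumers must be able to detect or tolerate it.
        commitTrans();
    } catch (Exception ex) {
        rollbackTrans();
    }
Asynchronous Messaging – Transactional MQ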
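RocketMQ is a widely used open‑source MQ with built‑in transactional messages. The producer first sends a half (prepared) message that consumers cannot see, runs its local transaction, and then tells the broker to commit or roll back the message; if the broker never learns the outcome, it checks back with the producer.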
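The half-message idea can be illustrated with a toy broker. This is not the RocketMQ client API: `HalfMessageBroker` and its methods are hypothetical names, and the check-back is modeled as a simple predicate that reports whether a given local transaction committed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy broker that keeps half messages hidden until their outcome is known.
class HalfMessageBroker {
    private final Map<String, String> half = new LinkedHashMap<>();
    final List<String> visible = new ArrayList<>();  // what consumers can read

    // Step 1: store the message, invisible to consumers.
    void sendHalf(String txId, String body) { half.put(txId, body); }

    // Step 2a: the local transaction committed, so publish the message.
    void commit(String txId) {
        String body = half.remove(txId);
        if (body != null) visible.add(body);
    }

    // Step 2b: the local transaction rolled back, so discard the message.
    void rollback(String txId) { half.remove(txId); }

    // Step 3: for half messages whose outcome never arrived, ask the
    // producer whether the local transaction committed.
    void checkBack(Predicate<String> localTxCommitted) {
        for (String txId : new ArrayList<>(half.keySet())) {
            if (localTxCommitted.test(txId)) commit(txId);
            else rollback(txId);
        }
    }
}
```

The check-back step is what lets the broker resolve messages left undecided when a producer crashes between its local commit and the second call.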
Saga
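A long‑running transaction broken into a series of local transactions, each paired with a compensating action that undoes it if a later step fails. It can be implemented as a chain in which each service triggers the next, or with a central coordinator that drives the steps, which avoids circular dependencies between services.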
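A minimal sketch of the central-coordinator variant, assuming each step reports success as a boolean. `SagaStep` and `SagaOrchestrator` are hypothetical names; a real coordinator would persist its progress so that compensation can resume after a crash.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

// One saga step: a local transaction plus the action that undoes it.
class SagaStep {
    final BooleanSupplier action;
    final Runnable compensation;

    SagaStep(BooleanSupplier action, Runnable compensation) {
        this.action = action;
        this.compensation = compensation;
    }
}

class SagaOrchestrator {
    // Runs the steps in order. On the first failure, compensates the
    // already completed steps in reverse order and returns false.
    static boolean execute(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            if (step.action.getAsBoolean()) {
                completed.push(step);  // most recent step ends up on top
            } else {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

Unlike a rollback in 2PC, the intermediate results of completed steps are visible to other transactions until the compensations run, so a saga gives up isolation in exchange for never holding locks across services.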
Gossip Protocol
Used for data replication, peer‑to‑peer topology, and failure detection; it exemplifies a BASE‑based solution that tolerates eventual consistency.
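As a rough illustration of how gossip converges, the sketch below has each node push its versioned state to one randomly chosen peer per round, with the higher version winning on merge. `GossipNode` and `GossipCluster` are hypothetical names, and the scheme is deliberately simplified: real implementations add failure detection, digests instead of full state, and better conflict resolution than a single version number.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class GossipNode {
    // key -> {version, value}; the entry with the higher version wins.
    private final Map<String, long[]> state = new HashMap<>();

    void put(String key, long version, long value) {
        merge(key, version, value);
    }

    private void merge(String key, long version, long value) {
        long[] current = state.get(key);
        if (current == null || current[0] < version) {
            state.put(key, new long[] {version, value});
        }
    }

    // Push this node's full state to a peer.
    void pushTo(GossipNode peer) {
        for (Map.Entry<String, long[]> e : state.entrySet()) {
            peer.merge(e.getKey(), e.getValue()[0], e.getValue()[1]);
        }
    }

    // Returns the current value for the key, or null if unknown.
    Long get(String key) {
        long[] entry = state.get(key);
        return entry == null ? null : entry[1];
    }
}

class GossipCluster {
    // One round: every node pushes to one randomly chosen other node.
    static void round(List<GossipNode> nodes, Random random) {
        if (nodes.size() < 2) return;
        for (int i = 0; i < nodes.size(); i++) {
            int j = random.nextInt(nodes.size() - 1);
            if (j >= i) j++;  // skip self
            nodes.get(i).pushTo(nodes.get(j));
        }
    }
}
```

Between rounds, different nodes may return different values for the same key, which is the soft state that BASE accepts; repeated rounds spread the newest version through the cluster.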
All these solutions share common goals: add retry and rollback mechanisms, ensure idempotent interfaces, and, when possible, achieve exactly‑once processing to reduce manual intervention.
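Idempotency is what makes retries safe. One common approach, sketched here under the assumption that every message carries a unique id, is to record processed ids and ignore repeats. `IdempotentLedger` is a hypothetical name, and in a real system the id record and the balance update would be written in the same local database transaction.

```java
import java.util.HashSet;
import java.util.Set;

// Applies each credit at most once, no matter how often it is redelivered.
class IdempotentLedger {
    private final Set<String> processedIds = new HashSet<>();
    private long balance;

    // Returns true if the credit was applied, false if it was a duplicate.
    synchronized boolean credit(String messageId, long amount) {
        if (!processedIds.add(messageId)) {
            return false;  // already processed: ignore the redelivery
        }
        balance += amount;
        return true;
    }

    synchronized long balance() { return balance; }
}
```

With a handler like this, at-least-once delivery plus deduplication gives the effect of exactly-once processing from the business point of view.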
5. Conclusion
Choosing the right distributed transaction strategy requires balancing consistency, availability, and performance. Architects should evaluate the specific workload, consider the trade‑offs highlighted by CAP and BASE, and adopt the simplest solution that satisfies business requirements while keeping the system scalable.
References
Distributed TP: The XA Specification, X/Open, 1991 – https://publications.opengroup.org/c193
Harvest, Yield, and Scalable Tolerant Systems, Armando Fox & Eric Brewer, 1999 – https://cs.uwaterloo.ca/~brecht/servers/readings-new2/harvest-yield.pdf
Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition‑Tolerant Web Services, Seth Gilbert & Nancy Lynch, 2002 – https://pdfs.semanticscholar.org/24ce/ce61e2128780072bc58f90b8ba47f624bc27.pdf
BASE: An ACID Alternative, Dan Pritchett, 2008 – http://delivery.acm.org/10.1145/1400000/1394128/p48-pritchett.pdf
Consensus Protocols: Two‑Phase Commit, Henry Robinson, 2008 – http://the-paper-trail.org/blog/consensus-protocols-two-phase-commit/
Consensus Protocols: Three‑Phase Commit, Henry Robinson, 2008 – http://the-paper-trail.org/blog/consensus-protocols-three-phase-commit/
Life beyond Distributed Transactions: an Apostate’s Opinion, Pat Helland, 2007 – https://cs.brown.edu/courses/cs227/archives/2012/papers/weaker/cidr07p15.pdf
Sagas, Hector Garcia‑Molina & Kenneth Salem, 1987 – https://www.cs.cornell.edu/andru/cs711/2002fa/reading/sagas.pdf
dbaplus Community