Why Distributed Transactions Still Matter: Strategies Beyond 2PC
This article explores the challenges of distributed transactions in microservice architectures, explains consistency theories like CAP and BASE, compares classic 2PC with eBay's event‑queue approach, TCC compensation, and cache‑based eventual consistency, and offers practical guidance for choosing the right solution.
According to microservice pioneer Martin Fowler, distributed transactions should be avoided when possible, yet they remain unavoidable in many domains.
In engineering, discussions focus on strong consistency versus eventual consistency solutions.
Typical approaches include Two‑Phase Commit (2PC), eBay's event‑queue method, TCC compensation, and cache‑based eventual consistency.
Consistency Theory
Distributed transactions aim to ensure data consistency across databases, but cross‑database transactions face issues like permanent node failures, making ACID guarantees unrealistic.
The CAP theorem reminds us to balance consistency, availability, and partition tolerance.
2PC provides classic strong consistency but suffers from poor scalability; eBay architect Dan Pritchett introduced the BASE theory to address large‑scale consistency challenges.
BASE suggests sacrificing strong consistency at any moment to gain scalability.
01. CAP Theory
In distributed systems, consistency, availability, and partition tolerance can satisfy at most two simultaneously; partition tolerance is essential.
Consistency: whether data across nodes is strongly consistent.
Availability: whether the service remains responsive within a bounded time.
Partition Tolerance: tolerance to network partitions.
Examples: Cassandra, Dynamo prioritize AP; HBase, MongoDB prioritize CP.
02. BASE Theory
Key ideas:
Basically Available : tolerate some loss of availability during failures to keep core services running.
Soft State : allow intermediate states that do not affect overall availability.
Eventual Consistency : all replicas converge to the same state over time.
Consistency Models
Data consistency models are classified into:
Strong Consistency : all replicas reflect updates instantly, usually via synchronous mechanisms.
Weak Consistency : no guarantee of immediate visibility of updates.
Eventual Consistency : a form of weak consistency where updates eventually become visible.
These models can be analyzed using the Quorum NRW algorithm.
Distributed Transaction Solutions
01. 2PC Solution – Strong Consistency
2PC records transaction phases in logs, enabling recovery after component crashes. If the coordinator restarts, logs reveal whether the transaction is in the Prepare or Prepare‑All state, guiding rollback or commit actions.
Three problems of 2PC:
Blocking synchronization.
Data inconsistency.
Single point of failure.
3PC improves by adding timeout mechanisms and an extra preparation phase, though it still has drawbacks; protocols like Paxos or Raft can eliminate inconsistency at the protocol level.
02. eBay Event‑Queue Solution – Eventual Consistency
Dan Pritchett’s approach stores tasks in messages or logs for asynchronous execution, requiring idempotent service interfaces and retry mechanisms.
Example scenario: a transaction updates both a transaction table and a user table; the solution wraps these updates in a local transaction and records completed operations in an updates_applied table to avoid duplicate processing.
The core is second‑phase retry with idempotent execution, ensuring eventual consistency.
03. TCC (Try‑Confirm‑Cancel) Compensation Model – Eventual Consistency
In a microservice chain A→B→C→D, if D fails, B and C must be rolled back via compensating operations to maintain data consistency.
Key requirements:
Record the service call chain.
Each service provides opposite business logic for compensation, with idempotent rollback.
Execute different rollback strategies based on failure reasons.
Challenges include the difficulty of creating a universal compensation scheme due to diverse business logic and parameters.
04. Cache Data Eventual Consistency
Caches (Redis, Memcached) sit in front of databases to reduce I/O. Stale cache data can cause inconsistency, e.g., after a product update.
Two solutions:
Set cache expiration; after expiry, fetch fresh data from the database.
Invalidate cache immediately after updating the database, forcing subsequent reads to retrieve fresh data.
Selection Recommendations
When facing consistency issues, assess business tolerance for strong, weak, and eventual consistency, then choose a solution based on the specific scenario.
In practice, distributed transactions are often unavoidable; 2PC remains a viable option if no better alternative exists.
For e‑commerce or financial transactions, middleware‑level 2PC can hide business logic, making failure handling difficult; permanent node failures may prevent compensation.
Financial systems demand absolute data control; consider TCC or message‑queue‑based flexible transaction frameworks, implemented at the business layer using SOA frameworks like Dubbo or Spring Cloud.
Source: http://www.uml.org.cn/wfw/201709292.asp
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITFLY8 Architecture Home
ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
