Classic Solutions for Distributed Transactions: Theory and Practical Patterns
This article introduces the fundamental concepts of distributed transactions, explains ACID and BASE properties, and systematically presents classic solutions such as two‑phase commit, SAGA, TCC, local message tables, transaction messages, maximum‑effort notifications, and sub‑transaction barriers for reliable microservice architectures.
Fundamental Theory
Before discussing specific solutions, the article reviews the basic theory of distributed transactions, using a money transfer example to illustrate the need for atomic debit and credit operations that must either both succeed or both fail.
Transaction
A database transaction groups multiple statements into a single unit of work, guaranteeing the ACID properties: Atomicity, Consistency, Isolation, and Durability.
Distributed Transaction
In cross‑bank transfers, the operations span multiple databases, so a single local transaction cannot ensure ACID; a distributed transaction is required to coordinate the actions across nodes while balancing availability, performance, and consistency (BASE vs. ACID).
Solutions for Distributed Transactions
Two‑Phase Commit / XA
XA defines the interface between a global transaction manager (TM) and local resource managers (RM). It consists of a prepare phase where all participants lock resources, followed by a commit or rollback phase. Most mainstream databases (MySQL, Oracle, SQL Server, PostgreSQL) support XA.
When any participant fails to prepare, the TM instructs all prepared participants to roll back.
Characteristics: simple, easy to understand, but holds locks for a long time, reducing concurrency.
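The coordinator's control flow described above can be sketched in a few lines. This is an illustrative in-memory model, not the XA API: `Participant`, `TwoPhaseCommit`, and `fakeRM` are hypothetical names.

```go
// Illustrative sketch of a two-phase commit coordinator (TM).
// Participant, TwoPhaseCommit, and fakeRM are hypothetical names,
// not part of the XA interface.
package main

import (
	"errors"
	"fmt"
)

// Participant models a resource manager (RM) as seen by the TM.
type Participant interface {
	Prepare() error // phase 1: do the work, hold locks, vote yes/no
	Commit()        // phase 2: make the work durable, release locks
	Rollback()      // phase 2: undo the work, release locks
}

// TwoPhaseCommit drives both phases. If any participant fails to
// prepare, every participant that already prepared is rolled back.
func TwoPhaseCommit(ps []Participant) error {
	prepared := []Participant{}
	for _, p := range ps {
		if err := p.Prepare(); err != nil {
			for _, q := range prepared {
				q.Rollback()
			}
			return fmt.Errorf("prepare failed, rolled back: %w", err)
		}
		prepared = append(prepared, p)
	}
	for _, p := range prepared {
		p.Commit()
	}
	return nil
}

// fakeRM is an in-memory participant used only for demonstration.
type fakeRM struct {
	failPrepare bool
	state       string
}

func (f *fakeRM) Prepare() error {
	if f.failPrepare {
		return errors.New("vote no")
	}
	f.state = "prepared"
	return nil
}
func (f *fakeRM) Commit()   { f.state = "committed" }
func (f *fakeRM) Rollback() { f.state = "rolledback" }

func main() {
	a, b := &fakeRM{}, &fakeRM{failPrepare: true}
	err := TwoPhaseCommit([]Participant{a, b})
	fmt.Println(err != nil, a.state) // a was prepared, then rolled back
}
```

Note how the locks taken in `Prepare` are held until phase 2 completes for every participant; this is exactly the long lock-holding time that limits concurrency.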
SAGA
SAGA splits a long transaction into a series of local short transactions coordinated by a saga orchestrator. If any step fails, compensating actions are executed in reverse order.
Advantages: high concurrency, no long‑lasting locks. Disadvantages: every step needs explicit compensation logic, and consistency is weaker — intermediate states are visible, e.g., the debit may already be committed while the credit later fails and has to be compensated.
TCC (Try‑Confirm‑Cancel)
TCC, proposed by Pat Helland, consists of three phases: Try (resource reservation), Confirm (final execution, must be idempotent), and Cancel (resource release, also idempotent).
In a transfer scenario, Try reserves the amount, Confirm deducts it, and Cancel releases the reservation.
Characteristics: high concurrency, strong consistency, but higher development effort due to three interfaces.
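For the transfer scenario above, the three phases can be sketched against a single account. `Account`, `Try`, `Confirm`, and `Cancel` here are illustrative; a real TCC branch would also key Confirm/Cancel by transaction ID to make them idempotent:

```go
// Hypothetical TCC sketch for the transfer example. A production
// implementation must make Confirm/Cancel idempotent (e.g., keyed by
// a transaction ID); that bookkeeping is omitted here for brevity.
package main

import (
	"errors"
	"fmt"
)

// Account holds a usable balance plus an amount frozen by Try.
type Account struct {
	Balance int
	Frozen  int
}

// Try reserves the amount without spending it.
func (a *Account) Try(amount int) error {
	if a.Balance-a.Frozen < amount {
		return errors.New("insufficient funds")
	}
	a.Frozen += amount
	return nil
}

// Confirm turns the reservation into a real deduction.
func (a *Account) Confirm(amount int) {
	a.Frozen -= amount
	a.Balance -= amount
}

// Cancel releases the reservation.
func (a *Account) Cancel(amount int) {
	a.Frozen -= amount
}

func main() {
	src := &Account{Balance: 100}
	if err := src.Try(30); err == nil {
		src.Confirm(30)
	}
	fmt.Println(src.Balance, src.Frozen) // 70 0
}
```

The reservation in Try is what gives TCC its stronger consistency than SAGA: no other transaction can spend the frozen amount between Try and Confirm.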
Local Message Table
Originally described by Dan Pritchett (eBay), this pattern stores pending tasks in a local message table within the same transaction as the business operation, ensuring atomicity between business logic and message creation.
Features: simple decomposition of long transactions, but requires extra table management and polling.
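The pattern's shape can be shown with an in-memory stand-in. In a real system the two writes in `PlaceOrder` are one SQL transaction (INSERT the order row and INSERT the message row), and `Poll` is a background job that publishes unsent messages to a broker; all names below are hypothetical:

```go
// Illustrative in-memory sketch of the local message table pattern.
// In a real system both writes in PlaceOrder happen inside one
// database transaction, and Poll runs periodically; the db type and
// its methods are hypothetical names.
package main

import (
	"fmt"
	"sync"
)

type db struct {
	mu       sync.Mutex
	orders   []string
	messages []string // pending tasks, written with the business row
	sent     []string
}

// PlaceOrder writes the business row and the message row atomically.
func (d *db) PlaceOrder(id string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.orders = append(d.orders, id)
	d.messages = append(d.messages, "notify-shipping:"+id)
}

// Poll simulates the background job: deliver each pending message,
// then mark it sent so it is not delivered again.
func (d *db) Poll(publish func(string)) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for _, m := range d.messages {
		publish(m)
		d.sent = append(d.sent, m)
	}
	d.messages = nil
}

func main() {
	d := &db{}
	d.PlaceOrder("o-1")
	var delivered []string
	d.Poll(func(m string) { delivered = append(delivered, m) })
	fmt.Println(delivered, len(d.messages)) // [notify-shipping:o-1] 0
}
```

Because the message row is created in the same transaction as the order, either both exist or neither does — that is the atomicity the pattern buys, at the cost of the extra table and polling loop.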
Transaction Message
RocketMQ’s transactional message extends the local message table concept to a message broker, guaranteeing atomicity between message sending and local transaction execution.
The workflow includes sending a half‑message, executing the local transaction, then committing or rolling back the message based on the transaction outcome.
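That half-message workflow can be sketched with a toy broker. The `broker` type and its methods are hypothetical, modeling only the visibility rule — RocketMQ additionally checks back with the producer when neither commit nor rollback arrives, which is omitted here:

```go
// Illustrative sketch of RocketMQ-style transactional messaging: a
// half-message is invisible to consumers until the local transaction
// outcome commits or rolls it back. The broker type and its methods
// are hypothetical; the real broker also performs a check-back query
// against the producer when no outcome arrives in time.
package main

import "fmt"

type broker struct {
	half    map[string]string // half-messages, invisible to consumers
	visible []string          // committed, consumable messages
}

func newBroker() *broker { return &broker{half: map[string]string{}} }

// SendHalf stores the message but keeps it invisible to consumers.
func (b *broker) SendHalf(id, body string) { b.half[id] = body }

// Commit makes a half-message consumable; Rollback discards it.
func (b *broker) Commit(id string) {
	if body, ok := b.half[id]; ok {
		b.visible = append(b.visible, body)
		delete(b.half, id)
	}
}
func (b *broker) Rollback(id string) { delete(b.half, id) }

func main() {
	b := newBroker()
	b.SendHalf("tx1", "transfer done")
	localTxOK := true // execute the local transaction here
	if localTxOK {
		b.Commit("tx1")
	} else {
		b.Rollback("tx1")
	}
	fmt.Println(b.visible) // [transfer done]
}
```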
Maximum‑Effort Notification
This pattern strives to deliver business results to the receiver, using retry intervals and allowing the receiver to query the result if delivery fails, shifting reliability responsibility to the receiver.
AT Transaction Mode
Used in Seata (also called FMT), AT resembles XA but automates rollback without requiring explicit compensation code, at the cost of long‑lasting locks.
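The core idea — record a before-image when a branch executes, replay it on rollback instead of hand-written compensation — can be sketched in memory. This is only an illustration of the concept; Seata's actual implementation intercepts SQL and stores undo logs in the database:

```go
// Very rough sketch of the AT idea: capture a before-image on every
// update and replay the images in reverse on rollback. This is an
// in-memory illustration, not Seata's implementation (which parses
// SQL and persists undo logs alongside the business data).
package main

import "fmt"

type record struct {
	key   string
	value int
}

type branch struct {
	data    map[string]int
	undoLog []record // before-images captured automatically
}

// Update saves the before-image, then applies the change.
func (b *branch) Update(key string, value int) {
	b.undoLog = append(b.undoLog, record{key, b.data[key]})
	b.data[key] = value
}

// Rollback replays before-images in reverse order.
func (b *branch) Rollback() {
	for i := len(b.undoLog) - 1; i >= 0; i-- {
		u := b.undoLog[i]
		b.data[u.key] = u.value
	}
	b.undoLog = nil
}

func main() {
	b := &branch{data: map[string]int{"balance": 100}}
	b.Update("balance", 70)
	b.Rollback() // the global transaction failed elsewhere
	fmt.Println(b.data["balance"]) // 100
}
```

The rollback is automatic, but the rows touched must stay locked (globally) until the outcome is known — the long-lasting locks mentioned above.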
Network Exceptions in Distributed Transactions
Common issues include empty rollbacks (Cancel invoked for a Try that never executed), idempotency violations (the same phase invoked twice), and hanging (a delayed Try arriving after Cancel has already run, leaving resources reserved forever). The article illustrates these problems with a TCC sequence diagram.
Sub‑Transaction Barrier
DTM introduces a barrier table (sub_trans_barrier) that records the global transaction ID, branch ID, and phase (try|confirm|cancel). By inserting unique keys, it prevents empty rollbacks, enforces idempotency, and avoids hanging.
DTM exposes the barrier through a single helper:

```go
func ThroughBarrierCall(db *sql.DB, transInfo *TransInfo, busiCall BusiFunc)
```

Developers place their business logic inside busiCall; the barrier ensures the call is executed only when appropriate, handling empty rollbacks, idempotency, and hanging automatically.
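The decision logic behind the barrier can be sketched in memory. DTM's real implementation uses unique-key inserts into the barrier table; the `barrier` type and method names below are hypothetical stand-ins for those inserts:

```go
// In-memory sketch of the sub-transaction barrier decision logic.
// DTM implements the same idea with unique-key INSERTs into a
// barrier table; the barrier type here is a hypothetical stand-in.
package main

import "fmt"

type barrier struct{ seen map[string]bool }

func newBarrier() *barrier { return &barrier{seen: map[string]bool{}} }

// insert mimics an INSERT with a unique key on (gid, branch, op):
// it reports whether the row was newly created.
func (b *barrier) insert(gid, branch, op string) bool {
	k := gid + "|" + branch + "|" + op
	if b.seen[k] {
		return false
	}
	b.seen[k] = true
	return true
}

// Call runs busi only when the barrier allows it:
//   - a Cancel first tries to insert the Try row; if that insert
//     succeeds, Try never ran -> empty rollback, skip the business
//   - a duplicate (gid, branch, op) row -> repeated request, skip
//     (idempotency); this also catches a delayed Try arriving after
//     Cancel pre-inserted its row (hanging)
func (b *barrier) Call(gid, branch, op string, busi func()) {
	if op == "cancel" && b.insert(gid, branch, "try") {
		b.insert(gid, branch, "cancel") // remember the null compensation
		return                          // empty rollback: nothing to undo
	}
	if !b.insert(gid, branch, op) {
		return // duplicate or hanging request
	}
	busi()
}

func main() {
	b := newBarrier()
	calls := []string{}
	b.Call("g1", "b1", "cancel", func() { calls = append(calls, "cancel") }) // arrives before Try
	b.Call("g1", "b1", "try", func() { calls = append(calls, "try") })       // delayed Try
	fmt.Println(calls) // [] — both skipped, no resource is leaked
}
```

A single insert-and-check per phase thus handles all three anomalies at once, which is why the pattern is considerably simpler than hand-rolling checks in every branch.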
Summary
The article covered the fundamentals of distributed transactions, compared classic solutions (XA, SAGA, TCC, local/message tables, AT), discussed network‑related anomalies, and presented the sub‑transaction barrier technique as an elegant way to simplify reliable distributed transaction handling.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.