Mastering Distributed Transactions: From Fundamentals to Advanced Solutions
This comprehensive guide explains the core concepts of transactions, local and distributed transaction models, the CAP theorem, various distributed transaction solutions such as 2PC, TCC, reliable messaging, and maximum effort notifications, and compares their trade‑offs for modern microservice architectures.
Distributed Transactions
1. Basic Concepts
1.1 What Is a Transaction
A transaction is like buying something at a store: payment and delivery must both succeed, otherwise the whole operation is rolled back.
In computing, a transaction is a large activity composed of smaller actions that must either all succeed or all fail.
1.2 Local Transactions
Local transactions are typically managed by relational databases using the database's own transaction features. Because the application and database often reside on the same server, these are called local (or database) transactions.
Four ACID properties:
Atomicity : all operations succeed or none do.
Consistency : data constraints remain valid before and after the transaction.
Isolation : concurrent transactions do not interfere; isolation levels prevent dirty reads, repeatable reads, etc.
Durability : once committed, changes are persisted and cannot be rolled back.
1.3 Distributed Transactions
With the shift from monolithic to micro‑service architectures, a single business operation may involve multiple services that must coordinate over the network, forming a distributed transaction (e.g., user registration with points, order creation with inventory deduction, bank transfers).
In a distributed environment, a local transaction such as "decrease Zhang's balance" may succeed while a remote call to increase Li's balance fails, leading to data inconsistency.
1.4 Scenarios That Generate Distributed Transactions
Micro‑service architecture: services call each other remotely (e.g., order service calls inventory service).
Monolithic system accessing multiple database instances (e.g., user info and order info stored in separate MySQL instances).
Multiple services accessing the same database instance (each holds its own connection, causing cross‑JVM coordination).
Fundamental Theory of Distributed Transactions
2.1 CAP Theory
CAP stands for Consistency, Availability, Partition tolerance. In a distributed system you can only guarantee two of the three.
C – Consistency : after a write, all reads see the latest data.
A – Availability : every request receives a response (no timeout or error).
P – Partition tolerance : the system continues to operate despite network partitions.
CAP Combination Modes
AP : sacrifice consistency for availability and partition tolerance (e.g., eventual consistency).
CP : sacrifice availability for consistency and partition tolerance (e.g., strong consistency systems like Zookeeper).
CA : sacrifice partition tolerance; typical single‑node relational databases.
2.2 BASE Theory
BASE expands on AP: Basically Available, Soft state, Eventually consistent. It describes "soft" transactions that tolerate temporary inconsistency.
Basically Available : core functionality remains available even if some parts fail.
Soft state : intermediate states are allowed (e.g., "payment in progress").
Eventually consistent : data converges to a consistent state after some time.
2PC (Two‑Phase Commit) Solution
3.1 What Is 2PC
2PC splits a transaction into a Prepare phase and a Commit phase. All participants must prepare successfully before the coordinator can commit.
begin transaction;
// 1. Local DB operation: decrease Zhang's balance
// 2. Remote call: increase Li's balance
commit transaction;3.2 Solutions
3.2.1 XA Scheme
Traditional databases (Oracle, MySQL) support XA, which implements 2PC at the DB layer. The Open Group defines a Distributed Transaction Processing (DTP) reference model with three roles:
AP (Application Program): the client using the transaction.
RM (Resource Manager): the database instance controlling a branch transaction.
TM (Transaction Manager): coordinates the global transaction.
Typical flow: the TM asks each RM to prepare; if any fails, TM tells all RMs to roll back; otherwise TM tells them to commit.
3.2.2 Seata Scheme
Seata (formerly Fescar) is an open‑source distributed‑transaction framework that moves the RM to the application side, reducing lock time. It defines three components:
TC (Transaction Coordinator): maintains global transaction state.
TM (Transaction Manager): embedded in the application, starts global transactions.
RM (Resource Manager): controls branch transactions.
Seata executes the commit in the first phase, releasing locks earlier and improving performance.
3.3 Summary
Seata’s 0‑intrusion design and early lock release make it a preferred 2PC implementation.
3PC (Three‑Phase Commit) – Not Covered
TCC (Try‑Confirm‑Cancel) Solution
4.1 What Is TCC
TCC requires each branch to implement three operations: Try (pre‑check and reserve resources), Confirm (finalize), and Cancel (rollback). The TM orchestrates these phases, retrying Confirm/Cancel if needed.
4.2 TCC Frameworks
Popular frameworks include Hmily, ByteTCC, EasyTransaction, and others. Hmily provides nested transaction support, uses AOP for interception, and stores logs in MySQL, Redis, MongoDB, etc.
// Example of a TCC try/confirm/cancel for account A
try: check balance, deduct 30
confirm: (no action needed)
cancel: add back 304.3 Summary
TCC offers high throughput but requires significant application changes and careful handling of idempotency, empty rollbacks, and hanging transactions.
Reliable Message (Eventually Consistent) Solution
5.1 What Is Reliable Message Consistency
The producer must ensure that a local transaction and the message send are atomic. The consumer must reliably receive and process the message, handling possible duplicates via idempotency.
5.2 Solutions
5.2.1 Local Message Table
Store messages in a local table within the same transaction as the business operation, then a scheduled task scans the table and sends pending messages to the broker.
5.2.2 RocketMQ Transactional Messages
RocketMQ supports transactional messages where the broker holds a "prepared" message until the producer reports commit or rollback, guaranteeing atomicity between local DB changes and message delivery.
public interface RocketMQLocalTransactionListener {
RocketMQLocalTransactionState executeLocalTransaction(Message msg, Object arg);
RocketMQLocalTransactionState checkLocalTransaction(Message msg);
}5.3 Summary
Reliable messaging solves the atomicity of local DB changes and message sending, and ensures consumer reliability, making it suitable for long‑running, low‑latency‑requirement scenarios such as registration‑bonus or coupon distribution.
Maximum Effort Notification
6.1 What Is Maximum Effort Notification
This approach tries hard to push a result notification to the receiver (e.g., payment result). If the push fails, the receiver can query the sender for the final status.
6.2 Solutions
Both solutions rely on MQ's ACK mechanism: either the receiver directly consumes the MQ message, or a notification service consumes the message and forwards it via HTTP to the receiver.
6.3 Summary
Maximum effort notification provides eventual consistency with high throughput and low implementation complexity, suitable for scenarios like bank or payment result notifications.
Comparison of Distributed Transaction Solutions
2PC
TCC
Reliable Message
Maximum Effort Notification
Consistency
Strong
Eventual
Eventual
Eventual
Throughput
Low
Medium
High
High
Implementation Complexity
Easy
Hard
Medium
Easy
In practice, choose the simplest solution that meets the business requirements; avoid over‑using distributed transactions when a single‑source local transaction suffices.
Author: 六脉神剑 – Source: juejin.cn/post/6844904003344531463 – © Original author.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architect's Guide
Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
