Why We Skipped TCC: Using Compensation and Local Message Tables for Distributed Transactions
This article explains common distributed‑transaction solutions, describes the author’s own compensation‑plus‑local‑message‑table approach for merging customers across databases, outlines the TCC model with its Try‑Confirm‑Cancel phases, and details why TCC was rejected due to complexity, long resource locks, and rollback issues.
Preface
Hello, I’m Tianluo. Today I’m sharing an Alibaba interview question: How do you solve distributed transactions? Why didn’t you consider TCC at the time? What are TCC’s drawbacks?
1. How we solve distributed transactions
Industry solutions mainly include:
Two‑phase commit
Three‑phase commit
TCC
Local message table
Saga transaction
RocketMQ transactional messages
In our project we used a compensation + local message table approach.
Requirement background
Regulatory requirements demand merging customer A into customer B. Because we shard by customer ID, A and B may reside in different databases, creating a distributed‑transaction scenario.
We merge by backing up A’s data, deleting it, replacing A’s ID with B’s, and inserting the data into B’s database.
Core process
1. Main transaction (A‑database):
Step 1: Back up A’s data to a local backup table (local transaction).
Step 2: Delete A’s original data (local transaction).
Step 3: Record the merge relation (“A to be merged into B”) as a local message so the merge can be retried or compensated later.
2. Sub transaction (B‑database):
Step 4: Consume the local message and insert A’s data, now carrying B’s ID, into B’s database.
Step 5: If insertion fails, trigger compensation: restore A’s data and delete any dirty B data.
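The steps above can be sketched roughly as follows. This is a minimal illustration, not our production code: it uses two in-memory SQLite databases to stand in for the A and B shards, and the table names (customer_data, customer_backup, merge_task), the merged_from column, and the function names are all hypothetical. It also assumes Steps 1 to 3 share one local transaction on A’s database, which is what makes the merge record trustworthy as a local message.

```python
import sqlite3

# Hypothetical schema; every shard has the same tables.
SCHEMA = """
CREATE TABLE customer_data (customer_id TEXT, item TEXT, merged_from TEXT);
CREATE TABLE customer_backup (customer_id TEXT, item TEXT);
CREATE TABLE merge_task (from_id TEXT, to_id TEXT, status TEXT);
"""


def open_shard() -> sqlite3.Connection:
    """Create an in-memory database standing in for one customer shard."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def start_merge(db_a: sqlite3.Connection, from_id: str, to_id: str) -> None:
    """Steps 1-3: back up, delete, and record the merge in one local transaction."""
    with db_a:  # commits on success, rolls back if any statement raises
        db_a.execute(
            "INSERT INTO customer_backup "
            "SELECT customer_id, item FROM customer_data WHERE customer_id = ?",
            (from_id,),
        )
        db_a.execute("DELETE FROM customer_data WHERE customer_id = ?", (from_id,))
        db_a.execute(
            "INSERT INTO merge_task VALUES (?, ?, 'PENDING')", (from_id, to_id)
        )


def apply_merge(db_a: sqlite3.Connection, db_b: sqlite3.Connection,
                from_id: str, to_id: str) -> None:
    """Step 4: insert A's backed-up rows into B's shard under B's ID."""
    rows = db_a.execute(
        "SELECT item FROM customer_backup WHERE customer_id = ?", (from_id,)
    ).fetchall()
    with db_b:
        # merged_from tags the rows so compensation can find them later.
        db_b.executemany(
            "INSERT INTO customer_data VALUES (?, ?, ?)",
            [(to_id, item, from_id) for (item,) in rows],
        )
    with db_a:
        db_a.execute(
            "UPDATE merge_task SET status = 'DONE' WHERE from_id = ? AND to_id = ?",
            (from_id, to_id),
        )
```

If apply_merge raises before the task is marked DONE, the record stays PENDING, which is the signal Step 5 relies on.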
Compensation logic
If Step 4 fails (e.g., B‑insert exception), a scheduled task finds the pending merge record and performs compensation: restore A’s data from the backup table and delete any inserted dirty data in B.
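A scheduled compensation job along these lines might look like the sketch below. It is only an illustration under assumptions: the schema, the merged_from column used to tell merged rows apart from B’s own rows, the COMPENSATED status, and the function names are hypothetical, and a real job would probably also filter pending records by age or by an explicit failure marker rather than compensating every pending record.

```python
import sqlite3

# Hypothetical schema; every shard has the same tables.
SCHEMA = """
CREATE TABLE customer_data (customer_id TEXT, item TEXT, merged_from TEXT);
CREATE TABLE customer_backup (customer_id TEXT, item TEXT);
CREATE TABLE merge_task (from_id TEXT, to_id TEXT, status TEXT);
"""


def open_shard() -> sqlite3.Connection:
    """Create an in-memory database standing in for one customer shard."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def compensate_pending(db_a: sqlite3.Connection, db_b: sqlite3.Connection) -> int:
    """Undo every pending merge; return how many records were compensated."""
    pending = db_a.execute(
        "SELECT from_id, to_id FROM merge_task WHERE status = 'PENDING'"
    ).fetchall()
    for from_id, to_id in pending:
        # 1. Remove dirty rows in B first. The delete is idempotent, so a
        #    crash after this step is safe: the next run simply repeats it.
        with db_b:
            db_b.execute(
                "DELETE FROM customer_data WHERE customer_id = ? AND merged_from = ?",
                (to_id, from_id),
            )
        # 2. Restore A and close the task in one local transaction, so the
        #    restore can never be applied twice.
        with db_a:
            db_a.execute(
                "INSERT INTO customer_data (customer_id, item) "
                "SELECT customer_id, item FROM customer_backup WHERE customer_id = ?",
                (from_id,),
            )
            db_a.execute(
                "UPDATE merge_task SET status = 'COMPENSATED' "
                "WHERE from_id = ? AND to_id = ?",
                (from_id, to_id),
            )
    return len(pending)
```

Cleaning B before restoring A is a deliberate ordering choice in this sketch: the step that cannot share a transaction with the status update is the one that is harmless to repeat.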
2. How TCC solves distributed transactions
TCC adopts a compensation mechanism whose core idea is to register a confirm and a cancel (undo) operation for each business operation.
2.1 TCC basic flow
Try phase : attempt execution, perform consistency checks, and reserve the required resources.
Confirm phase : commit the business operation without further checks, on the assumption that Try succeeded.
Cancel phase : if the business operation fails, release the resources reserved in Try and roll back any Confirm actions already applied.
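To make the three phases concrete, here is a minimal coordinator sketch. The names TccParticipant, LoggingParticipant, and run_tcc are hypothetical and do not come from any TCC framework, and the sketch leaves out what a real coordinator needs most, namely persisting transaction state and retrying Confirm and Cancel until they succeed.

```python
from typing import List, Protocol


class TccParticipant(Protocol):
    """The three operations every participating service has to expose."""

    def try_(self, tx_id: str) -> None: ...
    def confirm(self, tx_id: str) -> None: ...
    def cancel(self, tx_id: str) -> None: ...


class LoggingParticipant:
    """Toy participant that only records which phase was called."""

    def __init__(self, name: str, log: List[str], fail_try: bool = False):
        self.name, self.log, self.fail_try = name, log, fail_try

    def try_(self, tx_id: str) -> None:
        if self.fail_try:
            raise RuntimeError(f"{self.name}: cannot reserve resources")
        self.log.append(f"{self.name}.try")

    def confirm(self, tx_id: str) -> None:
        self.log.append(f"{self.name}.confirm")

    def cancel(self, tx_id: str) -> None:
        self.log.append(f"{self.name}.cancel")


def run_tcc(tx_id: str, participants: List[TccParticipant]) -> bool:
    """Try everyone; Confirm if all Try calls succeed, otherwise Cancel."""
    tried: List[TccParticipant] = []
    try:
        for p in participants:
            p.try_(tx_id)
            tried.append(p)
    except Exception:
        # Release whatever was reserved, in reverse order.
        for p in reversed(tried):
            p.cancel(tx_id)
        return False
    for p in participants:
        p.confirm(tx_id)
    return True
```

With two participants where the second fails in Try, the first is tried and then cancelled, and no Confirm is ever sent.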
2.2 TCC example: user buying a gift
Assume user A has 100 coins and 5 gifts. A spends 10 coins to order 10 roses. Balance, order, and gifts are stored in different databases.
Try phase :
Generate an order record with status “pending confirmation”.
Freeze 10 coins, reducing available balance to 90.
Increase the user’s gift count from 5 to 15 (pre‑increase 10).
If Try succeeds, proceed to Confirm; any exception triggers Cancel.
Confirm phase :
Update order status to “paid”.
Clear the 10 frozen coins, confirming the balance at 90.
Confirm the gift count at 15 and clear the pre‑increase.
If any exception occurs, move to Cancel; otherwise the transaction ends successfully.
Cancel phase :
Set order status to “canceled”.
Restore user balance to 100.
Reset gift count to 5.
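The coin side of this example can be sketched as a small class. CoinAccount and its method names are hypothetical, and the state lives in memory purely for illustration; the order and gift services would follow the same three-method shape against their own databases.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CoinAccount:
    available: int
    frozen: Dict[str, int] = field(default_factory=dict)  # tx_id -> frozen coins

    def try_freeze(self, tx_id: str, amount: int) -> None:
        """Try: check the balance and move coins from available to frozen."""
        if amount > self.available:
            raise ValueError("insufficient balance")
        self.available -= amount
        self.frozen[tx_id] = amount

    def confirm(self, tx_id: str) -> None:
        """Confirm: the frozen coins are spent, so just drop the freeze."""
        self.frozen.pop(tx_id, None)  # no-op if already confirmed

    def cancel(self, tx_id: str) -> None:
        """Cancel: return the frozen coins to the available balance."""
        self.available += self.frozen.pop(tx_id, 0)  # no-op if nothing frozen


account = CoinAccount(available=100)
account.try_freeze("tx-1", 10)  # available 90, 10 frozen
account.confirm("tx-1")         # available stays 90, nothing frozen
```

Had the transaction been cancelled instead of confirmed, cancel("tx-1") would have put the 10 coins back and restored the balance to 100. Using pop with a default keeps Confirm and Cancel safe to call more than once.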
3. Why we didn’t consider TCC and its drawbacks
We avoided TCC because the customer‑merge scenario does not fit well; the Try phase would be hard to define. Additionally, TCC has obvious drawbacks:
3.1 Development complexity and intrusiveness
Each participating service must explicitly implement three interfaces (Try, Confirm, Cancel), forcing a split of what could be a single atomic operation into three stages, increasing code complexity and error‑proneness.
3.2 Long resource lock time
During the window between a successful Try and the final Confirm/Cancel, reserved resources (e.g., frozen funds, locked inventory) remain unavailable, reducing overall system utilization and concurrency.
3.3 Empty rollback and hanging problems
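Empty rollback : a Cancel is received even though Try never executed, typically because the Try request was lost or timed out. The fix is to check for a corresponding Try record before performing any Cancel action, and to skip the rollback if none exists.
Hanging : the Try request is delayed (for example, by network congestion) and arrives after Cancel has already run, so it reserves resources that no Confirm or Cancel will ever release. The fix is to check for an existing Cancel record before executing Try.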
Empty rollback: Cancel arrives without a prior Try, so Cancel needs a guard.
Hanging: Try arrives after Cancel, so Try needs a guard.
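Both guards can be sketched with a per-transaction status record. GuardedParticipant and its status values are hypothetical; in a real service the status would presumably be a row in a transaction-control table, written in the same local transaction as the business change, rather than an in-memory dict.

```python
from typing import Dict


class GuardedParticipant:
    """Participant with guards against empty rollback and hanging."""

    def __init__(self) -> None:
        self.status: Dict[str, str] = {}    # tx_id -> "TRIED" | "CANCELLED"
        self.reserved: Dict[str, int] = {}  # tx_id -> reserved amount

    def try_(self, tx_id: str, amount: int) -> bool:
        """Reserve resources unless this transaction was already seen."""
        if tx_id in self.status:
            # Try guard: a Cancel (or an earlier Try) is already recorded,
            # so a late Try must not reserve anything.
            return False
        self.status[tx_id] = "TRIED"
        self.reserved[tx_id] = amount
        return True

    def cancel(self, tx_id: str) -> bool:
        """Release the reservation; return True only if something was released."""
        state = self.status.get(tx_id)
        if state is None:
            # Cancel guard: no Try record, so this is an empty rollback.
            # Record the Cancel so a delayed Try is rejected later.
            self.status[tx_id] = "CANCELLED"
            return False
        if state == "TRIED":
            self.reserved.pop(tx_id)
            self.status[tx_id] = "CANCELLED"
            return True
        return False  # already cancelled: repeat calls are harmless
```

The empty rollback still writes a CANCELLED record. That record is what lets the Try guard recognise a Try that shows up afterwards, so the two guards only work as a pair.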
IT Services Circle
