
Distributed Transaction Solutions: Theory, Patterns, and Practical Implementations

This article explains the fundamentals of distributed transactions, illustrates classic solutions such as two‑phase commit (XA), SAGA, TCC, local message tables, transactional messaging, and maximum‑effort notifications, and introduces a sub‑transaction barrier technique to handle network anomalies in microservice architectures.

Top Architect
Source: segmentfault.com/a/1190000040321750

Basic Theory

Before discussing specific solutions, we review the basic concepts of distributed transactions. Using a money‑transfer example, user A transfers 100 units to user B, requiring A's balance to decrease by 100 and B's to increase by 100 atomically.

Transaction

A database transaction groups multiple statements into a single unit that either succeeds completely or fails completely, guaranteeing the ACID properties: Atomicity, Consistency, Isolation, Durability.

Atomicity : All operations succeed or none do; on error the transaction rolls back to the state before it began.

Consistency : The database remains in a valid state before and after the transaction (constraints are preserved).

Isolation : Concurrent transactions do not interfere with each other, preventing inconsistent reads.

Durability : Once committed, changes survive system failures.

Distributed Transaction

In cross‑bank transfers the operations span multiple databases, so a local transaction cannot guarantee ACID; a distributed transaction is required to ensure correct execution across nodes.

Distributed transactions relax some ACID constraints to improve availability and performance, following BASE principles (Basic Availability, Soft state, Eventual consistency) while still preserving essential ACID aspects such as atomicity and durability.

Solutions for Distributed Transactions

Two‑Phase Commit / XA

XA, defined by the X/Open group, specifies the interface between a global transaction manager (TM) and local resource managers (RM). Most mainstream databases (MySQL, Oracle, SQL Server, PostgreSQL) support XA.

The protocol has two phases: prepare (participants lock required resources and report readiness) and commit/rollback (TM decides and instructs participants).

Diagram omitted.
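The two phases can be sketched in a few lines of Go. The Participant interface and mockRM type below are illustrative stand-ins for real XA resource managers, not any driver's actual API:

```go
package main

import "fmt"

// Participant is a resource manager (RM) in a two-phase commit.
// This interface is an illustrative sketch, not the actual XA API.
type Participant interface {
	Prepare() error  // phase 1: lock resources and vote
	Commit() error   // phase 2: make the prepared work durable
	Rollback() error // phase 2: undo the prepared work
}

// TwoPhaseCommit plays the transaction manager (TM): it commits only
// if every participant votes yes in the prepare phase; otherwise it
// rolls back the participants that already prepared.
func TwoPhaseCommit(parts []Participant) error {
	for i, p := range parts {
		if err := p.Prepare(); err != nil {
			for _, q := range parts[:i] {
				q.Rollback() // best effort; a real TM retries and logs
			}
			return fmt.Errorf("prepare failed, transaction rolled back: %w", err)
		}
	}
	for _, p := range parts {
		p.Commit()
	}
	return nil
}

// mockRM is a toy resource manager used to exercise the protocol.
type mockRM struct {
	failPrepare                     bool
	prepared, committed, rolledBack bool
}

func (m *mockRM) Prepare() error {
	if m.failPrepare {
		return fmt.Errorf("vote: no")
	}
	m.prepared = true
	return nil
}
func (m *mockRM) Commit() error   { m.committed = true; return nil }
func (m *mockRM) Rollback() error { m.rolledBack = true; return nil }

func main() {
	a, b := &mockRM{}, &mockRM{}
	fmt.Println(TwoPhaseCommit([]Participant{a, b}), a.committed, b.committed)
}
```

Note how one "no" vote in phase 1 forces every already-prepared participant to roll back, which is exactly where the long lock-holding comes from.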

Pros: simple and easy to understand; development is straightforward.

Cons: resources stay locked for the duration of the protocol, reducing concurrency.

Further reading: DTM for Go, Seata for Java.

SAGA

SAGA splits a long transaction into a series of local short transactions coordinated by a saga orchestrator. If any step fails, compensating actions are executed in reverse order.

Diagram omitted.

Pros: high concurrency, with no long-term resource locks.

Cons: both a forward action and a compensating action must be defined for every step, increasing development effort. Consistency is also weaker: for example, A's account may already be debited when the overall transfer ultimately fails.

Further reading: DTM for Go, Seata for Java.

TCC (Try‑Confirm‑Cancel)

Proposed by Pat Helland (2007), TCC consists of three phases: Try (resource reservation and business checks), Confirm (final execution, must be idempotent), and Cancel (release reserved resources).

Diagram omitted.
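The reserve/confirm/release idea can be illustrated on a single account. The tccAccount type below is a toy; a real service would persist this state and make all three operations idempotent:

```go
package main

import (
	"errors"
	"fmt"
)

// tccAccount sketches TCC-style resource reservation on one account:
// Try freezes funds, Confirm actually deducts them, Cancel releases
// the freeze. Names are illustrative.
type tccAccount struct {
	balance int // total funds
	frozen  int // funds reserved by Try but not yet confirmed
}

func (a *tccAccount) Try(amount int) error {
	if a.balance-a.frozen < amount {
		return errors.New("insufficient available funds")
	}
	a.frozen += amount // reserve only; the balance is untouched
	return nil
}

func (a *tccAccount) Confirm(amount int) {
	a.frozen -= amount
	a.balance -= amount // the final deduction happens here
}

func (a *tccAccount) Cancel(amount int) {
	a.frozen -= amount // release the reservation
}

func main() {
	a := &tccAccount{balance: 100}
	if err := a.Try(30); err == nil {
		a.Confirm(30)
	}
	fmt.Println(a.balance, a.frozen) // 70 0
}
```

Because Try only moves funds into the frozen bucket, an aborted transaction never leaves a visible partial debit, which is the consistency advantage over SAGA.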

Pros: higher concurrency, with no long-held locks; better consistency than SAGA, since Try only reserves resources and there is no partial debit without a commit.

Cons: more development effort, since every branch must implement all three interfaces.

TCC suits order-style business where constraints on intermediate states are strict.

Further reading: DTM for Go, Seata for Java.

Local Message Table

Originated from an eBay architecture paper (2008). The idea is to store tasks that need distributed processing as messages, ensuring atomicity between the business operation and message creation within a single local transaction.

Diagram omitted.
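The essential trick is that the business write and the message write share one local transaction. The in-memory sketch below stands in for that transaction; in practice both writes would go through the same sql.Tx against the producer's database:

```go
package main

import (
	"errors"
	"fmt"
)

// producerDB simulates the producer's database. In a real system the
// balance update and the message row would be written inside one
// local database transaction, which is what makes them atomic.
type producerDB struct {
	balances map[string]int
	messages []string // the "local message table", later drained by a poller
}

// DebitAndEnqueue applies the business change and records the message
// in the same "transaction": either both happen or neither does.
func (db *producerDB) DebitAndEnqueue(user string, amount int, msg string) error {
	bal, ok := db.balances[user]
	if !ok || bal < amount {
		return errors.New("insufficient funds") // nothing written: implicit rollback
	}
	db.balances[user] = bal - amount
	db.messages = append(db.messages, msg)
	return nil
}

func main() {
	db := &producerDB{balances: map[string]int{"A": 100}}
	_ = db.DebitAndEnqueue("A", 100, "credit B by 100")
	fmt.Println(db.balances["A"], len(db.messages)) // 0 1
}
```

A separate poller then reads the messages table and delivers each message to the consumer, retrying until it is acknowledged and marked done.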

Pros: long transactions are broken into a series of simple tasks.

Cons: an extra message table is required on the producer side, and each message table must be polled. If the consumer still cannot succeed after retries, an additional rollback mechanism is needed.

This pattern suits asynchronous business in which later steps never need to be rolled back.

Transactional Message

RocketMQ (>=4.3) provides transactional messages, embedding the local‑message‑table concept into the broker and guaranteeing atomicity between message send and local transaction.

Process: send a half‑message, execute the local transaction, then commit (make the message visible) or rollback.

Diagram omitted.

Pros: simple to use; a long transaction is split into small tasks, and the broker's check-back interface covers producer-side failures.

Cons: if the consumer still cannot succeed after retries, additional rollback logic is required.

Further reading: RocketMQ documentation, DTM simple implementation.

Maximum‑Effort Notification

This pattern tries to deliver the business result to the receiver as much as possible, using retry mechanisms and a query interface for the receiver to pull results if notifications fail.

Requirements:

Provide an API for the receiver to query the processing result.

Retry notifications on an escalating schedule (for example 1 min, 5 min, 10 min, 30 min, 1 h, 2 h, 5 h, 10 h) until the receiver acknowledges or a configured window expires.

AT Transaction Mode

Used in Alibaba's Seata (also called FMT). It resembles XA but does not require manual compensation; the framework automatically rolls back, though it still holds locks for a relatively long time.

Network Exceptions in Distributed Transactions

Various failures can occur during the two-phase process: duplicated requests (which require idempotent handlers), empty rollback (Cancel invoked although Try never executed), and hanging (a delayed Try arriving after Cancel has already run). The article illustrates these cases with TCC examples and a sequence diagram.

Diagram omitted.

Sub‑Transaction Barrier

DTM introduces a sub‑transaction barrier that records branch status in a table (gid‑branch‑stage) to guarantee empty‑rollback control, idempotency, and hanging prevention.

Key steps:

Start a local transaction.

For a Try branch, insert a unique key (gid‑branch‑try); if successful, execute the protected logic.

For a Confirm branch, insert (gid‑branch‑confirm) and execute if insertion succeeds.

For a Cancel branch, first insert (gid‑branch‑try) and then (gid‑branch‑cancel). If the Try insertion succeeds, Try never ran, so the Cancel business logic is skipped, achieving empty‑rollback control; and because the Try record now exists, a late‑arriving Try is also blocked, preventing hanging.

Commit or rollback the local transaction based on the protected logic result.
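The steps above can be simulated with an in-memory barrier. The real implementation performs unique-key INSERTs inside the local database transaction, but the control flow is the same (names are illustrative):

```go
package main

import "fmt"

// barrier simulates DTM's sub-transaction barrier table with an
// in-memory set keyed by gid-branch-op.
type barrier struct{ seen map[string]bool }

// insert returns true only the first time a key is written,
// mimicking an INSERT that fails on a duplicate unique key.
func (b *barrier) insert(gid, branch, op string) bool {
	k := gid + "-" + branch + "-" + op
	if b.seen[k] {
		return false
	}
	b.seen[k] = true
	return true
}

// Call applies the barrier rules; busi is the protected business
// logic for this branch.
func (b *barrier) Call(gid, branch, op string, busi func() error) error {
	if op == "cancel" {
		tryNeverRan := b.insert(gid, branch, "try") // claim the try slot
		if !b.insert(gid, branch, "cancel") {
			return nil // duplicate cancel: idempotent skip
		}
		if tryNeverRan {
			return nil // empty rollback: Try never executed, skip business
		}
		return busi()
	}
	if !b.insert(gid, branch, op) {
		return nil // duplicate try/confirm: idempotent skip
	}
	return busi()
}

func main() {
	b := &barrier{seen: map[string]bool{}}
	ran := false
	b.Call("g1", "b1", "cancel", func() error { ran = true; return nil }) // empty rollback
	b.Call("g1", "b1", "try", func() error { ran = true; return nil })    // late Try blocked
	fmt.Println(ran) // false: empty rollback and hanging both prevented
}
```

Note how a single mechanism, inserting by unique key and acting on the result, covers all three anomalies at once.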

Example Go API:

func ThroughBarrierCall(db *sql.DB, transInfo *TransInfo, busiCall BusiFunc)

The barrier ensures that business code runs only when appropriate, handling empty rollback, idempotency, and hanging automatically, thus greatly reducing developer burden.

Conclusion

The article presented the basic theory of distributed transactions, surveyed common solutions (XA, SAGA, TCC, local message tables, transactional messages, maximum‑effort notification, AT mode), analyzed failure scenarios, and introduced a sub‑transaction barrier technique that simplifies handling of network anomalies in distributed systems.

Written by Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
