
Understanding Distributed Transactions and the XA Two‑Phase Commit Protocol

The article explains how distributed transactions work in microservice architectures, using inventory‑order examples and a World of Warcraft raid analogy to illustrate the XA two‑phase commit protocol, its normal and failure flows, limitations, and alternative approaches such as three‑phase commit, message‑queue and TCC transactions.

Architecture Digest

What if there is no distributed transaction?

In a microservice system, imagine a typical e‑commerce transaction where an inventory service and an order service each maintain their own databases. A product purchase first calls the inventory service to deduct stock, then calls the order service to create an order record.

Under normal conditions both databases are updated successfully and data remains consistent.

In abnormal cases the inventory deduction may succeed while the order insertion fails, leading to data inconsistency.
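
The failure case can be reproduced with a minimal sketch. The two in-memory structures stand in for the separate databases, and `deduct_stock`, `create_order`, `purchase`, and the `fail_order` flag are hypothetical names used only for illustration.

```python
# Two independent "databases", one per service (in-memory stand-ins).
inventory_db = {"sku-1": 10}
order_db = []


def deduct_stock(sku, qty):
    """Inventory service: commits its local change immediately."""
    if inventory_db[sku] < qty:
        raise ValueError("insufficient stock")
    inventory_db[sku] -= qty


def create_order(sku, qty, fail=False):
    """Order service: 'fail' simulates a crash or a database error."""
    if fail:
        raise RuntimeError("order database unavailable")
    order_db.append({"sku": sku, "qty": qty})


def purchase(sku, qty, fail_order=False):
    """Call both services in sequence with no shared transaction."""
    deduct_stock(sku, qty)
    create_order(sku, qty, fail=fail_order)


try:
    purchase("sku-1", 1, fail_order=True)
except RuntimeError:
    pass

# Stock was deducted, but no order exists: the two databases disagree.
print(inventory_db, order_db)
```

Because each service commits on its own, nothing undoes the stock deduction once the order insertion fails.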

What is a distributed transaction?

A distributed transaction guarantees data consistency across multiple nodes in a distributed system. The best-known approach is the XA specification, standardized by X/Open (now The Open Group); the Tuxedo transaction monitor, today an Oracle product, was one of its earliest implementations.

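XA is built on two‑phase commit (2PC). Three‑phase commit (3PC) is a later refinement of 2PC rather than a part of XA, and it is covered briefly under the alternatives; this article focuses on the two‑phase commit process.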

Two‑Phase Commit – Normal Flow

First phase (Prepare): the transaction coordinator sends a Prepare request to all participants. Each participant executes its local updates, writes undo/redo logs, and replies with a “ready” message without committing.

Second phase (Commit): after receiving all “ready” messages, the coordinator sends a Commit request. Participants commit locally, release locks, and acknowledge completion. The transaction is then considered finished.
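
The two phases can be sketched in a few lines. This is a simplified in-memory model, not an XA implementation: the `Participant` class, its `pending` and `locked` fields, and the `two_phase_commit` function are assumptions made for illustration, and real participants would write durable logs instead of keeping state in memory.

```python
class Participant:
    """A resource manager holding one local database (simulated in memory)."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)   # committed state
        self.pending = None      # prepared but not yet committed state
        self.locked = False

    def prepare(self, changes):
        # Phase 1: apply the update to a working copy and hold the lock.
        self.locked = True
        self.pending = {**self.data, **changes}
        return "ready"

    def commit(self):
        # Phase 2: publish the prepared state and release the lock.
        self.data = self.pending
        self.pending = None
        self.locked = False
        return "ack"


def two_phase_commit(work):
    """work is a list of (participant, changes) pairs."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare(changes) for p, changes in work]
    if not all(v == "ready" for v in votes):
        return False
    # Phase 2: everyone is ready, so tell everyone to commit.
    acks = [p.commit() for p, _ in work]
    return all(a == "ack" for a in acks)


inventory = Participant("inventory", {"stock": 10})
orders = Participant("orders", {"order_count": 0})
ok = two_phase_commit([(inventory, {"stock": 9}), (orders, {"order_count": 1})])
print(ok, inventory.data, orders.data)
```

Note that each participant stays locked between `prepare` and `commit`, which is the source of the performance cost discussed later in the article.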

Two‑Phase Commit – Failure Handling

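If any participant reports failure in the first phase, or does not reply before the coordinator's timeout, the coordinator decides to abort the transaction. In the second phase it sends an Abort request instead of Commit, and every participant rolls back its local changes using the undo log and releases its locks.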
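
The abort path can be sketched the same way. In this simplified model the undo log is a list of `(key, old_value)` pairs; the `Participant` class, the `can_prepare` flag that simulates a failed vote, and `run_transaction` are hypothetical names, and timeouts are left out for brevity.

```python
class Participant:
    def __init__(self, name, data, can_prepare=True):
        self.name = name
        self.data = dict(data)
        self.undo_log = []              # (key, old_value) pairs for rollback
        self.can_prepare = can_prepare  # False simulates a failed prepare

    def prepare(self, changes):
        if not self.can_prepare:
            return "fail"
        for key, new_value in changes.items():
            # Record the old value before overwriting it.
            self.undo_log.append((key, self.data[key]))
            self.data[key] = new_value
        return "ready"

    def commit(self):
        # The changes stand; the undo entries are no longer needed.
        self.undo_log.clear()

    def abort(self):
        # Restore old values in reverse order of application.
        while self.undo_log:
            key, old_value = self.undo_log.pop()
            self.data[key] = old_value


def run_transaction(work):
    """work is a list of (participant, changes) pairs."""
    votes = [p.prepare(changes) for p, changes in work]
    if all(v == "ready" for v in votes):
        for p, _ in work:
            p.commit()
        return "committed"
    # At least one participant failed: roll everyone back.
    for p, _ in work:
        p.abort()
    return "aborted"


inventory = Participant("inventory", {"stock": 10})
orders = Participant("orders", {"order_count": 0}, can_prepare=False)
outcome = run_transaction([(inventory, {"stock": 9}), (orders, {"order_count": 1})])
print(outcome, inventory.data, orders.data)
```

The inventory participant had already applied its update during prepare, and the undo log is what lets it return to the original stock level.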

Analogy with World of Warcraft Raid

When a raid leader initiates a “ready check”, each player responds “yes” if prepared or “no” otherwise. Only when all players confirm does the leader start the boss fight. This mirrors the prepare‑and‑commit steps of XA.

Shortcomings of XA Two‑Phase Commit

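Performance overhead: every participant holds its locks and other resources from the prepare phase until the commit phase completes, so the slowest participant limits throughput.

Coordinator single point of failure: if the coordinator crashes after participants have replied “ready”, they cannot decide the outcome on their own and stay blocked, holding locks, until it recovers.

Inconsistency from message loss: if a Commit request reaches only some participants in the second phase, those nodes commit while the others remain prepared, and the data diverges until the outcome is redelivered.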

Alternative Solutions

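Three‑Phase Commit (3PC): splits the protocol into CanCommit, PreCommit, and DoCommit phases and adds participant timeouts, so participants are not blocked indefinitely when the coordinator fails. The extra round trip adds latency, and a network partition can still leave nodes inconsistent.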

Message‑Queue (MQ) Transactions: use asynchronous messaging to achieve eventual consistency, avoiding the heavy locking of XA.

TCC (Try‑Confirm‑Cancel): implements transaction logic in application code with explicit try, confirm, and cancel steps, offering more flexibility.
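
As an illustration of the TCC pattern, here is a minimal sketch applied to the inventory and order example. The class names, method names, and the `accept` flag that simulates a failed Try are all assumptions; a production TCC framework would also need retries, idempotent Confirm and Cancel operations, and durable transaction state.

```python
class InventoryService:
    def __init__(self, stock):
        self.available = stock
        self.reserved = 0

    def try_reserve(self, qty):
        # Try: check and reserve stock without consuming it.
        if self.available < qty:
            return False
        self.available -= qty
        self.reserved += qty
        return True

    def confirm(self, qty):
        # Confirm: consume the reservation.
        self.reserved -= qty

    def cancel(self, qty):
        # Cancel: return the reservation to available stock.
        self.reserved -= qty
        self.available += qty


class OrderService:
    def __init__(self, accept=True):
        self.orders = {}
        self.accept = accept  # False simulates a failed Try

    def try_create(self, order_id):
        if not self.accept:
            return False
        self.orders[order_id] = "PENDING"
        return True

    def confirm(self, order_id):
        self.orders[order_id] = "CONFIRMED"

    def cancel(self, order_id):
        self.orders.pop(order_id, None)


def place_order(inventory, orders, order_id, qty):
    cancels = []  # Cancel actions for every Try that succeeded
    if inventory.try_reserve(qty):
        cancels.append(lambda: inventory.cancel(qty))
        if orders.try_create(order_id):
            # Every Try succeeded: confirm all participants.
            inventory.confirm(qty)
            orders.confirm(order_id)
            return True
    # Some Try failed: cancel the ones that succeeded, newest first.
    for cancel in reversed(cancels):
        cancel()
    return False


inv = InventoryService(stock=10)
good_orders = OrderService()
ok = place_order(inv, good_orders, "o-1", 2)
failed = place_order(inv, OrderService(accept=False), "o-2", 3)
print(ok, failed, inv.available, inv.reserved)
```

Unlike XA, no database lock is held between the steps: the reservation is ordinary business data, which is why the application itself must implement the Cancel logic.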


