Mastering Distributed Transactions: 2PC, TCC, and Async Guarantees Explained

This article clarifies core concepts such as transaction compensation, CAP theorem, idempotency, and BASE, compares rigid ACID‑based transactions with flexible BASE‑based approaches, and provides best‑practice guidance for choosing and implementing 2PC, TCC, and asynchronous assurance patterns in distributed systems.

ITFLY8 Architecture Home

Concept Clarification

Transaction compensation mechanism: every forward operation in a transaction chain must have a matching compensating operation that undoes its effect and complies with the rollback rules, so that a partially completed chain can be reversed step by step.
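The compensation idea can be sketched in a few lines. This is a minimal illustration, not a production design: the helper `run_with_compensation`, the `balance` dictionary, and the debit/credit functions are all hypothetical names introduced here, and the sketch assumes that compensating operations themselves do not fail.

```python
from typing import Callable, List, Tuple

# Each step pairs a forward operation with the operation that undoes it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> bool:
    """Run forward operations in order; on failure, undo the completed ones in reverse."""
    completed: List[Callable[[], None]] = []
    for forward, compensate in steps:
        try:
            forward()
        except Exception:
            # Roll back everything that already succeeded, newest first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Hypothetical two-step chain: debit the customer, then credit the merchant.
balance = {"customer": 100, "merchant": 0}


def debit() -> None:
    balance["customer"] -= 30


def refund() -> None:
    balance["customer"] += 30


def credit() -> None:
    # Simulated failure of the second service.
    raise RuntimeError("merchant service unavailable")


def reverse_credit() -> None:
    balance["merchant"] -= 30


ok = run_with_compensation([(debit, refund), (credit, reverse_credit)])
```

Because the second step fails, the refund runs and the customer's balance returns to its starting value.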

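CAP theorem: of Consistency, Availability, and Partition tolerance, a distributed system can guarantee at most two at the same time. Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts, which is why systems are commonly classified as CP or AP.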

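Idempotency: a business operation can be retried any number of times with the same effect as executing it once. It is often implemented by attaching a unique ID to each message or request and skipping IDs that have already been processed.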
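A minimal sketch of ID-based idempotency follows. The class `PaymentHandler` and its fields are hypothetical, and the processed-ID table is kept in memory only for illustration; a real service would presumably store it durably and update it in the same local transaction as the business data.

```python
class PaymentHandler:
    """Applies each message at most once, keyed by its unique message ID."""

    def __init__(self) -> None:
        self.balance = 0
        self.processed = {}  # message_id -> result of the first execution

    def handle(self, message_id: str, amount: int) -> int:
        # A retry of an already-processed message returns the stored result
        # and does not change the balance again.
        if message_id in self.processed:
            return self.processed[message_id]
        self.balance += amount
        self.processed[message_id] = self.balance
        return self.balance


handler = PaymentHandler()
first = handler.handle("msg-1", 50)
retry = handler.handle("msg-1", 50)  # duplicate delivery of the same message
```

The duplicate call returns the same result as the first one, so a sender can safely retry after a timeout.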

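BASE (Basically Available, Soft state, Eventually consistent): a design approach for distributed transactions that relaxes strong consistency in exchange for availability, accepting intermediate states as long as the data eventually converges.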

Flexible vs. Rigid Transactions

Rigid transactions strictly follow ACID principles, e.g., single‑machine database transactions.

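Flexible transactions follow BASE theory and are typically used in distributed environments. Common implementations are TCC (Try‑Confirm‑Cancel) compensation, message‑based asynchronous assurance, and maximum‑effort notification. Two‑Phase Commit (2PC) is usually discussed alongside them as a distributed transaction protocol, although it aims for strong consistency rather than BASE‑style eventual consistency.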

Usually, local transactions use rigid transactions, while distributed transactions use flexible ones.

Best Practices

The conclusions are given first; the individual distributed transaction implementations are introduced afterwards.

If strong consistency is required, avoid placing related operations across different services; prefer local transactions over strong‑consistent distributed ones.

If eventual consistency is acceptable, use a message‑based eventual‑consistency solution (asynchronous assurance).

If strong consistency is needed and only distributed deployment is possible, prefer TCC over 2PC.

Note: each solution fits different scenarios and should be chosen based on actual business needs.

Two‑Phase Commit (2PC)

Two‑Phase Commit provides strong consistency and is a typical CP system implementation.

Common standards include XA, JTA, etc.; for example, Oracle databases support XA.

The upper half of the diagram shows a successful 2PC, the lower half shows a failure.

Drawbacks:

The coordinator cannot commit or abort until every participant has voted yes or at least one has voted no, and participants keep their resources locked from the moment they vote until the second‑phase decision arrives. This leads to long‑lasting locks on multiple resources and to performance bottlenecks.

Implementation is complex and hinders system scalability; not recommended.
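The protocol behind these drawbacks can be sketched as an in-memory simulation. It is an illustration under simplifying assumptions: `Participant` and `two_phase_commit` are hypothetical names, there is no networking, no timeout handling, and no coordinator recovery, all of which a real XA or JTA implementation has to deal with.

```python
from typing import List


class Participant:
    """A resource manager that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        # Phase 1: a yes vote means the resources stay locked until the decision.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    # Phase 1 (voting): the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2 (completion): the same decision is sent to every participant.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.rollback()
    return decision


success_group = [Participant("orders"), Participant("inventory")]
failure_group = [Participant("orders"), Participant("inventory", can_commit=False)]
committed = two_phase_commit(success_group)
aborted = two_phase_commit(failure_group)
```

In the failing group, the participant that voted yes stays in the prepared (locked) state until the rollback arrives, which is the blocking behavior described above.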

TCC (Try‑Confirm‑Cancel)

TCC is an AP‑system implementation based on compensation transactions, providing eventual consistency.

Example: customer purchasing a product.

Try: perform all business checks (consistency) and reserve necessary resources (pseudo‑isolation), e.g., verify sufficient account balance and lock customer and merchant accounts.

Confirm: use the reserved resources to execute the business operation (must be idempotent); if an exception occurs, retry. In the example, deduct the customer’s account and credit the merchant’s account.

Cancel: release the resources reserved in the Try phase, i.e., unlock the accounts.

If any sub‑business fails during Confirm, the transaction manager must detect the failure, log it, and trigger compensation transactions, retrying as needed.
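The purchase example can be sketched as follows. This is a simplified, single-process illustration: the classes `CustomerAccount` and `MerchantAccount`, the `try_` method name, and the `run_tcc` coordinator are hypothetical, and the sketch only handles failures in the Try phase (it assumes Confirm succeeds, whereas a real transaction manager would log and retry it).

```python
class CustomerAccount:
    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.frozen = {}  # tx_id -> amount reserved in Try

    def try_(self, tx_id: str, amount: int) -> bool:
        if tx_id in self.frozen:      # repeated Try is harmless
            return True
        if self.balance < amount:     # business check
            return False
        self.balance -= amount        # reserve the funds
        self.frozen[tx_id] = amount
        return True

    def confirm(self, tx_id: str) -> None:
        # The reserved funds leave the account for good; a repeat is a no-op.
        self.frozen.pop(tx_id, None)

    def cancel(self, tx_id: str) -> None:
        # Release the reservation; a repeat is a no-op.
        self.balance += self.frozen.pop(tx_id, 0)


class MerchantAccount:
    def __init__(self, balance: int, is_open: bool = True) -> None:
        self.balance = balance
        self.is_open = is_open
        self.pending = {}  # tx_id -> amount expected in Confirm

    def try_(self, tx_id: str, amount: int) -> bool:
        if not self.is_open:
            return False
        self.pending[tx_id] = amount
        return True

    def confirm(self, tx_id: str) -> None:
        self.balance += self.pending.pop(tx_id, 0)

    def cancel(self, tx_id: str) -> None:
        self.pending.pop(tx_id, None)


def run_tcc(tx_id: str, amount: int, participants) -> bool:
    tried = []
    for p in participants:
        if p.try_(tx_id, amount):
            tried.append(p)
        else:
            for t in tried:           # Cancel everything already reserved
                t.cancel(tx_id)
            return False
    for p in participants:            # every Try succeeded: Confirm all
        p.confirm(tx_id)
    return True


customer, merchant = CustomerAccount(100), MerchantAccount(0)
paid = run_tcc("tx-1", 30, [customer, merchant])

customer2, closed_merchant = CustomerAccount(100), MerchantAccount(0, is_open=False)
rejected = run_tcc("tx-2", 30, [customer2, closed_merchant])
```

Keying reservations by transaction ID is what makes Confirm and Cancel safe to retry: a second call finds nothing left to apply.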

Advantages over 2PC:

Resources can be locked, committed, and released independently, allowing faster completion of short‑duration operations without waiting for longer ones.

Each sub‑business handles its own Cancel, enabling a degree of asynchronous parallel execution across services.

Precautions:

The transaction manager (coordinator) must be deployed as a high‑availability cluster (HAC) with synchronous replication semantics.

The manager must use a majority‑based algorithm to avoid split‑brain scenarios.
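The majority rule itself is simple arithmetic, shown here as a sketch; `has_majority` is a hypothetical helper, not part of any particular consensus library. A node acts only when strictly more than half of the cluster agrees, so two disjoint partitions can never both proceed.

```python
def has_majority(acks: int, cluster_size: int) -> bool:
    """True when strictly more than half of the cluster has acknowledged."""
    return acks > cluster_size // 2


# A 5-node cluster split 3/2: only the larger side may keep coordinating.
larger_side = has_majority(3, 5)
smaller_side = has_majority(2, 5)
```

An even split, such as 2 of 4, gives neither side a majority, which is one reason clusters are usually sized with an odd number of nodes.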

Applicable Scenarios:

Strict consistency required

Short execution time

High real‑time demand (e.g., red packets, payment processing)

Asynchronous Assurance Type

Transforms a series of synchronous transaction operations into asynchronous, message‑driven processes, eliminating blocking in distributed transactions.

This decouples services; the key is asynchronous messaging combined with compensation transactions.

Example flow:

MQ producer sends a remote transaction message to the MQ server.

MQ server acknowledges receipt.

Producer executes its local transaction and reports the outcome (commit or rollback) to the MQ server.

If the local commit succeeds, the MQ server allows the transaction message to be consumed; otherwise, the message is discarded.

If the producer does not respond in time, the MQ server proactively queries the producer for the transaction status.

Once the status check confirms that the local transaction committed, the MQ server permits consumers to consume the message.

Consumers must acknowledge (ACK) successful processing; otherwise, the MQ server retries delivery until success.
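The flow above can be sketched with an in-memory stand-in for the MQ server. Everything here is hypothetical: `Broker`, `send_half`, `end_transaction`, `check_pending`, and `deliver` are illustrative names rather than the API of any real message queue, and persistence, timeouts, and networking are left out.

```python
class Broker:
    def __init__(self) -> None:
        self.half = {}   # msg_id -> payload, stored but not yet consumable
        self.ready = []  # (msg_id, payload) pairs visible to consumers

    def send_half(self, msg_id: str, payload: dict) -> bool:
        # The producer sends the transaction message; the broker acknowledges.
        self.half[msg_id] = payload
        return True

    def end_transaction(self, msg_id: str, committed: bool) -> None:
        # Commit makes the message consumable; rollback discards it.
        payload = self.half.pop(msg_id, None)
        if payload is not None and committed:
            self.ready.append((msg_id, payload))

    def check_pending(self, status_lookup) -> None:
        # The broker asks the producer about messages with no reported outcome.
        for msg_id in list(self.half):
            status = status_lookup(msg_id)  # True, False, or None if unknown
            if status is not None:
                self.end_transaction(msg_id, status)

    def deliver(self, consumer) -> None:
        # One delivery pass; messages without an ACK stay queued for the next pass.
        self.ready = [(m, p) for m, p in self.ready if not consumer(m, p)]


broker = Broker()
local_db = {}

# Normal path: half message, local commit, outcome reported.
broker.send_half("order-1", {"amount": 30})
local_db["order-1"] = "committed"
broker.end_transaction("order-1", committed=True)

# Producer rolls back and never reports; the broker has to ask.
broker.send_half("order-2", {"amount": 5})
local_db["order-2"] = "rolled_back"


def lookup(msg_id: str):
    state = local_db.get(msg_id)
    return None if state is None else state == "committed"


broker.check_pending(lookup)

attempts = []


def flaky_consumer(msg_id: str, payload: dict) -> bool:
    attempts.append(msg_id)
    return len(attempts) > 1  # fails the first time, ACKs the second


broker.deliver(flaky_consumer)
broker.deliver(flaky_consumer)
```

Since the consumer may see the same message more than once, its processing has to be idempotent.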

Precautions:

The message middleware must also be deployed as a high‑availability cluster (HAC) so that no transaction messages are lost.

Depending on business logic, additional requirements such as message deduplication and ordering may be needed.

Applicable Scenarios:

Long execution cycles

Low real‑time requirements

Examples: cross‑bank transfers, refunds, financial/statistical batch processing.

Maximum‑Effort Notification Type

This is the least demanding distributed transaction model, and it can also be implemented with message middleware. Unlike the asynchronous assurance type, the MQ server does not retry indefinitely: once delivery to the consumer has been attempted the maximum number of times, the transaction is allowed to end even if the consumer never acknowledged the message.
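A bounded retry loop captures the difference from the asynchronous assurance type. This is a sketch under assumptions: `notify_with_max_retries` and the `send` callback are hypothetical, and the waiting period between attempts that a real notifier would use is omitted.

```python
from typing import Callable, Optional


def notify_with_max_retries(
    send: Callable[[dict], bool], payload: dict, max_attempts: int = 3
) -> Optional[int]:
    """Try to deliver a notification; give up after max_attempts.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if send(payload):
                return attempt
        except Exception:
            pass  # treat an exception like a failed delivery and try again
    return None


calls = []


def unreachable_receiver(payload: dict) -> bool:
    calls.append(payload)
    return False  # the receiver never acknowledges


result = notify_with_max_retries(unreachable_receiver, {"order": "o-1", "status": "paid"})
```

When the loop gives up, the sender simply stops; the receiver is then expected to fetch the result itself if it still needs it, for example through a query interface.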

Applicable Scenarios:

Transaction result notifications.

Conclusion

Whether the design relies on a synchronous transaction manager (coordinator) or on asynchronous message middleware, its consistency guarantees depend on that component being highly available and reliable, which means deploying it as a high‑availability cluster (HAC) with synchronous replication. This inevitably costs performance and makes the component a typical bottleneck in SOA systems.

Source: http://www.uml.org.cn/wfw/201710242.asp
