Databases 23 min read

Mastering Distributed Transactions: From ACID to TCC and Sharding Strategies

This article provides a comprehensive technical guide to distributed transactions, covering ACID fundamentals, consistency models, sharding techniques, CAP and BASE theories, and detailed implementations of two‑phase, three‑phase, and TCC protocols, while highlighting their advantages, limitations, and practical considerations.

IT Architects Alliance

Mar 7, 2021

Mastering Distributed Transactions: From ACID to TCC and Sharding Strategies

What Is a Distributed Transaction?

In everyday life a transaction either completes fully or not at all; similarly, a distributed transaction must guarantee atomic execution across multiple databases despite network unreliability, latency, or failures such as power loss.

ACID Properties

Atomicity : A transaction cannot be split; all its steps succeed together or none at all.

Consistency : The total amount of money before and after a transfer remains unchanged, ensuring no intermediate state violates business rules.

Isolation : Concurrent transactions do not interfere; for example, two independent transfers involving different accounts run without affecting each other.

Durability : Once committed, changes are persisted to disk and survive crashes.

Consistency Models

Beyond strict ACID consistency, systems adopt:

Strong consistency : Every read sees the latest write.

Weak consistency : Reads may return stale data; updates are not immediately propagated.

Eventual consistency : Data converges to a consistent state over time.

Choosing the appropriate model depends on business needs—for payments strong consistency is required, while product catalog updates may tolerate weaker guarantees.

Sharding (分库分表)

Sharding splits data across multiple databases to alleviate single‑node bottlenecks. Two main approaches:

Vertical sharding : Separate tables with low coupling into different databases (e.g., user profile vs. order history).

Horizontal sharding : Partition rows by a key range or hash (e.g., userId 1‑10000 in DB1, 10001‑20000 in DB2).

Vertical sharding improves modularity and reduces I/O contention but may require cross‑service joins, increasing complexity. Horizontal sharding keeps table size manageable and enables linear scaling, yet hot‑spot rows can become performance bottlenecks.

Transaction Challenges After Sharding

When a transaction spans multiple shards, it becomes a distributed transaction. Coordination across nodes introduces latency, potential partial failures, and the risk of deadlocks, making strong consistency harder to achieve.

CAP and BASE Theories

The CAP theorem states that in the presence of a network partition, a system must choose between consistency and availability. Modern large‑scale systems often favor availability and settle for BASE (Basically Available, Soft state, Eventually consistent) rather than strict ACID.

Two‑Phase Commit (2PC)

2PC ensures atomicity across nodes via a coordinator and participants:

Coordinator sends a prepare (vote) request to all participants.

Each participant executes the transaction locally, logs it, and replies with success or failure.

In the second phase:

If all participants voted success, the coordinator sends a commit command.

If any participant failed or timed out, the coordinator sends a rollback command.

Images illustrate the state transitions and message flow.

Drawbacks of 2PC include a single point of failure at the coordinator, blocking behavior while waiting for votes, and possible data inconsistency if commit messages are lost.

Timeout and Mutual‑Inquiry Mechanisms

To mitigate blocking, coordinators can abort after a timeout, and participants can query each other (mutual‑inquiry) to infer the global decision when the coordinator is unreachable.

Three‑Phase Commit (3PC)

3PC adds a pre‑commit (can‑commit) phase and timeout handling to reduce blocking:

Pre‑inquiry : Coordinator asks participants if they can commit.

Pre‑commit : If all agree, participants lock resources but do not commit.

Commit : Coordinator issues the final commit.

If any participant rejects or a timeout occurs, the coordinator aborts, avoiding the indefinite wait seen in 2PC. However, 3PC incurs higher latency and is less widely adopted.

TCC (Try‑Confirm‑Cancel) Pattern

TCC splits a transaction into three explicit steps:

Try : Reserve resources and perform business checks without committing.

Confirm : Finalize the reservation, making changes permanent.

Cancel : Release reserved resources if the transaction cannot be completed.

The workflow mirrors 2PC but operates at the service layer, with each step being a local transaction that can be committed independently.

Images show the TCC flow and its similarity to 2PC.

Conclusion

Distributed transaction management balances consistency, availability, and performance. Selecting the right consistency model, sharding strategy, and commit protocol (2PC, 3PC, or TCC) depends on workload characteristics, latency tolerance, and failure scenarios.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

BASE theory CAP theorem sharding tcc Distributed Transactions two-phase commit database-consistency three-phase commit

Written by

IT Architects Alliance

Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.