Databases 22 min read

Understanding Distributed Transactions, Consistency Models, and Sharding in Database Systems

This article explains the fundamentals of distributed transactions, the ACID properties, various consistency models (strong, weak, eventual), sharding strategies (vertical and horizontal), the CAP and BASE theories, and the practical implementations of two‑phase, three‑phase, and TCC commit protocols, highlighting their advantages and drawbacks.

Top Architect
Top Architect
Top Architect
Understanding Distributed Transactions, Consistency Models, and Sharding in Database Systems

What Is a Distributed Transaction

A distributed transaction spans multiple databases or services and must obey the ACID properties—Atomicity, Consistency, Isolation, and Durability—while handling network unreliability and partial failures.

ACID Explained

Atomicity : a transaction cannot be split; it either fully succeeds or fully fails.

Consistency : the system moves from one valid state to another, preserving invariants.

Isolation : concurrent transactions do not interfere with each other.

Durability : once committed, the changes survive crashes.

Consistency Models

Three levels are discussed:

Strong consistency : every read sees the latest write.

Weak consistency : reads may return stale data, acceptable when the application can tolerate temporary divergence.

Eventual consistency : data converges to a consistent state over time.

Sharding (分库分表)

Sharding splits data across multiple databases to overcome single‑node bottlenecks.

Vertical Sharding

Tables with low coupling are placed in separate databases, similar to micro‑service boundaries.

Horizontal Sharding

Rows are partitioned by a key range or hash, allowing each shard to scale independently.

CAP and BASE Theories

The CAP theorem states that a distributed system can only guarantee two of the three properties: Consistency, Availability, and Partition tolerance. BASE (Basically Available, Soft state, Eventually consistent) relaxes strict consistency to achieve higher availability.

Two‑Phase Commit (2PC)

2PC uses a coordinator and participants: the first phase votes, the second phase commits if all votes are positive, otherwise rolls back. Drawbacks include a single point of failure and blocking.

Three‑Phase Commit (3PC)

3PC adds a pre‑commit (can_commit) phase and timeout handling to reduce blocking, but it is more complex and still suffers from performance penalties.

TCC (Try‑Confirm‑Cancel)

TCC moves the transaction logic to the service layer: Try reserves resources, Confirm finalizes them, and Cancel releases them. It provides idempotent operations and better fault isolation.

Practical Considerations

Each protocol has trade‑offs between consistency, latency, and fault tolerance. Choosing the right approach depends on business requirements such as strong consistency for financial transfers versus eventual consistency for high‑throughput, low‑criticality data.

CAP theoremSharding2PCtccBASEdistributed transactionsdatabase consistency3PC
Top Architect
Written by

Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.