Ensuring Distributed Eventual Consistency: Heavyweight and Lightweight Approaches, Principles, and Practices
This article explains the challenges of distributed eventual consistency, compares heavyweight transaction frameworks with lightweight techniques such as idempotency, retries, state machines, recovery logs, and asynchronous verification, and outlines the CAP theorem, the BASE principle, and practical implementation steps for backend systems.
Introduction
Distributed consistency issues appear everywhere; even a single machine can be seen as a tiny distributed system that relies on the operating system and database to guarantee consistency. In large-scale microservice architectures, single-database ACID transactions are insufficient, and cross-service calls must ensure distributed eventual consistency on their own.
Current mainstream distributed‑transaction frameworks fall into three categories:
Two‑phase commit (2PC) based on the XA protocol
TCC (Try‑Confirm‑Cancel) originally proposed by Alipay
Message‑queue asynchronous assurance originally proposed by eBay
In addition, lighter solutions such as idempotency/retry, state machines, recovery logs, and asynchronous verification can be chosen according to business needs.
Heavy Weapons
Using a distributed-transaction framework makes eventual consistency transparent to developers; the framework handles the complexity as a cross-cutting concern. However, 2PC/TCC/message-queue solutions have drawbacks: protocol fragility, high latency caused by multiple round-trips, and the need for asynchronous verification to cover edge cases. Nevertheless, they greatly reduce development effort and are well suited to domains like banking that demand strict consistency and high code quality.
Light Weapons
Programmers can also achieve eventual consistency through idempotency/retry, state machines, recovery logs, and asynchronous verification. These approaches are platform-agnostic and lightweight, but they require developers to implement the consistency logic themselves, which can introduce bugs.
The major distributed‑transaction frameworks themselves rely on these basic mechanisms, so the article focuses on the latter set of techniques.
Principles
1. CAP Theorem
The three properties are Consistency, Availability, and Partition tolerance; a distributed system can guarantee at most two of them simultaneously.
Consistency: every read receives the latest write or an error
Availability: every request receives a non‑error response
Partition tolerance: the system continues operating despite lost or delayed messages
Because network partitions cannot be avoided in any system that spans multiple machines, partition tolerance is effectively mandatory, and most systems choose between Consistency and Availability.
2. BASE Principle
Basically Available
Soft state
Eventual consistency
BASE relaxes strong consistency in favor of AP, allowing applications to achieve eventual consistency through appropriate techniques; NoSQL databases like Cassandra follow BASE, while some systems (e.g., HBase) opt for CP.
Practices
1. Retry
Retry mechanisms automatically recover inconsistent data, provided the retried operation is idempotent. Two styles are common:
Sync retry: a failed request is re‑issued synchronously, which can waste threads and amplify traffic.
Async retry: failures are re‑processed via a message queue or scheduler, often with exponential back‑off and alerting after a threshold.
Retry also improves availability; for example, a service with 98% availability reaches 99.96% after one retry and 99.9992% after two retries.
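The async-retry pattern and the availability arithmetic above can be sketched in Python. This is a minimal, illustrative sketch, not a production implementation: the function names (`retry_with_backoff`, `availability_after_retries`) and the raise-after-threshold alerting are assumptions of this example.

```python
import time

def retry_with_backoff(operation, max_attempts=3, base_delay=0.1):
    """Retry an idempotent operation with exponential back-off.

    After the attempt threshold is exceeded, the failure is re-raised
    so an outer layer can alert or park the item for manual handling.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # threshold reached: surface for alerting
            time.sleep(base_delay * (2 ** attempt))

def availability_after_retries(p, retries):
    """Availability math from the text: the call fails overall only
    if every attempt fails, i.e. with probability (1 - p)^(retries+1)."""
    return 1 - (1 - p) ** (retries + 1)
```

With `p = 0.98`, one retry gives 1 - 0.02² = 99.96% and two retries give 1 - 0.02³ = 99.9992%, matching the figures above.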
2. Idempotency
Mathematically, idempotency means:
f(f(x)) = f(x)
In practice, executing the same operation multiple times yields the same effect as executing it once. Techniques include:
Pessimistic lock via SELECT ... FOR UPDATE (must run inside an explicit transaction, i.e., with AUTOCOMMIT=0; not suitable for high concurrency).
Optimistic lock with version numbers, e.g., UPDATE stocktable SET stock = stock - 1, version = version + 1 WHERE product_id = 123 AND version = 1.
Unique‑index deduplication tables.
Global token checks using Redis atomic increments or MySQL unique indexes.
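The global-token technique above can be sketched as follows. This is a hedged, in-memory illustration: the set `processed_tokens` stands in for a Redis SETNX call or a MySQL unique index, and the names `process_once`/`processed_tokens` are inventions of this example.

```python
# In-memory stand-in for Redis SETNX or a MySQL unique index.
processed_tokens = set()

def process_once(token, operation):
    """Execute `operation` at most once per token.

    A repeated call with the same token is detected and skipped,
    so retries cannot apply the side effect twice.
    """
    if token in processed_tokens:
        return "duplicate"          # already applied: skip side effects
    processed_tokens.add(token)     # in Redis: SET token 1 NX
    operation()
    return "applied"
```

In a real system the dedup record and the business write must be made atomic (same transaction, or a conditional write), otherwise a crash between the two re-opens the duplication window.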
3. State Machine
A state machine models entity state transitions. Example: the lifecycle of a WeChat red‑packet.
INIT – creation after user submits amount and count.
SPLITTED – system splits amount into n parts.
PAID – asynchronous payment success.
NOTIFIED – notification sent to group.
RUNOUT – all packets claimed.
REFUND – timeout refund if not claimed.
Design state machines with minimal branching to simplify rollback and correction.
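The red-packet lifecycle above can be sketched as a transition table plus a guard. The transition table is an assumption reconstructed from the lifecycle description (e.g., that RUNOUT and REFUND both follow NOTIFIED); a real design would derive it from the business rules.

```python
# Assumed transition table for the red-packet lifecycle.
TRANSITIONS = {
    "INIT": {"SPLITTED"},
    "SPLITTED": {"PAID"},
    "PAID": {"NOTIFIED"},
    "NOTIFIED": {"RUNOUT", "REFUND"},
}

class RedPacket:
    def __init__(self):
        self.state = "INIT"

    def transition(self, target):
        """Advance the state, rejecting any transition not in the table."""
        if target not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.state = target
```

Rejecting illegal transitions at a single choke point is what keeps rollback and correction simple: an inconsistent entity can only ever be in one of a few well-defined states.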
4. Recovery Log
Recovery logs record the full lifecycle of a request using a globally unique requestId:
REQUEST START requestId
For each modification, record (requestId, x, originalValue, destValue).
REQUEST END requestId
These logs enable redo/undo operations to bring the system back to a consistent state after failures.
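A minimal sketch of the log format and an undo pass, assuming an in-memory log and key-value store (the class and function names here are illustrative, not from the original):

```python
class RecoveryLog:
    """Records the lifecycle of a request: START, modifications, END."""
    def __init__(self):
        self.entries = []

    def start(self, request_id):
        self.entries.append(("START", request_id))

    def record(self, request_id, key, original, dest):
        # One entry per modification: (requestId, x, originalValue, destValue)
        self.entries.append(("MOD", request_id, key, original, dest))

    def end(self, request_id):
        self.entries.append(("END", request_id))

def undo(log, request_id, store):
    """Roll back every modification of an unfinished request, newest first,
    restoring each key to its recorded original value."""
    for entry in reversed(log.entries):
        if entry[0] == "MOD" and entry[1] == request_id:
            _, _, key, original, _ = entry
            store[key] = original
```

A request whose START has no matching END after a crash is a candidate for undo (or redo, if the destination values are replayed instead).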
5. Asynchronous Verification
Outside of consensus protocols such as Paxos, most business scenarios rely on external constraints plus asynchronous verification to detect anomalies and trigger recovery via scripts or manual correction. Verification can be:
Entity‑level: send business IDs to a verification service through a message queue.
Time‑window batch: schedule periodic checks based on timestamps.
If a reliable message queue or scheduler is unavailable, a local verification flag can be used, with the verification system updating the flag and retrying failed items.
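A time-window batch check can be sketched as below. This is a hypothetical example: the record shapes, the `verify_window` name, and the idea of comparing order amounts against payment records are assumptions chosen for illustration.

```python
def verify_window(orders, payments, since):
    """Return IDs of orders modified within the window whose amount
    disagrees with the corresponding payment record."""
    anomalies = []
    for oid, order in orders.items():
        if order["updated_at"] < since:
            continue  # outside this verification window
        paid = payments.get(oid, {}).get("amount")
        if paid != order["amount"]:
            anomalies.append(oid)  # trigger recovery script or alert
    return anomalies
```

A scheduler would call this periodically with `since` set to the start of the previous window, feeding the returned IDs into a recovery or alerting pipeline.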
Conclusion
Distributed eventual consistency is a frequent challenge for backend developers. In practice, a combination of idempotency, retry, asynchronous verification, state machines, and recovery logs is often required to achieve reliable consistency across microservices.
Hujiang Technology
We focus on the real-world challenges developers face, delivering authentic, practical content and a direct platform for technical networking among developers.