Data Consistency in Microservices: Transaction Management Patterns and Practices

The article reviews microservice data consistency challenges, explains why traditional distributed transactions like 2PC/3PC are unsuitable, introduces the BASE theory, and details four implementation patterns—reliable event notification, maximum effort notification, business compensation, and TCC—to achieve eventual consistency.

Java Architect Essentials

Oct 13, 2020

Recently I studied the characteristics of data consistency in microservices and summarized several current approaches for ensuring consistency, providing a high‑level overview without deep implementation details.

1. Transaction Management in Traditional Applications

1.1 Local Transactions

Traditional monolithic applications use a single RDBMS as the data source. The application starts a transaction, performs CRUD operations, and commits or rolls back, all within a local transaction managed directly by the resource manager (RM). Data consistency is guaranteed within this transaction.

1.2 Distributed Transactions

1.2.1 Two‑Phase Commit (2PC)

When an application expands to use multiple data sources, local transactions can no longer guarantee consistency. Distributed transactions, coordinated by a transaction manager (TM), become necessary. The most common protocol is Two‑Phase Commit (2PC), which consists of a prepare phase and a commit phase.

[Figures: 2PC commit flow; 2PC rollback flow]

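2PC cannot fully guarantee consistency and suffers from synchronous blocking: participants hold their locks while they wait for the coordinator's decision, and a coordinator failure between the two phases can leave them blocked or in diverging states. These weaknesses led to the invention of Three‑Phase Commit (3PC).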
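The prepare and commit phases can be sketched in memory as follows. This is a minimal illustration, not an XA implementation: `Participant`, `RecordingParticipant`, and `TwoPhaseCoordinator` are hypothetical names, and coordinator crashes, timeouts, and locking are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant contract; real resource managers expose this through XA.
interface Participant {
    boolean prepare();   // phase 1: vote yes (true) or no (false)
    void commit();       // phase 2 when every vote was yes
    void rollback();     // phase 2 when any vote was no
}

// Toy participant that only records which state it ended up in.
class RecordingParticipant implements Participant {
    private final boolean vote;
    String state = "INIT";

    RecordingParticipant(boolean vote) { this.vote = vote; }

    public boolean prepare() { state = vote ? "PREPARED" : "REFUSED"; return vote; }
    public void commit()     { state = "COMMITTED"; }
    public void rollback()   { state = "ROLLED_BACK"; }
}

class TwoPhaseCoordinator {
    // Returns true if the global transaction committed.
    static boolean run(List<? extends Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        for (Participant p : participants) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // A single "no" vote aborts: roll back everyone who already prepared.
                for (Participant q : prepared) q.rollback();
                return false;
            }
        }
        // Every participant voted yes, so phase 2 commits them all.
        for (Participant p : participants) p.commit();
        return true;
    }
}
```

Between `prepare()` and `commit()` a real participant keeps its resources locked, which is where the synchronous blocking described above comes from.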

1.2.2 Three‑Phase Commit (3PC)

3PC improves on 2PC but still only guarantees consistency in most cases. Detailed discussions of 2PC/3PC are omitted as they are not the focus of this article.

2. Transaction Management in Microservices

Distributed transactions like 2PC or 3PC are unsuitable for microservices for three main reasons:

Microservices communicate via RPC or HTTP APIs, preventing a single TM from managing all resource managers.

Different services may use heterogeneous data stores, some of which (e.g., NoSQL) do not support transactions.

Even if all stores support transactions, a global transaction would span many services and last orders of magnitude longer than a local transaction, causing extensive locking and performance degradation.

Therefore, traditional distributed transactions cannot meet microservice needs, and the BASE theory becomes the guiding principle.

BASE, proposed by eBay architect Dan Pritchett, extends CAP and stands for Basically Available, Soft state, and Eventual consistency.

Basically Available: The system tolerates partial loss of availability during failures, ensuring core services remain operational.

Soft state: The system may hold intermediate states that do not affect overall availability; replication delays exemplify this.

Eventual consistency: All replicas converge to a consistent state after some time, representing a special case of weak consistency.

Eventual consistency is the fundamental requirement for microservice transaction management. Four major patterns can achieve it, divided into notification‑based and compensation‑based approaches.

3. Methods to Achieve Data Consistency in Microservices

3.1 Reliable Event Notification Pattern

3.1.1 Synchronous Events

The simplest form sends a message to downstream services synchronously after the primary service completes its business logic. The following code illustrates the flow:

public void trans() {
    try {
        // 1. Update the database inside the local transaction
        boolean result = dao.update(data); // throws on failure
        // 2. Send the message only if the database update succeeded
        if (result) {
            mq.send(data); // throws on failure
        }
    } catch (Exception e) {
        rollback(); // roll back the database change on any exception
    }
}

While seemingly flawless, synchronous notification has two drawbacks:

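If the network or the message server fails after the message has been delivered but before the acknowledgment returns, the primary service assumes the send failed and rolls back, while the downstream service has already consumed the message. The two services are then inconsistent.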

The message service becomes tightly coupled with business logic; if the message service is unavailable, the entire business flow is blocked.

3.1.2 Asynchronous Events

3.1.2.1 Local Event Service

To address the issues of synchronous events, a local event service records events in a local table within the same transaction. If sending succeeds, the event is removed; otherwise, a background service retries until successful.

Although this improves reliability, it still introduces coupling and additional DB load.
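A minimal in-memory sketch of the local event table idea follows. `LocalEventStore`, its fields, and the `sender` predicate are hypothetical; a `synchronized` method stands in for the single local database transaction, and a real implementation would use an actual table and a scheduled relay job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Stands in for one database that holds both business rows and the event table.
class LocalEventStore {
    final List<String> orders = new ArrayList<>();
    final List<String> pendingEvents = new ArrayList<>();

    // Stand-in for one local transaction: the business row and the event are written together.
    synchronized void placeOrder(String orderId) {
        orders.add(orderId);
        pendingEvents.add("OrderPlaced:" + orderId);
    }

    // One pass of the background relay: an event is deleted only when the sender
    // reports success (returns true). Returns how many events still await a retry.
    synchronized int relay(Predicate<String> sender) {
        pendingEvents.removeIf(sender);
        return pendingEvents.size();
    }
}
```

Because an event is removed only after a successful send, a crash between sending and deleting can deliver the same event twice, so consumers need to be idempotent.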

3.1.2.2 External Event Service

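Externalizing the event service removes the coupling entirely. Before committing, the business service registers the event with the event service, which only records it and does not send it yet. After the commit or rollback, the business service notifies the event service, which then sends or discards the event. Because that notification can itself be lost, the event service periodically scans for events that were never confirmed and queries the business service for the outcome of the corresponding transaction.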

This approach adds extra network hops and requires the business service to expose a query interface.
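The record, confirm, and query-back steps might look like the sketch below. All names (`ExternalEventService`, `TxStatus`, `recover`) are hypothetical, the `sent` list stands in for a message broker, and the `statusQuery` function stands in for the query interface the business service would expose.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ExternalEventService {
    enum TxStatus { COMMITTED, ROLLED_BACK, UNKNOWN }

    private final Map<String, String> prepared = new LinkedHashMap<>(); // eventId -> payload
    final List<String> sent = new ArrayList<>();                        // stand-in for the broker

    // Step 1: the business service registers the event before it commits.
    void prepare(String eventId, String payload) { prepared.put(eventId, payload); }

    // Step 2a: the business service reports a commit, so the event is sent.
    void confirm(String eventId) {
        String payload = prepared.remove(eventId);
        if (payload != null) sent.add(payload);
    }

    // Step 2b: the business service reports a rollback, so the event is discarded.
    void cancel(String eventId) { prepared.remove(eventId); }

    // Periodic check: ask the business service about events that were never resolved.
    void recover(Function<String, TxStatus> statusQuery) {
        for (String eventId : new ArrayList<>(prepared.keySet())) {
            TxStatus status = statusQuery.apply(eventId);
            if (status == TxStatus.COMMITTED) confirm(eventId);
            else if (status == TxStatus.ROLLED_BACK) cancel(eventId);
            // UNKNOWN: leave the event in place for the next pass
        }
    }

    int pendingCount() { return prepared.size(); }
}
```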

3.1.2.3 Notes on Reliable Event Notification

Two key concerns are correct delivery and idempotent consumption. Idempotency can be ensured by using unique event IDs and persisting processing results, or by discarding stale events based on timestamps or global sequence numbers.
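Both idempotency techniques can be sketched in a few lines. `IdempotentConsumer` and its members are hypothetical, and the in-memory set and version counter stand in for state that would be persisted in practice.

```java
import java.util.HashSet;
import java.util.Set;

class IdempotentConsumer {
    private final Set<String> processedIds = new HashSet<>(); // a durable table in practice
    private long lastVersion = -1;
    int balance = 0;
    String status = "NEW";

    // Technique 1: remember event IDs, so a redelivered event is applied only once.
    synchronized boolean handle(String eventId, int amount) {
        if (!processedIds.add(eventId)) {
            return false; // duplicate delivery, ignore it
        }
        balance += amount;
        return true;
    }

    // Technique 2: discard events older than the latest one already applied.
    synchronized boolean applyStatus(long version, String newStatus) {
        if (version <= lastVersion) {
            return false; // stale or repeated event, ignore it
        }
        lastVersion = version;
        status = newStatus;
        return true;
    }
}
```

The first technique suits events that change state incrementally (such as adding to a balance); the second suits events that carry a full state snapshot, where only the newest one matters.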

3.2 Maximum Effort Notification Pattern

Here the business service commits its transaction and then tries to send the message a limited number of times. If every attempt fails, the message is treated as lost, so the business service must expose a query interface through which the downstream service can recover the missing result. This pattern (often called best-effort notification) suits low‑criticality notifications (e.g., third‑party alerts) but not strict consistency requirements.
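The bounded-retry behavior might be sketched as follows. `BestEffortNotifier` and its methods are hypothetical, the `sender` predicate stands in for the real transport, and the sketch assumes the recovery lookup lives on the sending side, where the committed result is stored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

class BestEffortNotifier {
    private final Map<String, String> results = new HashMap<>(); // committed business results

    // Called after the local commit: try a bounded number of times, then give up.
    boolean commitAndNotify(String orderId, String result, int maxAttempts,
                            Predicate<String> sender) {
        results.put(orderId, result);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (sender.test(orderId + "=" + result)) {
                return true; // delivered, stop retrying
            }
        }
        return false; // notification lost, no further retries
    }

    // Query interface for recovering a result whose notification never arrived.
    String query(String orderId) {
        return results.get(orderId);
    }
}
```

A production version would usually space the attempts out with increasing delays rather than retrying in a tight loop.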

3.3 Business Compensation Pattern

In this pure compensation model, upstream services commit normally; if a downstream service then fails, every upstream service that has already committed executes a compensating action (e.g., canceling a previously booked train ticket). Compensation is typically partial: the original record is not erased but remains, marked with a “canceled” flag.
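One way to sketch compensation is to remember an undo action for every step that has committed and to run those actions in reverse order when a later step fails. `CompensatingFlow` and the step names are hypothetical; the `log` list stands in for the business records, which keep the original entry alongside the cancellation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

class CompensatingFlow {
    private final Deque<Runnable> compensations = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    // Runs one step. On success, remembers how to undo it; on failure,
    // compensates every previously completed step, most recent first.
    boolean step(String name, BooleanSupplier action) {
        if (action.getAsBoolean()) {
            log.add(name + ":done");
            compensations.push(() -> log.add(name + ":canceled"));
            return true;
        }
        log.add(name + ":failed");
        while (!compensations.isEmpty()) {
            compensations.pop().run();
        }
        return false;
    }
}
```

For steps `train` and `hotel` that succeed followed by a `flight` step that fails, the log ends up as `train:done, hotel:done, flight:failed, hotel:canceled, train:canceled`: the original bookings stay on record and are marked canceled rather than deleted.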

3.4 TCC (Try‑Confirm‑Cancel) Pattern

TCC is an optimized compensation approach that achieves full compensation. The workflow consists of:

Try: each service performs checks and reserves required resources.

If all Try phases succeed, Confirm executes the actual business logic using the reserved resources.

If any Try fails, Cancel releases the reserved resources.

Example: transferring 100 CNY from Bank A to Bank B.

Service A (debit):

try: update cmb_account set balance=balance-100, freeze=freeze+100 where acc_id=1 and balance>=100;
confirm: update cmb_account set freeze=freeze-100 where acc_id=1;
cancel: update cmb_account set balance=balance+100, freeze=freeze-100 where acc_id=1;

Service B (credit):

try: update cgb_account set freeze=freeze+100 where acc_id=1;
confirm: update cgb_account set balance=balance+100, freeze=freeze-100 where acc_id=1;
cancel: update cgb_account set freeze=freeze-100 where acc_id=1;

The TCC flow ensures that either both accounts are updated or none, without leaving residual state.
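The same transfer can be mirrored in memory to show how the three phases fit together. `TccAccount`, `TccTransfer`, and the `open` flag (added only so that a credit-side Try can fail) are hypothetical. The sketch lets the Try step succeed when the balance exactly equals the amount, and it leaves out what a real coordinator must handle: retries, timeouts, and idempotent Confirm/Cancel calls.

```java
class TccAccount {
    long balance;
    long frozen;
    boolean open = true; // hypothetical flag so that a credit Try can be refused

    TccAccount(long balance) { this.balance = balance; }

    // Debit side (Service A): move money from balance into the frozen amount.
    boolean tryDebit(long amt) {
        if (balance < amt) return false;
        balance -= amt;
        frozen += amt;
        return true;
    }
    void confirmDebit(long amt) { frozen -= amt; }
    void cancelDebit(long amt)  { balance += amt; frozen -= amt; }

    // Credit side (Service B): reserve the incoming amount without exposing it yet.
    boolean tryCredit(long amt) {
        if (!open) return false;
        frozen += amt;
        return true;
    }
    void confirmCredit(long amt) { balance += amt; frozen -= amt; }
    void cancelCredit(long amt)  { frozen -= amt; }
}

class TccTransfer {
    // Returns true if the transfer was confirmed on both sides.
    static boolean transfer(TccAccount from, TccAccount to, long amt) {
        boolean debitTried = from.tryDebit(amt);
        boolean creditTried = debitTried && to.tryCredit(amt);
        if (creditTried) {
            from.confirmDebit(amt);
            to.confirmCredit(amt);
            return true;
        }
        // Cancel only what was actually reserved.
        if (debitTried) from.cancelDebit(amt);
        return false;
    }
}
```

After a failed transfer, both `balance` and `frozen` on the debit account return to their previous values, which is the "full compensation" property: no canceled record is left behind.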

3.5 Summary

The following table compares the four common patterns (reliable event notification, maximum effort notification, business compensation, and TCC) in terms of reliability, complexity, and consistency guarantees.

For further reading, see the original article at https://www.jianshu.com/p/b264a196b177 .

