Data Consistency in Microservices: Transaction Management and Implementation Patterns
This article introduces the limitations of traditional local and distributed transactions for microservices, explains the BASE theory, and details four practical patterns—reliable event notification, maximum‑effort notification, business compensation, and TCC—providing code examples, diagrams, and a comparative table to guide developers in achieving eventual consistency across microservice architectures.
1. Traditional Application Transaction Management
1.1 Local Transaction
Before discussing data consistency in microservices, let's briefly review transaction basics. In a monolithic application, a single RDBMS provides local transactions: CRUD operations commit or roll back together within the same resource manager, guaranteeing data consistency.
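As a concrete illustration, here is a minimal in-memory sketch of local-transaction semantics (the `LocalTx` class and its staged/committed maps are illustrative stand-ins for a real RDBMS, not a real API): changes are staged during the transaction and become visible only on commit, while rollback discards them.

```java
import java.util.HashMap;
import java.util.Map;

// Toy resource manager illustrating local-transaction semantics:
// updates are staged and only become visible on commit; rollback discards them.
class LocalTx {
    private final Map<String, Integer> committed = new HashMap<>();
    private final Map<String, Integer> staged = new HashMap<>();

    LocalTx(Map<String, Integer> initial) { committed.putAll(initial); }

    // Stage a balance change without making it visible yet.
    void update(String key, int delta) {
        int current = staged.getOrDefault(key, committed.getOrDefault(key, 0));
        staged.put(key, current + delta);
    }

    void commit()   { committed.putAll(staged); staged.clear(); } // all changes land together
    void rollback() { staged.clear(); }                           // or none of them do

    int read(String key) { return committed.getOrDefault(key, 0); }
}
```

A transfer staged as two `update` calls thus either lands atomically on `commit` or vanishes entirely on `rollback`, which is exactly the guarantee a single RDBMS gives within one local transaction.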
1.2 Distributed Transaction
1.2.1 Two‑Phase Commit (2PC)
When an application accesses multiple data sources, local transactions are insufficient and distributed transactions become necessary. The most common implementation is the two‑phase commit (2PC), coordinated by a transaction manager (TM) that first prepares all resources and then commits them.
2PC consists of a prepare phase and a commit phase.
[Figure: 2PC commit flow]
[Figure: 2PC rollback flow]
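The two phases can be sketched as a simple coordinator loop (the `Participant` interface and `TwoPhaseCommit` class below are hypothetical, not taken from any concrete transaction manager): the coordinator commits globally only if every participant votes yes in the prepare phase, and otherwise rolls everyone back.

```java
import java.util.List;

// Hypothetical participant in a two-phase commit.
interface Participant {
    boolean prepare(); // phase 1: persist redo/undo info, then vote yes/no
    void commit();     // phase 2a: finalize
    void rollback();   // phase 2b: undo (must tolerate being called before prepare)
}

class TwoPhaseCommit {
    // Returns true if the global transaction committed.
    static boolean run(List<Participant> participants) {
        // Phase 1: ask every participant to prepare.
        for (Participant p : participants) {
            if (!p.prepare()) {
                // A single "no" vote aborts the global transaction.
                participants.forEach(Participant::rollback);
                return false;
            }
        }
        // Phase 2: all voted yes, so commit everywhere.
        participants.forEach(Participant::commit);
        return true;
    }
}
```

The blocking problem is visible in the sketch: between a participant's successful `prepare` and the coordinator's phase-2 decision, that participant must hold its locks and wait.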
Although widely used, 2PC cannot fully guarantee consistency and suffers from blocking, which led to the invention of the three‑phase commit (3PC).
1.2.2 Three‑Phase Commit (3PC)
3PC improves on 2PC but still only guarantees consistency in most cases; detailed protocols are omitted as they are not the focus of this article.
2. Transaction Management in Microservices
Distributed transactions based on 2PC/3PC are unsuitable for microservices for three main reasons:
Microservices communicate via RPC or HTTP APIs, preventing a single TM from managing all resources.
Different services may use heterogeneous data stores, some of which (e.g., NoSQL) lack transaction support.
Coordinating a large, cross‑service transaction dramatically increases lock duration and harms performance.
Consequently, microservices must adopt the BASE theory (Basically Available, Soft state, Eventual consistency) proposed by eBay architect Dan Pritchett.
Basically Available: the system tolerates partial loss of availability during failures while keeping core services alive.
Soft State: intermediate states are allowed and do not affect overall availability; replicas may be temporarily out of sync.
Eventual Consistency: all replicas converge to the same state after some time, a weaker but acceptable consistency model for microservices.
Achieving eventual consistency in microservices can be done via two broad categories of patterns: event‑notification and compensation, each with sub‑patterns.
3. Implementing Data Consistency in Microservices
3.1 Reliable Event Notification Pattern
3.1.1 Synchronous Event
The simplest approach is to send a message synchronously after the primary service completes its work. The following Java‑like code illustrates the logic:
```java
public void trans() {
    try {
        // 1. Operate the database
        boolean result = dao.update(data); // throws on failure
        // 2. If the DB update succeeded, send the message
        if (result) {
            mq.send(data); // throws on failure
        }
    } catch (Exception e) {
        rollback(); // roll back on any exception
    }
}
```

While seemingly flawless, synchronous notification suffers from two drawbacks:
Network or server failures after the message is sent can cause the primary service to think the notification failed, leading to inconsistency.
The messaging service becomes tightly coupled with business logic; if the message broker is unavailable, the whole business flow is blocked.
3.1.2 Asynchronous Event
3.1.2.1 Local Event Service
To address the issues of synchronous events, an asynchronous model introduces a separate event service. The business service writes events to a local event table within the same transaction; a background worker retries delivery until successful.
Although reliable, this approach still incurs extra DB load and partial coupling.
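The local-event-table idea can be sketched in memory as follows (all names are illustrative; the lists stand in for the business table, the event table, and the message broker). The key point is that the business row and the event row are written in the same local transaction, so an event can never be lost before delivery, and the worker simply retries until the broker accepts it.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the local-event-table pattern: the business update and the event
// record are written together (in a real system, in one DB transaction);
// a background worker later delivers any events not yet marked as sent.
class LocalEventTableDemo {
    record Event(long id, String payload, boolean sent) {}

    final List<String> businessRows = new ArrayList<>(); // stands in for the business table
    final List<Event> eventTable = new ArrayList<>();    // stands in for the event table
    final List<String> delivered = new ArrayList<>();    // stands in for the MQ

    // Business operation: both writes succeed or neither does.
    void trans(String data) {
        businessRows.add(data);                                        // 1. business update
        eventTable.add(new Event(eventTable.size() + 1, data, false)); // 2. event row
    }

    // Background worker: deliver pending events, marking each as sent.
    void deliverPending() {
        for (int i = 0; i < eventTable.size(); i++) {
            Event e = eventTable.get(i);
            if (!e.sent()) {
                delivered.add(e.payload());                              // mq.send in a real system
                eventTable.set(i, new Event(e.id(), e.payload(), true)); // mark as sent
            }
        }
    }
}
```

Because the worker only flips the `sent` flag after a successful send, a crash between send and flag update can cause a duplicate delivery, which is why consumers must be idempotent (see 3.1.2.3).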
3.1.2.2 External Event Service
The external event service further decouples the business and messaging layers. The business service records events without sending them; after the transaction commits (or rolls back), it notifies the event service, which then delivers or discards the events.
This adds two network hops and requires the business service to expose a query interface for the event service to check pending events.
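The external event service's protocol can be sketched as a small state machine (the class, method names, and `committed` callback below are hypothetical): the business service records a pending event before its transaction, then confirms or cancels afterward, and events stuck in the pending state are resolved by querying the business service.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an external event service: events start PENDING, are PUBLISHED
// after the business transaction commits, DISCARDED after it rolls back, and
// stale PENDING events are resolved by querying the business service.
class ExternalEventService {
    enum Status { PENDING, PUBLISHED, DISCARDED }

    final Map<Long, Status> events = new HashMap<>();
    private long nextId = 1;

    long record() {                          // called before the business transaction
        long id = nextId++;
        events.put(id, Status.PENDING);
        return id;
    }

    void confirm(long id) { events.put(id, Status.PUBLISHED); } // transaction committed
    void cancel(long id)  { events.put(id, Status.DISCARDED); } // transaction rolled back

    // Recovery: for events still pending after a timeout, ask the business
    // service (via its query interface) whether the transaction committed.
    void resolvePending(java.util.function.LongPredicate committed) {
        events.replaceAll((id, st) ->
            st == Status.PENDING
                ? (committed.test(id) ? Status.PUBLISHED : Status.DISCARDED)
                : st);
    }
}
```

The `resolvePending` step is exactly why the business service must expose a query interface: without it, the event service cannot decide the fate of events whose confirm/cancel notification was lost.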
3.1.2.3 Notes for Reliable Event Pattern
The pattern must ensure (1) correct delivery of events and (2) idempotent consumption. Idempotency can be achieved by making the event itself idempotent (e.g., order‑status updates) and using timestamps or global sequence numbers to discard stale messages, or by persisting event IDs and results to detect duplicates.
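The deduplication side of this can be sketched as follows (the class and field names are illustrative): the consumer remembers the newest sequence number applied per key, so redelivered or out-of-order stale messages are dropped rather than reapplied.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an idempotent consumer: a per-key sequence number both detects
// duplicate deliveries and discards stale (out-of-order) updates.
class IdempotentConsumer {
    private final Map<String, Long> lastSeq = new HashMap<>(); // key -> newest seq applied
    private final Map<String, String> state = new HashMap<>();

    // Returns true if the event was applied, false if dropped as duplicate/stale.
    boolean onEvent(String key, long seq, String value) {
        Long seen = lastSeq.get(key);
        if (seen != null && seq <= seen) {
            return false; // duplicate delivery or stale out-of-order message
        }
        state.put(key, value); // applying the same (key, seq, value) twice is harmless
        lastSeq.put(key, seq);
        return true;
    }

    String get(String key) { return state.get(key); }
}
```

In production the `lastSeq` table would be persisted alongside the state update in one transaction, so the dedup check survives consumer restarts.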
3.2 Maximum‑Effort Notification Pattern
In this simpler approach the business service attempts to send a message a limited number of times after committing its transaction. If all attempts fail, the message is lost and the downstream service must provide a query API for recovery. This pattern offers low development cost but weak real‑time guarantees.
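A minimal sketch of the retry loop (the class name and `send` callback are illustrative): the sender gives up after a fixed number of attempts, accepting that the downstream side must reconcile lost messages through its query API.

```java
import java.util.function.Predicate;

// Sketch of maximum-effort notification: after the local transaction commits,
// the sender tries the broker at most maxAttempts times, then gives up.
class MaxEffortNotifier {
    // send: returns true on broker acknowledgement.
    static boolean notify(String message, int maxAttempts, Predicate<String> send) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (send.test(message)) {
                return true; // delivered
            }
            // In production: back off between attempts.
        }
        return false; // message lost; downstream recovers via its query API
    }
}
```

The `false` path is the defining trade-off of the pattern: low development cost in exchange for accepting occasional message loss and weak real-time guarantees.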
3.3 Business Compensation Pattern
Here the upstream service proceeds normally, but if a downstream service fails, the upstream service performs compensating actions (e.g., canceling a previously booked ticket). Compensation is usually incomplete: records remain with a "canceled" flag, so the system retains an audit trail.
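The ticket example can be sketched as follows (the class and record names are illustrative): when the second step fails, the first step is compensated by flagging its record rather than deleting it.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of business compensation: a later failure triggers a compensating
// action for the earlier step; the record is flagged "canceled", not removed.
class CompensationDemo {
    record Booking(String item, String status) {}

    final List<Booking> bookings = new ArrayList<>();

    boolean bookTrip(boolean hotelAvailable) {
        bookings.add(new Booking("flight", "booked")); // step 1 succeeds
        if (!hotelAvailable) {
            // Step 2 failed: compensate step 1, keeping the record as an audit trail.
            bookings.set(0, new Booking("flight", "canceled"));
            return false;
        }
        bookings.add(new Booking("hotel", "booked"));  // step 2 succeeds
        return true;
    }
}
```

Note that between step 1 and its compensation the flight briefly appears booked, which is the "soft state" BASE explicitly permits.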
3.4 TCC (Try‑Confirm‑Cancel) Pattern
TCC refines compensation by providing a fully reversible workflow. In the Try phase each service reserves required resources; if all Try phases succeed, the Confirm phase finalizes the operation; otherwise the Cancel phase releases the reservations.
Example: a transfer from Bank A to Bank B.
Service A (debit):
```sql
try:     update cmb_account set balance = balance - 100, freeze = freeze + 100
         where acc_id = 1 and balance >= 100;
confirm: update cmb_account set freeze = freeze - 100 where acc_id = 1;
cancel:  update cmb_account set balance = balance + 100, freeze = freeze - 100
         where acc_id = 1;
```

Service B (credit):

```sql
try:     update cgb_account set freeze = freeze + 100 where acc_id = 1;
confirm: update cgb_account set balance = balance + 100, freeze = freeze - 100
         where acc_id = 1;
cancel:  update cgb_account set freeze = freeze - 100 where acc_id = 1;
```

The TCC workflow ensures atomicity without holding long-lived locks, at the cost of implementing both Confirm and Cancel interfaces for each service.
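The same transfer can be traced end to end with in-memory accounts (the class below mirrors the SQL statements above; `cmb` is Bank A's account and `cgb` is Bank B's): Try moves funds into a frozen hold on both sides, Confirm releases the holds into their final state, and Cancel undoes them.

```java
// In-memory trace of the TCC transfer: Try reserves, Confirm finalizes,
// Cancel releases the reservations.
class TccTransferDemo {
    static class Account { int balance; int freeze; Account(int b) { balance = b; } }

    final Account cmb; // Bank A (debit side)
    final Account cgb; // Bank B (credit side)

    TccTransferDemo(int a, int b) { cmb = new Account(a); cgb = new Account(b); }

    // Try: reserve resources on both sides; fails if funds are insufficient.
    boolean tryPhase(int amount) {
        if (cmb.balance < amount) return false;
        cmb.balance -= amount; cmb.freeze += amount; // debit moved into frozen hold
        cgb.freeze += amount;                        // credit reserved, not yet visible
        return true;
    }

    // Confirm: release the reservations into their final state.
    void confirm(int amount) {
        cmb.freeze -= amount;
        cgb.balance += amount; cgb.freeze -= amount;
    }

    // Cancel: undo the reservations.
    void cancel(int amount) {
        cmb.balance += amount; cmb.freeze -= amount;
        cgb.freeze -= amount;
    }

    boolean transfer(int amount, boolean confirmOk) {
        if (!tryPhase(amount)) return false;
        if (confirmOk) { confirm(amount); return true; }
        cancel(amount); return false;
    }
}
```

Notice that no lock outlives a single statement: between Try and Confirm the money sits in the `freeze` columns, so other transactions can proceed against the remaining balance.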
3.5 Summary
The table below compares the four common patterns in terms of consistency latency, development cost, and whether the upstream service depends on downstream results.
| Type | Name | Real-time Consistency | Development Cost | Upstream Depends on Downstream |
|------|------|-----------------------|------------------|--------------------------------|
| Notification | Maximum Effort | Low | Low | No |
| Notification | Reliable Event | High | High | No |
| Compensation | Business Compensation | Low | Low | Yes |
| Compensation | TCC | High | High | Yes |
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.