
Data Consistency Strategies in Microservices: Transaction Management and Patterns

This article reviews the evolution from traditional local and distributed transactions to BASE theory and presents four microservice data‑consistency patterns—reliable event notification, maximum‑effort notification, business compensation, and TCC—detailing their principles, advantages, drawbacks, and implementation examples.

Code Ape Tech Column

1. Transaction Management in Traditional Applications

1.1 Local Transaction

Before discussing data consistency in microservices, some background on transactions is useful. A traditional monolithic application uses a single RDBMS as its data source. The application starts a transaction, performs CRUD operations, and then commits or rolls back. All of this happens inside a local transaction managed directly by the resource manager (RM), and data consistency is guaranteed within that local transaction.

1.2 Distributed Transaction

1.2.1 Two‑Phase Commit (2PC)

When an application expands to use multiple data sources, a single local transaction can no longer guarantee consistency. Distributed transactions are introduced, with the most popular implementation being the two‑phase commit (2PC) managed by a transaction manager (TM).

2PC consists of a prepare phase and a commit phase.

Commit stage illustration.

Rollback stage illustration.
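
The two phases can be sketched in a few lines of Java. This is a minimal in-memory illustration, not a real transaction manager: the names `Participant`, `TwoPhaseCoordinator`, and `RecordingParticipant` are hypothetical, and the sketch ignores timeouts, logging, and coordinator crashes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant interface: one implementation per resource manager (RM).
interface Participant {
    boolean prepare();   // phase 1: vote yes (true) or no (false)
    void commit();       // phase 2: make the prepared work permanent
    void rollback();     // phase 2: undo the prepared work
}

// Hypothetical transaction manager (TM) that drives both phases.
class TwoPhaseCoordinator {
    private final List<Participant> participants;

    TwoPhaseCoordinator(List<Participant> participants) {
        this.participants = participants;
    }

    // Returns true if the global transaction committed, false if it rolled back.
    boolean execute() {
        List<Participant> prepared = new ArrayList<>();
        // Phase 1: ask every participant to prepare and collect the votes.
        for (Participant p : participants) {
            boolean vote;
            try {
                vote = p.prepare();
            } catch (RuntimeException e) {
                vote = false; // a participant that fails to answer counts as a "no" vote
            }
            if (!vote) {
                // Roll back every participant that had already prepared.
                for (Participant q : prepared) {
                    q.rollback();
                }
                return false;
            }
            prepared.add(p);
        }
        // Phase 2: all votes were "yes", so tell everyone to commit.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }
}

// In-memory participant used only to exercise the coordinator.
class RecordingParticipant implements Participant {
    private final boolean voteYes;
    String state = "INIT";

    RecordingParticipant(boolean voteYes) {
        this.voteYes = voteYes;
    }

    public boolean prepare() {
        if (voteYes) {
            state = "PREPARED";
        }
        return voteYes;
    }

    public void commit() { state = "COMMITTED"; }

    public void rollback() { state = "ROLLED_BACK"; }
}
```

With two participants that both vote yes, `execute()` returns true and both end in the `COMMITTED` state. If any participant votes no, the ones that had already prepared are rolled back. The synchronous blocking drawback is visible in the loop: every participant holds its prepared state until the coordinator reaches phase 2.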

2PC cannot fully guarantee consistency and suffers from synchronous blocking. Three‑phase commit (3PC) was later devised as an optimized version to address these problems.

1.2.2 Three‑Phase Commit (3PC)

3PC guarantees consistency in most cases, though not all. It is not the focus of this article and is not covered further.

2. Transaction Management in Microservices

Distributed transactions such as 2PC or 3PC are unsuitable for microservices for three reasons:

Microservices communicate via RPC (e.g., Dubbo) or HTTP APIs, making it impossible for a transaction manager to directly manage the resource managers of each service.

Different services may use heterogeneous data stores, including NoSQL databases that do not support transactions.

Even if all data sources support transactions, a single large transaction spanning many services would hold locks for a much longer time, severely degrading performance.

Therefore, traditional distributed transactions cannot meet microservice requirements, and microservice transaction management must follow the BASE theory.

BASE (Basically Available, Soft state, Eventual consistency) was proposed by eBay architect Dan Pritchett as an extension of CAP, emphasizing eventual consistency when strong consistency is infeasible.

Basically Available: The system tolerates partial loss of availability during failures, ensuring core services remain up.

Soft state: The system may exist in intermediate states that do not affect overall availability; for example, multiple replicas may be temporarily out of sync.

Eventual consistency: All replicas converge to the same state after a bounded period; it is a special case of weak consistency.

In microservices, eventual consistency is the fundamental requirement. To achieve it, two major categories of solutions exist: notification‑based and compensation‑based. Notification‑based approaches further split into reliable event notification and maximum‑effort notification, while compensation‑based approaches include the TCC (Try‑Confirm‑Cancel) pattern and generic business compensation.

3. Ways to Achieve Data Consistency in Microservices

3.1 Reliable Event Notification Pattern

3.1.1 Synchronous Event

The simplest design is a synchronous event: the primary service performs its business logic, then immediately sends a message (usually via a message queue) to the secondary service. The code example below illustrates the flow.

public void trans() {
    try {
        // 1. Update the database; dao.update throws on failure
        boolean result = dao.update(data);
        // 2. Send the message only if the database update succeeded
        if (result) {
            mq.send(data); // may throw, e.g. on a timeout
        }
    } catch (Exception e) {
        rollback(); // roll back the local transaction on any exception
    }
}

While this looks flawless, two drawbacks exist:

If a network or server crash occurs after the message is sent but before the primary service receives acknowledgment, the primary service may roll back while the secondary service has already consumed the message, causing inconsistency.

The event service becomes tightly coupled with business logic; if the message service is unavailable, the whole business becomes unavailable.

3.1.2 Asynchronous Event

3.1.2.1 Local Event Service

To solve the problems of synchronous events, an asynchronous design decouples the business service from the event service. The business service writes the event to a local event table in the same transaction as the business data, so the two are committed or rolled back together. After the commit, it attempts to deliver the event. If delivery succeeds, the event row is removed; otherwise, a background event service retries until it succeeds.

Asynchronous Event Notification – Local Event Service

This approach still incurs some coupling during the first delivery attempt and adds extra load to the database because each business operation also writes to the event table.
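
The local event table idea can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: `LocalEventStore` stands in for a database with a business table and an event table, `LocalEventService` stands in for both the first delivery attempt and the background retry job, and the `Predicate<String>` sender stands in for a message-queue client. None of these names come from the original article.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// In-memory stand-in for a database holding a business table and a local event table.
class LocalEventStore {
    final Map<String, String> orders = new LinkedHashMap<>();        // business table
    final Map<String, String> pendingEvents = new LinkedHashMap<>(); // local event table

    // Simulates one local transaction: the business row and the event row
    // are written together, so neither can exist without the other.
    synchronized void saveOrderWithEvent(String orderId, String status) {
        orders.put(orderId, status);
        pendingEvents.put("evt-" + orderId, orderId + ":" + status);
    }
}

// Delivers pending events; called once after commit and again by a retry job.
class LocalEventService {
    private final LocalEventStore store;
    private final Predicate<String> sender; // returns true when the broker accepted the message

    LocalEventService(LocalEventStore store, Predicate<String> sender) {
        this.store = store;
        this.sender = sender;
    }

    // Tries every pending event once and returns how many were delivered.
    int deliverPending() {
        int delivered = 0;
        // Copy the keys so rows can be removed while iterating.
        for (String eventId : new ArrayList<>(store.pendingEvents.keySet())) {
            String payload = store.pendingEvents.get(eventId);
            boolean accepted;
            try {
                accepted = sender.test(payload);
            } catch (RuntimeException e) {
                accepted = false; // keep the row; the next run retries it
            }
            if (accepted) {
                store.pendingEvents.remove(eventId); // delivered: delete the event row
                delivered++;
            }
        }
        return delivered;
    }
}
```

If the broker is down when `deliverPending()` first runs, the event row simply stays in the table and is picked up by a later run. Because a row is deleted only after the broker accepts the message, a crash between sending and deleting leads to a repeated delivery, which is why the consumer must tolerate duplicates.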

3.1.2.2 External Event Service

An external event service further isolates the event system from the business service. The business service first records the event with the event service, before its own transaction completes. After the transaction commits or rolls back, it notifies the event service, which then sends or discards the event accordingly. The event service also periodically polls for unsent events and queries the business service for their status, which covers the case where the notification never arrives.

Asynchronous Event Notification – External Event Service

Although this fully decouples the two sides, it introduces two extra network hops and requires the business service to expose a query interface for the event service.
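
The record, notify, and poll steps described above can be sketched as a small state holder. This is an in-memory illustration only: `ExternalEventService`, its method names, and the `TxStatus` enum are hypothetical, and the `Function<String, TxStatus>` argument stands in for the query interface that the business service would expose over the network.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical external event service holding events until their outcome is known.
class ExternalEventService {
    enum TxStatus { COMMITTED, ROLLED_BACK, UNKNOWN }

    private final Map<String, String> recorded = new LinkedHashMap<>(); // eventId -> payload, not yet sent
    final List<String> sent = new ArrayList<>();                        // stand-in for the message queue

    // Step 1: the business service records the event before its transaction completes.
    void record(String eventId, String payload) {
        recorded.put(eventId, payload);
    }

    // Step 2a: the business transaction committed, so the event is sent.
    void confirm(String eventId) {
        String payload = recorded.remove(eventId);
        if (payload != null) {
            sent.add(payload);
        }
    }

    // Step 2b: the business transaction rolled back, so the event is discarded.
    void discard(String eventId) {
        recorded.remove(eventId);
    }

    // Periodic job: for events whose notification never arrived,
    // ask the business service what happened to the transaction.
    void resolvePending(Function<String, TxStatus> statusQuery) {
        for (String eventId : new ArrayList<>(recorded.keySet())) {
            TxStatus status = statusQuery.apply(eventId);
            if (status == TxStatus.COMMITTED) {
                confirm(eventId);
            } else if (status == TxStatus.ROLLED_BACK) {
                discard(eventId);
            }
            // UNKNOWN: leave the event recorded and ask again on the next run.
        }
    }

    int pendingCount() {
        return recorded.size();
    }
}
```

The two extra network hops mentioned above correspond to `record` and `confirm`/`discard`, and the query interface corresponds to the function passed to `resolvePending`.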

3.1.2.3 Precautions for Reliable Event Notification

Two key concerns are correct event delivery and duplicate consumption. Because retries can deliver the same event more than once, idempotency must be ensured on the consumer side. For idempotent state‑change events (e.g., order status), timestamps or global sequence numbers can be used to discard stale messages. For non‑idempotent actions (e.g., monetary transfers), the consumer should persist the event ID and result, and check for an existing record before processing.
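
Both consumer-side techniques can be sketched in one small class. This is a minimal in-memory illustration: `IdempotentConsumer` and its method names are hypothetical, the maps and the set stand in for database tables, and only the event ID is stored rather than the full result. In a real service the event-ID record and the business update would need to be written in the same local transaction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical consumer that tolerates duplicate and out-of-order events.
class IdempotentConsumer {
    private final Map<String, Long> lastSequence = new HashMap<>();  // orderId -> newest sequence applied
    private final Map<String, String> orderStatus = new HashMap<>(); // orderId -> current status
    private final Set<String> processedEventIds = new HashSet<>();   // IDs of transfers already applied
    private long balance = 0;

    // State-change event: apply it only if it is newer than what was already applied.
    // Returns false when the event is stale or a duplicate and was ignored.
    boolean onStatusChange(String orderId, long sequence, String status) {
        Long seen = lastSequence.get(orderId);
        if (seen != null && sequence <= seen) {
            return false;
        }
        lastSequence.put(orderId, sequence);
        orderStatus.put(orderId, status);
        return true;
    }

    // Non-idempotent event: remember the event ID and skip anything seen before.
    // Returns false when the event was a duplicate and was ignored.
    boolean onTransfer(String eventId, long amount) {
        if (!processedEventIds.add(eventId)) {
            return false; // Set.add returns false if the ID was already present
        }
        balance += amount;
        return true;
    }

    String statusOf(String orderId) {
        return orderStatus.get(orderId);
    }

    long balance() {
        return balance;
    }
}
```

If a `SHIPPED` event with sequence 2 arrives before a delayed `PAID` event with sequence 1, the second call is ignored and the status stays `SHIPPED`. Delivering the same transfer event twice changes the balance only once.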

3.2 Maximum‑Effort Notification Pattern

This simpler pattern sends the message after the transaction commits and retries only a limited number of times (e.g., three). If all attempts fail, the message is dropped, so the upstream service must provide a query interface that downstream services can call to recover missing messages. This approach offers weak timeliness guarantees and suits only scenarios that can tolerate delayed or occasionally missed notifications.
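
A bounded-retry sender and the accompanying query interface might look like the sketch below. The names `BestEffortNotifier` and `UpstreamResultStore` are hypothetical, the `Predicate<String>` transport stands in for an HTTP or message-queue call, and a real implementation would normally wait between attempts.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sender that gives up after a fixed number of attempts.
class BestEffortNotifier {
    private final int maxAttempts;
    private final Predicate<String> transport; // returns true when the downstream acknowledged

    BestEffortNotifier(int maxAttempts, Predicate<String> transport) {
        this.maxAttempts = maxAttempts;
        this.transport = transport;
    }

    // Returns true if some attempt was acknowledged, false if the message was dropped.
    boolean notifyDownstream(String message) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (transport.test(message)) {
                    return true;
                }
            } catch (RuntimeException e) {
                // A failed call counts as an unsuccessful attempt; fall through and retry.
            }
        }
        return false; // all attempts failed: the message is dropped
    }
}

// Hypothetical query interface the upstream service exposes for recovery.
class UpstreamResultStore {
    private final Map<String, String> results = new HashMap<>();

    void commit(String businessId, String result) {
        results.put(businessId, result);
    }

    // Downstream services call this to recover a result they never received; null if unknown.
    String query(String businessId) {
        return results.get(businessId);
    }
}
```

When `notifyDownstream` returns false, nothing further is sent. The downstream service is expected to call `query` on its own schedule to find the result it missed.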

3.3 Business Compensation Pattern

In compensation patterns, the upstream service depends on the downstream result. When a downstream failure occurs, upstream services execute compensating actions (e.g., cancel a previously booked train ticket). Compensation is usually only partially reversible, leaving a trace (e.g., a “canceled” flag) in the database.
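
The compensation flow can be sketched as a runner that undoes completed steps in reverse order when a later step fails. This is a minimal in-memory illustration: `BookingStep`, `CompensationRunner`, and `SimpleBooking` are hypothetical names, and the `records` map stands in for the services' databases.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical step with a forward action and a compensating action.
interface BookingStep {
    void execute(Map<String, String> records);    // throws RuntimeException on failure
    void compensate(Map<String, String> records); // undoes a previously successful execute
}

// Runs steps in order; on failure, compensates the completed ones in reverse order.
class CompensationRunner {
    boolean run(List<BookingStep> steps, Map<String, String> records) {
        Deque<BookingStep> completed = new ArrayDeque<>();
        for (BookingStep step : steps) {
            try {
                step.execute(records);
                completed.push(step); // most recent step ends up on top
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensate(records);
                }
                return false;
            }
        }
        return true;
    }
}

// In-memory booking used only to exercise the runner.
class SimpleBooking implements BookingStep {
    private final String key;
    private final boolean available;

    SimpleBooking(String key, boolean available) {
        this.key = key;
        this.available = available;
    }

    public void execute(Map<String, String> records) {
        if (!available) {
            throw new IllegalStateException(key + " is unavailable");
        }
        records.put(key, "BOOKED");
    }

    public void compensate(Map<String, String> records) {
        // The record is not deleted: a "CANCELED" trace remains.
        records.put(key, "CANCELED");
    }
}
```

If a train booking succeeds and a subsequent hotel booking fails, the runner cancels the train booking, and the train record ends up as `CANCELED` rather than disappearing. That leftover record is the trace the paragraph above refers to.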

3.4 TCC (Try‑Confirm‑Cancel) Pattern

TCC is an optimized compensation pattern that can achieve full compensation without leaving residual records. It consists of two phases: Try (resource reservation and business checks) and Confirm/Cancel. Only if all Try phases succeed does the system proceed to Confirm; otherwise, Cancel releases the reserved resources.

TCC Pattern

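Example: transferring 100 CNY from an account at Bank A to an account at Bank B. In the Try phase, Service A checks the balance and freezes 100 CNY on the payer's account, while Service B records 100 CNY as a frozen incoming amount on the payee's account. If both Try phases succeed, Confirm clears the freeze on A's side (the money has left the account) and moves the frozen amount into the balance on B's side; otherwise, Cancel returns the frozen amount to A's balance and removes the freeze on B's side.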

-- Service A (payer, cmb_account)
try: update cmb_account set balance=balance-100, freeze=freeze+100 where acc_id=1 and balance>=100;
confirm: update cmb_account set freeze=freeze-100 where acc_id=1;
cancel: update cmb_account set balance=balance+100, freeze=freeze-100 where acc_id=1;
-- Service B (payee, cgb_account)
try: update cgb_account set freeze=freeze+100 where acc_id=1;
confirm: update cgb_account set balance=balance+100, freeze=freeze-100 where acc_id=1;
cancel: update cgb_account set freeze=freeze-100 where acc_id=1;
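
The same Try, Confirm, and Cancel steps can be mirrored in Java to show how a coordinator drives them. This is a minimal in-memory sketch: `TccAccount`, `TccParticipant`, `DebitParticipant`, `CreditParticipant`, and `TccCoordinator` are hypothetical names, each account object stands in for one row of the tables above, and the sketch omits the retries and idempotent Confirm/Cancel handling a production system would need.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for one account row with a balance and a frozen amount.
class TccAccount {
    long balance;
    long frozen;

    TccAccount(long balance) {
        this.balance = balance;
    }
}

// Hypothetical participant contract for the three TCC operations.
interface TccParticipant {
    boolean tryReserve(); // Try: check and reserve resources
    void confirm();       // Confirm: finalize the reservation
    void cancel();        // Cancel: release the reservation
}

// Payer side: mirrors the cmb_account statements.
class DebitParticipant implements TccParticipant {
    private final TccAccount account;
    private final long amount;

    DebitParticipant(TccAccount account, long amount) {
        this.account = account;
        this.amount = amount;
    }

    public boolean tryReserve() {
        if (account.balance < amount) {
            return false; // business check failed: insufficient funds
        }
        account.balance -= amount;
        account.frozen += amount;
        return true;
    }

    public void confirm() { account.frozen -= amount; }

    public void cancel() {
        account.balance += amount;
        account.frozen -= amount;
    }
}

// Payee side: mirrors the cgb_account statements.
class CreditParticipant implements TccParticipant {
    private final TccAccount account;
    private final long amount;

    CreditParticipant(TccAccount account, long amount) {
        this.account = account;
        this.amount = amount;
    }

    public boolean tryReserve() {
        account.frozen += amount;
        return true;
    }

    public void confirm() {
        account.balance += amount;
        account.frozen -= amount;
    }

    public void cancel() { account.frozen -= amount; }
}

// Confirms only if every Try succeeded; otherwise cancels the Try calls that did succeed.
class TccCoordinator {
    boolean run(List<TccParticipant> participants) {
        List<TccParticipant> tried = new ArrayList<>();
        for (TccParticipant p : participants) {
            if (!p.tryReserve()) {
                for (TccParticipant t : tried) {
                    t.cancel();
                }
                return false;
            }
            tried.add(p);
        }
        for (TccParticipant p : participants) {
            p.confirm();
        }
        return true;
    }
}
```

With a payer balance of 150 and a transfer of 100, the run succeeds: the payer ends at 50, the payee gains 100, and neither account has a frozen amount left. If the payer's Try fails for insufficient funds, any freeze already placed on the payee is removed by Cancel and both accounts return to their starting values.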

3.5 Summary

The table below compares the four common patterns in terms of real‑time consistency, development cost, and whether the upstream service depends on the downstream result.

| Type | Name | Real‑time Consistency | Development Cost | Upstream Depends on Downstream |
| --- | --- | --- | --- | --- |
| Notification | Maximum Effort | Low | Low | No |
| Notification | Reliable Event | High | High | No |
| Compensation | Business Compensation | Low | Low | Yes |
| Compensation | TCC | High | High | Yes |

Source: https://www.jianshu.com/p/b264a196b177
