
Transaction Consistency Strategies in Distributed Microservices: Blocking Retry, Asynchronous Queue, TCC, and Local Message Table

The article explains various techniques for ensuring data consistency in distributed microservice architectures, including blocking retries, asynchronous queues, TCC compensation transactions, local message tables, and MQ transactions, while discussing their advantages, drawbacks, and practical implementation details.

Top Architect

Introduction

In today's distributed systems and microservice architectures, service-to-service call failures are common. Handling exceptions and guaranteeing data consistency are essential challenges that cannot be avoided.

Depending on the business scenario, different solutions are applicable, such as:

Blocking retry;

Traditional 2PC/3PC transactions;

Using a queue for asynchronous processing;

TCC compensation transactions;

Local message tables (asynchronous assurance);

MQ transactions.

This article focuses on the solutions other than 2PC/3PC, which are already well covered elsewhere.

Blocking Retry

Blocking retry is a common approach in microservice architectures.

Pseudo‑code example:

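m := db.Insert(sql)

err := request("B-Service", m)

// request POSTs body to url, retrying up to three times,
// and returns the last error if every attempt fails.
func request(url string, body interface{}) error {
  var err error
  for i := 0; i < 3; i++ {
    _, err = httpClient.POST(url, body)
    if err == nil {
      return nil
    }
    log.Print(err)
  }
  return err
}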

When the API call to service B fails, the request is retried up to three times; if all attempts fail, the error is logged and execution continues or propagates upward.

This method brings the following problems:

The call to service B succeeds, but a network timeout makes the caller think it failed; the retry then writes duplicate data in B.

If service B stays unavailable and every retry fails, the record already inserted into the caller's database is left behind as dirty data.

Retries increase latency and can amplify downstream load.

Solutions: make B's API idempotent, use background scripts to clean dirty data, and accept the latency trade‑off when consistency is not critical.
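
As a rough illustration of the idempotency point above, here is a minimal sketch of how service B might ignore a retried request. The names Order, RequestID, BService, and Create are hypothetical and not from any library, and the in-memory map stands in for what would normally be a unique index in B's database.

```go
package main

import "sync"

// Order is a hypothetical payload that the caller sends to service B.
type Order struct {
	RequestID string // unique per logical request, reused on every retry
	Item      string
}

// BService is an in-memory stand-in for service B (assumption for this sketch).
type BService struct {
	mu      sync.Mutex
	seen    map[string]bool // request IDs that were already applied
	Created []Order         // records actually written
}

// NewBService returns an empty service.
func NewBService() *BService {
	return &BService{seen: make(map[string]bool)}
}

// Create applies the order at most once per RequestID.
// It reports whether this call actually wrote a new record.
func (b *BService) Create(o Order) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.seen[o.RequestID] {
		return false // duplicate delivery: acknowledge without writing again
	}
	b.seen[o.RequestID] = true
	b.Created = append(b.Created, o)
	return true
}
```

With this shape, a caller that times out and resends the same Order with the same RequestID gets a successful response both times, while B stores only one record.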

Asynchronous Queue

Introducing a message queue is a common and effective evolution of the solution.

m := db.Insert(sql)

err := mq.Publish("B-Service-topic",m)

After writing data to the DB, a message is published to the MQ for an independent consumer to process. However, publishing to the MQ can also fail (network issues, service crash), leading to the same problems as blocking retries.

In long‑running distributed systems, such failure scenarios are inevitable, making reliable design a core difficulty.

TCC Compensation Transaction

TCC (Try‑Confirm‑Cancel) is suitable when transactional guarantees are required but decoupling is difficult.

TCC splits each service call into three phases:

Try: check resources and reserve them (e.g., pre‑deduct inventory).

Confirm: commit the reservation.

Cancel: release the reservation if any participant's Try fails.

Example pseudo‑code for a shopping scenario involving services A, B, and C:

m := db.Insert(sql)
aResult, aErr := A.Try(m)
bResult, bErr := B.Try(m)
cResult, cErr := C.Try(m)
if aErr != nil || bErr != nil || cErr != nil {
    // any failed Try cancels every participant
    A.Cancel()
    B.Cancel()
    C.Cancel()
} else {
    A.Confirm()
    B.Confirm()
    C.Confirm()
}

The code calls Try on each service; if any Try fails, the corresponding Cancel APIs are invoked to release resources.

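Key issues include empty releases (Cancel is called for a transaction whose Try never actually locked a resource) and ordering problems caused by network latency (for example, a delayed Try arriving after its Cancel). Both can be mitigated by tagging every call with a unique transaction ID, so that each participant can tell what it has already processed.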

Both Cancel and Confirm operations can also fail, leading to locked resources; typical mitigations are blocking retries or logging and manual intervention.
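
The transaction-ID mitigation for empty releases and out-of-order calls can be sketched as follows. This is only an illustration under assumptions: Inventory, txState, and ErrRejected are hypothetical names, and the maps stand in for a persistent transaction log that a real participant would need.

```go
package main

import (
	"errors"
	"sync"
)

type txState int

const (
	stateTried txState = iota + 1
	stateConfirmed
	stateCancelled
)

// ErrRejected is returned when Try cannot reserve anything.
var ErrRejected = errors.New("try rejected")

// Inventory is a hypothetical TCC participant that holds stock.
type Inventory struct {
	mu       sync.Mutex
	stock    int
	reserved map[string]int     // txID -> quantity held by Try
	states   map[string]txState // txID -> last recorded phase
}

// NewInventory returns a participant with the given stock.
func NewInventory(stock int) *Inventory {
	return &Inventory{stock: stock, reserved: map[string]int{}, states: map[string]txState{}}
}

// Try reserves qty for txID. It rejects a Try that arrives after the
// Cancel for the same txID, so a late Try cannot leave stock locked.
func (inv *Inventory) Try(txID string, qty int) error {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	switch inv.states[txID] {
	case stateCancelled:
		return ErrRejected // Cancel arrived first
	case stateTried, stateConfirmed:
		return nil // duplicate Try: already handled
	}
	if inv.stock < qty {
		return ErrRejected
	}
	inv.stock -= qty
	inv.reserved[txID] = qty
	inv.states[txID] = stateTried
	return nil
}

// Confirm makes the reservation permanent; repeated calls are no-ops.
func (inv *Inventory) Confirm(txID string) {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	if inv.states[txID] == stateTried {
		delete(inv.reserved, txID)
		inv.states[txID] = stateConfirmed
	}
}

// Cancel releases the reservation if one exists. For an unknown txID it
// releases nothing (empty release) but still records the cancellation.
func (inv *Inventory) Cancel(txID string) {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	if inv.states[txID] == stateTried {
		inv.stock += inv.reserved[txID]
		delete(inv.reserved, txID)
	}
	if inv.states[txID] != stateConfirmed {
		inv.states[txID] = stateCancelled
	}
}

// Stock returns the quantity that is currently unreserved.
func (inv *Inventory) Stock() int {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	return inv.stock
}
```

Because every phase is keyed by the transaction ID, Confirm and Cancel are also safe to retry, which matters given that both can fail and be re-sent.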

Local Message Table

The local message table, originally proposed by eBay, stores a message record in the same database transaction as the business data, allowing the transaction to guarantee atomicity.

messageTx := tc.NewTransaction("order")
messageTxSql := messageTx.TryPlan("content")

// insert the business row and the message row in one DB transaction
m, err := db.InsertTx(sql, messageTxSql)
if err != nil {
 return err
}

aErr := mq.Publish("B-Service-topic", m)
if aErr != nil { // publish failed
 messageTx.Confirm() // update status to confirm so the async worker re-publishes it
} else {
 messageTx.Cancel() // publish succeeded: delete message
}

If the DB insert succeeds but MQ publish fails, the message remains in the table and can be retried later by an asynchronous worker.

SQL for inserting a local message:

insert into `tcc_async_task` (`uid`,`name`,`value`,`status`)
values (?,?,?,?)

The approach works well without requiring additional services, but it couples the message table with the business database.
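
The asynchronous worker that retries unsent messages could look roughly like the sketch below. It assumes the two statuses used in the pseudo-code ("try" right after the insert, "confirm" once a publish has failed); MessageTable, RetryConfirmed, and ErrBrokerDown are hypothetical names, and the map stands in for the real table.

```go
package main

import (
	"errors"
	"sync"
)

// ErrBrokerDown simulates a failed publish in this sketch.
var ErrBrokerDown = errors.New("broker unavailable")

// Message mirrors one row of the local message table.
type Message struct {
	UID    string
	Value  string
	Status string // "try" after insert, "confirm" once a publish failed
}

// MessageTable is an in-memory stand-in for the table (assumption).
type MessageTable struct {
	mu   sync.Mutex
	rows map[string]Message
}

// NewMessageTable returns an empty table.
func NewMessageTable() *MessageTable {
	return &MessageTable{rows: make(map[string]Message)}
}

// Put inserts or overwrites a row keyed by UID.
func (t *MessageTable) Put(m Message) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.rows[m.UID] = m
}

// Len returns the number of rows still in the table.
func (t *MessageTable) Len() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return len(t.rows)
}

// RetryConfirmed makes one pass over rows with status "confirm",
// republishes each one, and deletes the rows that were published.
// It returns how many messages were delivered in this pass.
func (t *MessageTable) RetryConfirmed(publish func(Message) error) int {
	t.mu.Lock()
	pending := make([]Message, 0, len(t.rows))
	for _, m := range t.rows {
		if m.Status == "confirm" {
			pending = append(pending, m)
		}
	}
	t.mu.Unlock()

	delivered := 0
	for _, m := range pending {
		if err := publish(m); err != nil {
			continue // leave the row in place for the next pass
		}
		t.mu.Lock()
		delete(t.rows, m.UID)
		t.mu.Unlock()
		delivered++
	}
	return delivered
}
```

A worker would call RetryConfirmed on a timer. Since a crash between a successful publish and the delete leads to the message being sent again, delivery is at-least-once and the consumer still needs to be idempotent.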

MQ Transaction

Some MQ implementations (e.g., RocketMQ) support transactions. The workflow mirrors the local message table: a message is first sent to the MQ, where it stays invisible to consumers (RocketMQ calls this a half message), then Confirm or Cancel is performed based on the outcome of subsequent operations.

MQ transactions share the same pros and cons as the local message table, with the added complexity of limited MQ transactional support and extra latency.
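
The send-then-confirm flow can be sketched with a toy in-memory broker. This is not the RocketMQ client API: Broker, Prepare, Commit, Rollback, and SendInTransaction are hypothetical names chosen only to show the order of operations.

```go
package main

import "errors"

// ErrLocalTx simulates a failed local transaction in this sketch.
var ErrLocalTx = errors.New("local transaction failed")

// Broker is a toy in-memory broker (assumption for this sketch).
type Broker struct {
	half      map[string]string // prepared messages, invisible to consumers
	Delivered []string          // messages that consumers can see
}

// NewBroker returns an empty broker.
func NewBroker() *Broker {
	return &Broker{half: make(map[string]string)}
}

// Prepare stores the message without making it visible.
func (b *Broker) Prepare(id, body string) {
	b.half[id] = body
}

// Commit makes a prepared message visible to consumers.
func (b *Broker) Commit(id string) {
	if body, ok := b.half[id]; ok {
		b.Delivered = append(b.Delivered, body)
		delete(b.half, id)
	}
}

// Rollback discards a prepared message.
func (b *Broker) Rollback(id string) {
	delete(b.half, id)
}

// SendInTransaction prepares the message, runs the local transaction,
// and then commits or rolls back the message based on its result.
func SendInTransaction(b *Broker, id, body string, localTx func() error) error {
	b.Prepare(id, body)
	if err := localTx(); err != nil {
		b.Rollback(id)
		return err
	}
	b.Commit(id)
	return nil
}
```

The sketch leaves out the case where the producer crashes between the local transaction and the commit or rollback. RocketMQ handles that case by having the broker check back with the producer about the state of the local transaction.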

Conclusion

Ensuring data consistency in distributed systems inevitably requires extra mechanisms such as TCC, local message tables, or MQ transactions. TCC offers flexibility and database‑agnostic guarantees but demands substantial implementation effort; mature frameworks like Alibaba's Fescar (since renamed Seata) can reduce this cost. Local message tables are simple and effective for many scenarios but increase coupling with the business DB. MQ‑based transactions provide a service‑level solution but suffer from limited support and added latency.
