Why Did My Payment Service Lose Data? Uncovering Hidden Transaction Bugs in Spring

Orders in a payment service appeared to succeed but were never persisted. The failure was traced to a missing transaction commit on a special code path: the uncommitted connection stayed polluted and silently broke subsequent transactions. This article covers the root cause, debugging steps, fix, and preventive measures.

Top Architect

Incident Overview

The online payment service stopped persisting orders: users received a successful payment response, but the order table remained empty and occasional lock‑timeout errors were observed when modifying orders.

Root Cause

A newly deployed business method SomeService.handleSpecialCase() opened a transaction, executed an INSERT, and returned early when a special condition was met. The early return skipped the sqlSession.commit(), leaving the underlying Connection in an active (uncommitted) state.

Spring’s DataSourceTransactionManager obtains a ConnectionHolder from TransactionSynchronizationManager in doGetTransaction(). Because the previous request left the connection marked as active (isTransactionActive() returned true), the same ConnectionHolder was reused for the next request (e.g., PaymentService.createOrder()). The manager therefore treated that request as part of an existing transaction (isExistingTransaction() returned true) and did not start a new one. During commit, processCommit() saw status.isNewTransaction() return false and skipped the real connection.commit(), so the order data never reached the database.

The bug manifested intermittently because TransactionSynchronizationManager stores resources in a ThreadLocal. Requests handled by a clean thread (without a polluted connection) succeeded, while those on a thread that reused the polluted connection failed.

Faulty Code

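import java.sql.SQLException;
import org.springframework.stereotype.Service;

@Service
public class SomeService {
    // Declarations of sqlSession, mapper, data, and specialCondition are omitted, as in the original.
    public void handleSpecialCase() throws SQLException {
        // open a manual transaction by disabling auto-commit
        sqlSession.getConnection().setAutoCommit(false);
        // execute SQL
        mapper.insert(data);
        // BUG: the early return skips commit() and leaves the transaction open
        if (specialCondition) {
            return; // commit missed
        }
        sqlSession.commit();
    }
}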

Fixed Code

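import java.sql.SQLException;
import org.springframework.stereotype.Service;

@Service
public class SomeService {
    // Declarations of sqlSession, mapper, data, and specialCondition are omitted, as in the original.
    public void handleSpecialCase() throws SQLException {
        try {
            sqlSession.getConnection().setAutoCommit(false);
            mapper.insert(data);
            if (specialCondition) {
                sqlSession.commit(); // ensure commit even on the special path
                return;
            }
            sqlSession.commit();
        } catch (Exception e) {
            // undo partial work so the connection is not left mid-transaction
            sqlSession.rollback();
            throw e;
        }
    }
}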

Spring Transaction Flow (Key Points)

getTransaction() calls doGetTransaction(), which retrieves a ConnectionHolder from TransactionSynchronizationManager. isExistingTransaction() returns true when the holder’s isTransactionActive() flag is set.

If an existing transaction is detected, Spring joins it instead of creating a new one.

During processCommit(), the actual connection.commit() is executed only when status.isNewTransaction() is true. A joined transaction therefore skips the commit.
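
The join-or-create decision described above can be illustrated with a toy model. This is a simplified sketch in plain Java, not Spring's actual source: the class TxFlowSketch and its Holder and Status types are hypothetical stand-ins for ConnectionHolder and the transaction status, and the "database" is just a pair of lists.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the join-or-create decision; not Spring's actual source. */
public class TxFlowSketch {

    /** Stand-in for ConnectionHolder: one connection plus an "active" flag. */
    static class Holder {
        boolean transactionActive;                         // mirrors isTransactionActive()
        final List<String> pending = new ArrayList<>();    // rows written but not committed
        final List<String> committed = new ArrayList<>();  // rows visible in the database
    }

    /** Stand-in for the status object returned by getTransaction(). */
    static class Status {
        final Holder holder;
        final boolean newTransaction;                      // mirrors isNewTransaction()

        Status(Holder holder, boolean newTransaction) {
            this.holder = holder;
            this.newTransaction = newTransaction;
        }
    }

    /** Joins when the holder already looks active, otherwise begins a new transaction. */
    static Status getTransaction(Holder holder) {
        if (holder.transactionActive) {
            return new Status(holder, false);              // existing transaction: join it
        }
        holder.transactionActive = true;                   // begin a new transaction
        return new Status(holder, true);
    }

    /** Only the owner of the transaction performs the real commit. */
    static void commit(Status status) {
        if (!status.newTransaction) {
            return;                                        // joined: commit is left to the owner
        }
        status.holder.committed.addAll(status.holder.pending);
        status.holder.pending.clear();
        status.holder.transactionActive = false;
    }

    /** Simulates one createOrder() request and returns the number of committed rows. */
    static int createOrder(Holder holder, String order) {
        Status status = getTransaction(holder);
        holder.pending.add(order);
        commit(status);
        return holder.committed.size();
    }

    public static void main(String[] args) {
        Holder clean = new Holder();
        System.out.println("clean holder: " + createOrder(clean, "order-1") + " committed row(s)");

        Holder polluted = new Holder();
        polluted.transactionActive = true;                 // left behind by the missed commit
        System.out.println("polluted holder: " + createOrder(polluted, "order-2") + " committed row(s)");
    }
}
```

Running main reports one committed row for the clean holder and zero for the polluted one: the request on the polluted holder joins a transaction that no one owns, so nothing ever issues the real commit.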

Why It Occasionally Succeeded

TransactionSynchronizationManager keeps its resources in ThreadLocal variables, so each thread has its own bindings. When a request was processed on a thread that had not previously used the polluted connection, a fresh connection was obtained and the transaction committed normally.
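
The thread dependence can be reproduced with plain JDK classes. The sketch below is only an illustration under assumptions: TX_ACTIVE is a hypothetical stand-in for the thread-bound transaction state, and two single-thread executors play the role of a polluted and a clean request thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Shows thread-bound state surviving from one task to the next on a reused thread. */
public class ThreadLocalLeakSketch {

    // Hypothetical stand-in for the thread-bound "transaction active" marker.
    static final ThreadLocal<Boolean> TX_ACTIVE = ThreadLocal.withInitial(() -> false);

    /** A "request" that sets the marker and never clears it (the missed commit). */
    static void leakyRequest() {
        TX_ACTIVE.set(true);
    }

    /** Reports whether the current thread already carries a leftover marker. */
    static boolean seesLeftoverState() {
        return TX_ACTIVE.get();
    }

    /** Runs the leaky request on one executor, then probes the state on another. */
    static boolean probeAfterLeak(ExecutorService leakOn, ExecutorService probeOn) throws Exception {
        Runnable leak = ThreadLocalLeakSketch::leakyRequest;
        Callable<Boolean> probe = ThreadLocalLeakSketch::seesLeftoverState;
        leakOn.submit(leak).get();          // wait until the leak has happened
        return probeOn.submit(probe).get(); // observe what the next request would see
    }

    public static void main(String[] args) throws Exception {
        ExecutorService polluted = Executors.newSingleThreadExecutor();
        ExecutorService clean = Executors.newSingleThreadExecutor();
        try {
            System.out.println("same thread sees leftover state: " + probeAfterLeak(polluted, polluted));
            System.out.println("other thread sees leftover state: " + probeAfterLeak(polluted, clean));
        } finally {
            polluted.shutdown();
            clean.shutdown();
        }
    }
}
```

The probe returns true on the thread that ran the leaky request and false on the other thread, which matches the intermittent pattern seen in production: whether a request fails depends on which pooled thread serves it.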

Prevention Measures

1. Connection‑Pool Health Checks

spring:
  datasource:
    hikari:
      connection-test-query: SELECT 1
      validation-timeout: 3000
      connection-init-sql: SET autocommit=1

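HikariCP runs connection-init-sql once on each newly created connection, before the connection is added to the pool, so every connection starts with autocommit enabled. It is not re-run on each checkout, so it complements rather than replaces committing or rolling back in application code.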

2. Database‑Level Monitoring

-- Find transactions running longer than 30 seconds
SELECT *
FROM information_schema.innodb_trx
WHERE TIME_TO_SEC(TIMEDIFF(NOW(), trx_started)) > 30;

Alert on long‑running transactions, lock waits, and abnormal connection counts.

3. Explicit Transaction Management

Always make commit() the last step of the try block.

Place rollback() in the catch block.

Close resources in a finally block.
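
The three rules above can be captured in one small helper so that individual business methods cannot skip a step. This is a minimal sketch using only java.sql; the names ManualTx, SqlWork, and inTransaction are hypothetical, and restoring the previous auto-commit setting in the finally block is an added assumption about what "clean" means for a pooled connection. Closing the connection itself is left to the caller, for example with try-with-resources.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Minimal commit/rollback/restore template for hand-managed JDBC transactions. */
public class ManualTx {

    /** Unit of work that may throw SQLException (hypothetical helper type). */
    @FunctionalInterface
    public interface SqlWork {
        void run(Connection connection) throws SQLException;
    }

    /**
     * Runs the work in one transaction: commit is the last step of the try block,
     * rollback happens on failure, and auto-commit is restored in finally.
     */
    public static void inTransaction(Connection connection, SqlWork work) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        try {
            work.run(connection);
            connection.commit();      // single commit point, reached by every successful path
        } catch (SQLException | RuntimeException e) {
            connection.rollback();    // never leave a half-finished transaction behind
            throw e;
        } finally {
            // hand the connection back in the state it arrived in
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

Because the commit lives in the helper, an early return inside the work lambda only leaves the lambda; the commit still runs afterwards. That removes the kind of special-path omission that caused this incident.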

4. Source‑Code Debugging

When encountering obscure behavior, set breakpoints in getTransaction and isExistingTransaction to verify whether a connection is being incorrectly reused.

Takeaways

Connection pools can propagate transaction state bugs across unrelated services.

Missing explicit commit/rollback in manually managed transactions can silently corrupt data.

Application logs may appear normal; database‑level metrics are essential for detecting hidden issues.

Debugging the framework’s transaction code often reveals hidden assumptions about connection reuse.
