Why Seata’s Global Locks Can Kill High‑QPS Services—and What to Do Instead

The author recounts 18 months of using Seata for distributed transactions, explains its three‑role architecture and AT mode, details how global locks caused severe latency and deadlocks under load, and shows how switching to a transactional outbox pattern restored performance and eliminated the undo_log bloat.

How Seata Works

Seata consists of three roles: the TM (Transaction Manager), an RM (Resource Manager) embedded in each service, and the TC (Transaction Coordinator), a separately deployed server. In AT mode, Seata's data-source proxy captures before- and after-images of the rows each SQL statement modifies and stores them in an undo_log table; on rollback, the before-image is replayed to restore the data.
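The undo_log mechanism can be illustrated with a minimal in-memory sketch (the class and method names here are illustrative, not Seata's actual API): before an update runs, the row's current state is saved as a before-image; a global rollback replays that image.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of AT-mode undo logging (illustrative, not Seata's real classes).
public class UndoLogSketch {
    private final Map<String, Integer> table = new HashMap<>();   // rowKey -> value
    private final Map<String, Integer> undoLog = new HashMap<>(); // rowKey -> before-image

    public UndoLogSketch(String rowKey, int initial) {
        table.put(rowKey, initial);
    }

    // Phase one: capture the before-image, then apply the business update.
    public void update(String rowKey, int newValue) {
        undoLog.put(rowKey, table.get(rowKey)); // snapshot before the "SQL" runs
        table.put(rowKey, newValue);
    }

    // Global rollback: restore every touched row from its before-image.
    public void rollback() {
        table.putAll(undoLog);
        undoLog.clear();
    }

    public int read(String rowKey) { return table.get(rowKey); }
}
```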

During the two‑phase commit, phase one commits the local transaction immediately but registers a global lock on each modified row with the TC; that lock is released only when the whole global transaction commits or rolls back.
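Conceptually, the TC keeps a row-key-to-transaction-id lock table: a second global transaction touching the same row must wait or fail until the holder completes. A minimal sketch of that idea (the names are illustrative, not Seata's implementation):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch of a TC-side global lock table.
public class GlobalLockTable {
    // rowKey -> xid of the global transaction currently holding the lock
    private final ConcurrentMap<String, String> locks = new ConcurrentHashMap<>();

    // Acquire succeeds if the row is free, or already held by the same transaction.
    public boolean tryAcquire(String rowKey, String xid) {
        return xid.equals(locks.computeIfAbsent(rowKey, k -> xid));
    }

    // Released only at global commit/rollback -- this is why hot rows serialize.
    public void release(String rowKey, String xid) {
        locks.remove(rowKey, xid);
    }
}
```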

First Pain: Load Test

During a pre‑release full‑stack load test the order service reached 500 QPS. The DBA reported a flood of database lock waits. Monitoring showed the P99 latency jumped from 80 ms to 640 ms.

Investigation revealed that 500 concurrent requests all tried to lock the same inventory row, each also incurring an extra ~20 ms round‑trip to the TC. The global lock became the bottleneck.

Performance data collected:

QPS 100 – P99 82 ms – no DB lock – ✅ normal

QPS 300 – P99 160 ms – occasional lock – ⚠️ flaky

QPS 500 – P99 640 ms – heavy lock contention – ❌ alarm

QPS 800 – timeout – deadlock – ❌ circuit‑break
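These numbers are roughly consistent with simple serialization math: if every write to the hot row holds the global lock for about the ~20 ms TC round‑trip observed above, the row tops out near 1000/20 = 50 writes per second, and at 500 QPS the waiting line grows until requests time out. A back-of-envelope check (figures taken from the investigation, worst case assumed):

```java
// Back-of-envelope: throughput ceiling of a single hot row under a global lock.
public class HotRowMath {
    // Max transactions per second when each transaction holds the row lock for holdMillis.
    public static double maxTps(double holdMillis) {
        return 1000.0 / holdMillis;
    }

    // Worst-case queueing delay for the last of `concurrent` requests:
    // everyone ahead of it must finish first, one lock-hold at a time.
    public static double worstCaseWaitMillis(int concurrent, double holdMillis) {
        return (concurrent - 1) * holdMillis;
    }
}
```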

From One Pit to Another

To avoid the global lock the team switched to TCC mode, which eliminates the lock but requires implementing three interfaces (Try, Confirm, Cancel) for every business operation. The author ran into "empty rollback" cases, where Cancel is invoked even though Try never executed, and "suspension" cases, where a delayed Try arrives after Cancel has already run, and had to handle these plus idempotency across 17 interfaces.
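Guarding against empty rollback and suspension typically means recording per-transaction state, so that a Cancel arriving before Try becomes a no-op, and a late Try after Cancel is rejected. A minimal in-memory sketch of one participant (class and method names are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative TCC participant with empty-rollback and anti-suspension guards.
public class TccParticipant {
    enum State { TRIED, CANCELLED }
    private final ConcurrentMap<String, State> log = new ConcurrentHashMap<>();

    // Try: fresh call or idempotent retry succeeds; rejected if Cancel already ran.
    public boolean tryReserve(String xid) {
        State prev = log.putIfAbsent(xid, State.TRIED);
        // prev == null: fresh Try; prev == TRIED: retry; prev == CANCELLED: suspended.
        return prev != State.CANCELLED;
    }

    // Cancel: records CANCELLED even if Try never ran (empty rollback),
    // which blocks any late-arriving Try for this xid.
    public void cancel(String xid) {
        log.put(xid, State.CANCELLED); // real resource release would go here
    }

    // Confirm: only confirm what was actually tried and not cancelled.
    public boolean confirm(String xid) {
        return log.get(xid) == State.TRIED;
    }
}
```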

Realising that TCC re‑introduced invasive code, the author concluded that the original promise of a non‑intrusive framework was broken.

TC – A Hidden Single Point of Failure

The TC is a separately deployed coordinator that every global transaction must contact. One afternoon the TC cluster experienced a 30‑second network glitch, causing all cross‑service transactions to hang. The order API timed out, connection pools filled, and the outage lasted nearly four minutes.

This incident highlighted that introducing a global coordinator places a shared dependency on the critical path of every cross‑service transaction: when the TC is unreachable, all of them stall with it.

The Decisive Screenshot

The DBA later sent a screenshot showing the undo_log table had grown to 18 GB and kept growing, because high‑concurrency AT mode generated massive lock‑protected snapshots that the cleanup task could not keep up with.

Switching to Transactional Outbox

After removing Seata, the team adopted a local outbox table, a message queue (RocketMQ), and idempotent consumers. The pattern writes the business record and a pending message into the same local transaction; on commit both rows exist, on rollback both disappear.
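The key property of the write path is atomicity of the two inserts. With Spring/JDBC this is simply two INSERTs inside one transactional method; the sketch below simulates it with an in-memory "database" whose commit applies both rows or neither (all names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: business row and outbox row committed in one local transaction.
public class OutboxWriter {
    public final List<String> orders = new ArrayList<>(); // stands in for the orders table
    public final List<String> outbox = new ArrayList<>(); // stands in for the outbox table

    // Both inserts succeed together, or the whole "transaction" rolls back.
    public boolean createOrder(String orderId, boolean failMidway) {
        List<String> ordersCopy = new ArrayList<>(orders); // staged, uncommitted writes
        List<String> outboxCopy = new ArrayList<>(outbox);
        ordersCopy.add(orderId);                           // INSERT INTO orders ...
        if (failMidway) {
            return false;                                  // rollback: staged copies discarded
        }
        outboxCopy.add("OrderCreated:" + orderId);         // INSERT INTO outbox ...
        orders.clear(); orders.addAll(ordersCopy);         // commit both together
        outbox.clear(); outbox.addAll(outboxCopy);
        return true;
    }
}
```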

An independent process polls the outbox table and publishes messages to the queue. Consumers handle retries, dead‑letter queues, and alerts.
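The relay and consumer side can be sketched the same way: the poller drains pending outbox rows into the queue and marks them sent, and the consumer deduplicates by message id so at-least-once redelivery is harmless (names are illustrative; an in-memory queue stands in for RocketMQ):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Illustrative outbox relay: publish pending rows, then mark them sent.
public class OutboxRelay {
    public final List<String> pendingOutbox = new ArrayList<>(); // rows with status = PENDING
    public final Queue<String> mq = new ArrayDeque<>();          // stands in for RocketMQ

    // One iteration of the poll loop.
    public void pollOnce() {
        for (String msg : new ArrayList<>(pendingOutbox)) {
            mq.offer(msg);             // on publish failure the row would stay pending and retry
            pendingOutbox.remove(msg); // mark sent (here: remove)
        }
    }
}

// Consumer that tolerates duplicate delivery via a processed-id set.
class IdempotentConsumer {
    private final Set<String> processed = new HashSet<>();
    public int handled = 0;

    public void onMessage(String messageId) {
        if (!processed.add(messageId)) {
            return;  // duplicate delivery: skip
        }
        handled++;   // business handling happens exactly once per message id
    }
}
```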

Within a month the order service P99 latency dropped from 640 ms to 68 ms, CPU peak halved, and the undo_log table was dropped entirely.

When to Use Seata

Seata is suitable when:

Concurrency is low (peak QPS < 100)

Strong consistency is mandatory (e.g., financial accounting, ticket verification)

The team has dedicated staff to maintain the TC cluster

For high‑concurrency internet services (QPS 500+), hot‑spot data (flash sales, popular inventory), or small fast‑moving teams without dedicated TC maintenance, a message‑driven outbox approach is recommended.

This solution avoids complex frameworks, keeps the architecture simple, and is easy for new developers to understand.

Tags: performance, microservices, distributed transactions, Seata, transactional outbox
Written by Java Web Project

Focused on Java backend technologies, trending internet tech, and the latest industry developments. The platform serves over 200,000 Java developers.