Databases 22 min read

Unraveling MySQL InnoDB Deadlocks in Video Conference Sessions

A detailed investigation of intermittent MySQL InnoDB deadlocks occurring during video‑conference session creation, covering log extraction, lock analysis, source‑code debugging, reproducible test cases, and a simple code change that eliminates the deadlock while explaining the underlying locking mechanisms.

dbaplus Community
dbaplus Community
dbaplus Community
Unraveling MySQL InnoDB Deadlocks in Video Conference Sessions

1. Capture Deadlock Information

The issue surfaced as occasional "create meeting" failures. The MySQL error message was:

Deadlock found when trying to get lock; try restarting transaction
Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction

Running SHOW ENGINE INNODB STATUS exported the full deadlock trace, which was sanitized for privacy.

2. Parse the Log

Key observations from the trace:

Transaction 1 (ID 272174) was executing DELETE FROM session … and waited for an X lock on a row in the session table.

Transaction 2 (ID 272193) was executing INSERT INTO session_endpoint … and waited for an insert‑intention lock (gap lock) on the session_endpoint table.

Transaction 2 already held a record lock on the session table.

The database uses InnoDB with the default REPEATABLE‑READ isolation level on MySQL 5.7.

3. Why session_endpoint Blocks

When InnoDB deletes a row it marks it as deleted but keeps the record on the page. If an INSERT later targets the same page, the insert‑intention lock can clash with the lingering record lock. The log shows info bits 32, which corresponds to the deleted‑flag (0x20UL):

// storage/innobase/include/rem0rec.h
#define REC_INFO_DELETED_FLAG 0x20UL  /* record has been delete‑marked */

This proves that Transaction 2 tried to insert exactly where Transaction 1 had just marked a row as deleted, causing the lock conflict.

4. Why session Blocks

Transaction 2 also inserted a new session row (ID 965b1face74948039ab0dd8da6daa71d). While Transaction 1 was batch‑deleting sessions, it eventually reached a row that was still locked by Transaction 2, so the delete waited, creating the second part of the deadlock cycle.

5. Reproduce and Debug

To verify the hypothesis, the following DDL and test data were used:

CREATE TABLE `session` (
  `id` binary(16) NOT NULL,
  `code` varchar(64) DEFAULT NULL,
  `topic` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `code_idx` (`code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `session_endpoint` (
  `id` binary(16) NOT NULL,
  `nickname` varchar(100) DEFAULT NULL,
  `session_id` binary(16) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `session_id_idx` (`session_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Sample inserts populated both tables with binary UUIDs. The isolation level and version were confirmed with:

SELECT @@tx_isolation;
SELECT @@version;

Two concurrent transactions were scripted: T2 inserted a new session record, then T1 performed the batch delete. The delete blocked on the third step, matching the original deadlock.

Debugging was performed by setting breakpoints in the MySQL source (e.g., storage/innobase/lock/lock0lock.cc, sql/sql_delete.cc, row_search_mvcc). Screenshots of the debugger showed the lock mode value mode = LOCK_X | LOCK_ORDINARY (value 3) and the exact point where lock_rec_lock_slow waited for the conflicting lock.

Deadlock log excerpt
Deadlock log excerpt
SQL blocked screenshot
SQL blocked screenshot

6. Fix the Deadlock

The root cause was the inconsistent order of operations on the two tables. Swapping the delete order resolves the circular wait:

@Transactional(rollbackFor = Exception.class)
public void clearSessions(List<UUID> sessionIds) {
    sessionRepository.deleteSessionByIds(sessionIds);
    sessionEndpointRepository.deleteSessionEndpointBySessionIds(sessionIds);
}

After the change, automated stress tests that create and clean up thousands of empty meetings no longer produce deadlocks.

7. Remaining Questions

Why does InnoDB continue scanning past the last targeted row instead of exiting the loop immediately?

Why does the engine read the next record after the last one and then another unrelated record?

What exactly causes the engine to encounter an uncommitted insert from another transaction during the scan?

These nuances are tied to InnoDB’s internal MVCC and page‑level lock handling and merit further source‑code investigation.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Lock AnalysisdatabasedeadlockInnoDBmysql
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.