Phantom Read Verification and Analysis in InnoDB REPEATABLE‑READ Isolation
This article investigates whether the REPEATABLE‑READ isolation level in InnoDB fully prevents phantom reads by conducting a series of current‑read and snapshot‑read experiments, analyzing transaction behavior, MVCC mechanisms, and offering recommendations to avoid phantom anomalies.
Background
InnoDB’s default transaction isolation level is REPEATABLE‑READ, which uses both lock‑based concurrency control (LBCC) for current reads and multi‑version concurrency control (MVCC) for snapshot reads. The article questions whether phantom reads are completely eliminated under this level.
Verification
2.1 Preparation
Environment Information
MySQL version: 5.6.36
Storage engine: InnoDB
Isolation level: REPEATABLE-READData Preparation
A test table user_info is created and three initial rows are inserted.
CREATE TABLE `user_info` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'primary key',
`name` VARCHAR(32) NOT NULL DEFAULT '' COMMENT 'name',
`gender` VARCHAR(32) NOT NULL DEFAULT '' COMMENT 'gender',
`email` VARCHAR(32) NOT NULL DEFAULT '' COMMENT 'email',
PRIMARY KEY (`id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8mb4 COMMENT='user info table';
INSERT INTO `user_info` (`id`,`name`,`gender`,`email`) VALUES
(1,'Curry','男','[email protected]'),
(2,'Wade','男','[email protected]'),
(3,'James','男','[email protected]');
COMMIT;2.2 Current‑Read Scenario
The test uses SELECT ... LOCK IN SHARE MODE to perform a current read.
START TRANSACTION; (Tx1)
SELECT * FROM user_info LOCK IN SHARE MODE;
START TRANSACTION; (Tx2)
INSERT a new row into user_info (Tx2).
COMMIT Tx2 (fails with lock‑wait timeout, so the insert is rolled back).
SELECT * FROM user_info LOCK IN SHARE MODE; (Tx1 again)
COMMIT Tx1.
Result: Tx2’s insert is blocked and rolled back, so Tx1 does not see the new row – no phantom read occurs.
Summary of Current‑Read
Because the gap lock (temporary key lock) blocks inserts, LBCC fully prevents phantom reads in this scenario.
2.3 Snapshot‑Read Scenario
Three sub‑scenarios are executed using plain SELECT (snapshot reads) combined with updates.
Scenario 1
START TRANSACTION; (Tx1)
SELECT * FROM user_info;
START TRANSACTION; (Tx2)
INSERT a new row (Tx2).
COMMIT Tx2.
SELECT * FROM user_info; (Tx1 again)
COMMIT Tx1.
Result: Tx1 sees the row inserted by Tx2 because the snapshot view was created before Tx1’s first read and MVCC makes the new row visible after Tx2 commits – no phantom read.
Scenario 2
START TRANSACTION; (Tx1)
SELECT * FROM user_info;
START TRANSACTION; (Tx2)
INSERT a new row (Tx2).
COMMIT Tx2.
UPDATE user_info SET name='Iversen' WHERE id=1; (Tx1)
SELECT * FROM user_info; (Tx1 again)
COMMIT Tx1.
Result: The update does not touch the row inserted by Tx2, so Tx1’s second read still does not see the new row – no phantom read.
Scenario 3
START TRANSACTION; (Tx1)
SELECT * FROM user_info;
START TRANSACTION; (Tx2)
INSERT a new row (Tx2).
COMMIT Tx2.
UPDATE user_info SET email=REPLACE(email,'@163.com','@gmail.com'); (Tx1)
SELECT * FROM user_info; (Tx1 again)
COMMIT Tx1.
Result: After the bulk update, Tx1 sees the row inserted by Tx2, demonstrating a phantom read caused by the update touching the newly inserted row.
Summary of Snapshot‑Read
Under REPEATABLE‑READ, phantom reads can occur when a transaction performs an update between two snapshot reads and the update accesses rows inserted by another transaction.
Analysis
The current‑read (LBCC) uses gap locks to block inserts into the range being read, thus preventing phantom rows. Snapshot‑read relies on MVCC; its read view is fixed at the first read, but subsequent updates can make previously invisible rows visible, leading to phantom reads. The article explains the MVCC visibility algorithm and shows how undo logs and transaction IDs affect visibility.
Conclusion
Current‑read with lock‑based concurrency fully avoids phantom reads.
Snapshot‑read may produce phantom reads when an update accesses rows inserted by another transaction.
To avoid phantom reads: use serializable isolation (not recommended), design application logic to prevent the problematic pattern, or explicitly use current‑read locking where appropriate.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.