Investigation of Duplicate Key Errors After MySQL 5.6→5.7 Upgrade and Auto‑Increment Bugs
The article analyzes why MySQL tables upgraded from 5.6 to 5.7 began generating duplicate‑key errors during inserts, examines related InnoDB auto‑increment mechanisms, reproduces the issue, discusses known bugs (including replace‑into replication problems), and proposes both operational and kernel‑level solutions.
Problem Description
After an important customer's MySQL instances were upgraded from 5.6 to 5.7, duplicate‑key errors appeared during inserts on the master, standby, and read‑only instances. For one table, the auto‑increment value shown by SHOW CREATE TABLE changed from 1758609 before migration to 1758598 after, while the maximum existing id remained 1758609. The tables use the InnoDB engine, and the issue could be temporarily worked around by restarting the server.
Kernel Investigation
Because the problem only occurred on 5.7, the first suspicion was a kernel bug. A search of the official MySQL bug list revealed a similar bug, which is briefly introduced below.
Background 1: InnoDB Auto‑Increment Parameters and Data Structures
Key parameters include innodb_autoinc_lock_mode, auto_increment_increment, and auto_increment_offset. Important structures are the data‑dictionary object that stores the current auto‑increment value and its lock, the transaction object that tracks rows processed within a transaction, and the handler object that holds per‑row iteration information.
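To make the role of auto_increment_increment and auto_increment_offset concrete, here is a minimal Python sketch of how a next value is chosen from the arithmetic sequence offset, offset+step, offset+2*step, ... The function next_autoinc below is a hypothetical simplification for illustration, not the server's actual innobase_next_autoinc, which handles more cases (overflow, multi-value reservations for bulk inserts, etc.):

```python
def next_autoinc(current, need, step, offset, col_max):
    """Toy model: reserve `need` values after `current` from the sequence
    offset, offset+step, offset+2*step, ... capped at the column maximum.
    Simplified illustration only, not the real InnoDB logic."""
    if step == 1 and offset <= 1:
        # Common default case: plain consecutive values.
        nxt = current + need
    elif current < offset:
        nxt = offset + (need - 1) * step
    else:
        # Smallest sequence member strictly greater than `current`,
        # then advance (need - 1) further steps.
        base = offset + ((current - offset) // step + 1) * step
        nxt = base + (need - 1) * step
    return min(nxt, col_max)
```

For example, with step=2 and offset=1 the sequence is 1, 3, 5, 7, ..., so from current=5 the next value handed out is 7.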
Background 2: InnoDB Auto‑Increment Access and Modification Process
(1) When a dict_table_t is cached or evicted, its auto‑increment value is saved to or restored from a global mapping table via dict_table_add_to_cache and dict_table_remove_from_cache_low. (2) row_import and TRUNCATE TABLE update the auto‑increment value. (3) On the first handler open, the engine queries the maximum value of the auto‑increment column and initializes the in‑memory dict_table_t with max+1. (4) Insert flow – the stack that updates the auto‑increment is shown below.
ha_innobase::write_row: in the third step of write_row, calls handler::update_auto_increment to update the auto‑increment value.
handler::update_auto_increment: calls the InnoDB interface to obtain an auto‑increment value, adjusts it according to the current auto_increment‑related variables, and sets the value of the next auto‑increment column this handler will process.
ha_innobase::get_auto_increment: reads the current auto‑increment value from dict_table and, based on the global parameters, writes the next auto‑increment value back into the data dictionary.
ha_innobase::dict_table_autoinc_initialize: updates the auto‑increment value, but only if the supplied value is larger than the current one.
handler::set_next_insert_id: sets the auto‑increment column value of the next row to be processed in the current transaction.
(5) Update row – for statements like INSERT ... ON DUPLICATE KEY UPDATE, the auto‑increment is advanced regardless of whether the unique key already exists. The relevant code is:
if (error == DB_SUCCESS
    && table->next_number_field
    && new_row == table->record[0]
    && thd_sql_command(m_user_thd) == SQLCOM_INSERT
    && trx->duplicates) {

    ulonglong auto_inc;
    ...
    auto_inc = table->next_number_field->val_int();
    auto_inc = innobase_next_autoinc(auto_inc, 1, increment, offset, col_max_value);
    error = innobase_set_max_autoinc(auto_inc);
    ...
}

Bug 76872 / 88321: "InnoDB AUTO_INCREMENT produces same value twice"
(1) Bug summary: When autoinc_lock_mode > 0 and auto_increment_increment > 1, a server restart followed by concurrent inserts can generate duplicate‑key errors.
(2) Cause analysis: After a restart, InnoDB sets the auto‑increment to max(id)+1. The first insert thread obtains this value; the handler then recomputes the next value based on the global parameters, so the same next_id can be assigned to two concurrent threads.
(3) Fix: Ensure the engine’s auto‑increment calculation respects the global parameters so that the first thread receives next_id instead of max(id)+1 , and protect the update with proper locking.
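The race is easier to see in a deterministic toy model. The Python sketch below (a hypothetical simplification, not server code; make_engine and handler_round_up are invented names) mimics the buggy interaction: after a restart the engine side hands out max(id)+1 and advances its counter by only 1, ignoring the step/offset parameters, while the handler layer rounds each value up onto the configured sequence. Two sessions then land on the same id:

```python
def make_engine(max_id):
    """Toy model of the buggy engine side after a restart: the counter is
    initialized to max(id)+1 and advanced by 1 per request, without
    applying auto_increment_increment/offset. Hypothetical simplification
    of bug #76872/#88321."""
    state = {"autoinc": max_id + 1}
    def get_auto_increment():
        v = state["autoinc"]
        state["autoinc"] = v + 1   # buggy: step/offset not applied here
        return v
    return get_auto_increment

def handler_round_up(value, step, offset):
    # Handler layer: round the engine's value up onto the sequence
    # offset, offset+step, offset+2*step, ...
    return offset + ((value - offset + step - 1) // step) * step

step, offset = 2, 1
get_next = make_engine(max_id=11)                # restart: counter = 12
s1 = handler_round_up(get_next(), step, offset)  # engine gives 12, rounded up to 13
s2 = handler_round_up(get_next(), step, offset)  # engine gives 13, stays at 13
# s1 == s2: both sessions try to insert id 13, and the second one fails
# with a duplicate-key error
```

With step=1 and offset=1, as in the production environment here, the rounding step is a no-op and the race cannot produce duplicates, which is why this bug was ruled out.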
The bug only manifests when autoinc_lock_mode > 0 and auto_increment_increment > 1. In the production environment both parameters are set to 1, so this bug can be ruled out.
On‑Site Analysis and Reproduction
Since the official bug did not match the observed behavior, the investigation started from the symptoms.
(1) Examining the pattern of max id and auto‑increment values revealed that the last batch of transactions updated only the auto‑increment column without advancing the global auto‑increment counter. This suggested that an UPDATE on the auto‑increment column had occurred.
The issue was reproduced, and the binlog indeed contained UPDATE operations on the auto‑increment column. Because the binlog was in ROW format, however, it was unclear whether the engine or the user caused the change; the client was consulted and confirmed that no manual updates had been performed.
(2) Further analysis identified three sharded tables, all with auto‑increment primary keys. Two of them (hz_notice_stat_sharding, hz_notice_group_stat_sharding) exhibited the error, while hz_freeze_balance_sharding did not. The differing behavior correlated with the use of REPLACE INTO on the problematic tables versus plain UPDATE on the healthy one.
Bug #87861: "Replace into causes master/slave have different auto_increment offset values"
Root cause: In ROW‑format binlog, REPLACE INTO is logged as an UPDATE event. The master performs a delete‑plus‑insert, correctly advancing the auto‑increment, but the slave replays the event as an update, which changes the column value but does not advance the global counter. When the slave is promoted to master, subsequent inserts encounter duplicate‑key errors.
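The divergence can be sketched with a toy table model in Python (hypothetical simplification for illustration; the Table class and its methods are invented, and the unique secondary key is modeled as the row value). The master executes REPLACE INTO as delete‑plus‑insert and advances its counter; the slave replays the ROW‑format UPDATE event in place and its counter goes stale:

```python
class Table:
    """Toy table with an auto-increment primary key; each row's value
    stands in for a unique secondary key. Simplified illustration of
    bug #87861, not actual server behavior."""
    def __init__(self):
        self.rows = {}       # pk -> unique key value
        self.autoinc = 1     # next primary key to hand out

    def insert(self, uniq):
        pk = self.autoinc
        if pk in self.rows:
            raise KeyError(f"duplicate key {pk}")
        self.rows[pk] = uniq
        self.autoinc = pk + 1
        return pk

master, slave = Table(), Table()
for u in ("a", "b"):                 # ordinary inserts replicate fine
    master.insert(u)
    slave.insert(u)

# REPLACE INTO hits the existing unique key "a" (pk=1):
# the master deletes the old row and inserts a fresh one (pk=3),
# advancing its counter to 4.
del master.rows[1]
master.insert("a")

# The ROW-format binlog ships this as an UPDATE event, so the slave
# rewrites pk 1 -> 3 in place; its counter stays at 3.
slave.rows[3] = slave.rows.pop(1)

# After failover, the promoted slave hands out pk=3 again and the
# insert fails with a duplicate-key error.
```

Running the model, master.autoinc ends at 4 while slave.autoinc is still 3, matching the symptom in the problem description where the counter lagged behind max(id) after migration.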
Solutions
Business‑side options: (1) Change the binlog format to MIXED or STATEMENT. (2) Replace REPLACE INTO with INSERT ... ON DUPLICATE KEY UPDATE.
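Why option (2) helps can be sketched by contrasting the two statements' effect on a conflicting row (toy Python semantics, a deliberate simplification; in the real server the update path of INSERT ... ON DUPLICATE KEY UPDATE also bumps the counter, as the code cited earlier shows):

```python
def replace_into(table, uniq):
    """REPLACE INTO (toy semantics): delete the conflicting row, then
    insert a new one, consuming a fresh auto-increment primary key."""
    for pk, u in list(table["rows"].items()):
        if u == uniq:
            del table["rows"][pk]
    pk = table["autoinc"]
    table["rows"][pk] = uniq
    table["autoinc"] = pk + 1

def upsert(table, uniq):
    """INSERT ... ON DUPLICATE KEY UPDATE (toy semantics): modify the
    existing row in place; its primary key does not change."""
    for u in table["rows"].values():
        if u == uniq:
            return           # nothing to change in this toy model
    pk = table["autoinc"]
    table["rows"][pk] = uniq
    table["autoinc"] = pk + 1

t1 = {"rows": {1: "a"}, "autoinc": 2}
t2 = {"rows": {1: "a"}, "autoinc": 2}
replace_into(t1, "a")   # row "a" moves to pk 2, counter advances
upsert(t2, "a")         # row "a" keeps pk 1
```

Because the upsert leaves the primary key stable, the ROW‑format UPDATE event replayed on the slave describes exactly what the master did, and the master/slave counters cannot drift apart the way they do with REPLACE INTO.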
Kernel‑side options: (1) In ROW format, record a statement‑format log event for REPLACE INTO so the original SQL is preserved. (2) Record the REPLACE INTO as separate delete and insert events in ROW format.
Takeaways
(1) Changing autoinc_lock_mode or auto_increment_increment in production can easily lead to duplicate‑key problems; such parameters should remain stable.
(2) Effective on‑site analysis requires gathering the failure scenario, SQL statements, configuration, binlog format, and data snapshots. Accurate matching with known bugs is essential, but when no bug exists, a systematic investigation can still uncover the root cause.
Tencent Database Technology Team supports internal services such as WeChat Red Packets, lottery, and Data Bank, and provides cloud database products (CDB, CTSDB, CKV, CMongo) externally. The team focuses on enhancing database kernel features, improving performance, ensuring stability, and sharing production‑level problem‑solving knowledge.