Why MySQL Indexes Fail: 9 Common Pitfalls and How to Fix Them
This article enumerates the most frequent situations that cause MySQL indexes to become ineffective—such as violating the left‑most rule, using functions or expressions on indexed columns, type conversions, improper LIKE patterns, OR/IN/NOT IN usage, and inequality operators—and demonstrates how to reproduce and verify each case with sample data and SQL scripts.
In the author’s previous article "Building High‑Performance Indexes (Strategy)" the design of high‑quality indexes was discussed; this follow‑up enumerates every scenario that can cause index failure, helping developers avoid common pitfalls.
2.1 Data Preparation
1. Create two tables: a department table and an employee table.
/* Department table, drop if exists */
DROP TABLE IF EXISTS dep;
CREATE TABLE dep(
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
depno MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
depname VARCHAR(20) NOT NULL DEFAULT "",
memo VARCHAR(200) NOT NULL DEFAULT ""
);
/* Employee table, drop if exists */
DROP TABLE IF EXISTS emp;
CREATE TABLE emp(
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
empno MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
empname VARCHAR(20) NOT NULL DEFAULT "",
job VARCHAR(9) NOT NULL DEFAULT "",
mgr MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
hiredate DATETIME NOT NULL,
sal DECIMAL(7,2) NOT NULL,
comn DECIMAL(7,2) NOT NULL,
depno MEDIUMINT UNSIGNED NOT NULL DEFAULT 0
);2. Create two functions: one to generate a random string and another to generate a random number.
/* Function to generate a random string */
DELIMITER $
DROP FUNCTION IF EXISTS rand_string;
CREATE FUNCTION rand_string(n INT) RETURNS VARCHAR(255)
BEGIN
DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmlopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
DECLARE return_str VARCHAR(255) DEFAULT '';
DECLARE i INT DEFAULT 0;
WHILE i < n DO
SET return_str = CONCAT(return_str, SUBSTRING(chars_str, FLOOR(1+RAND()*52), 1));
SET i = i + 1;
END WHILE;
RETURN return_str;
END $
DELIMITER ;
/* Function to generate a random department number */
DELIMITER $
DROP FUNCTION IF EXISTS rand_num;
CREATE FUNCTION rand_num() RETURNS INT(5)
BEGIN
DECLARE i INT DEFAULT 0;
SET i = FLOOR(RAND()*100 + RAND()*20);
RETURN i;
END $
DELIMITER ;3. Write a stored procedure to insert 5 million employee rows.
/* Procedure to bulk‑insert employee data */
DELIMITER $
DROP PROCEDURE IF EXISTS insert_emp;
CREATE PROCEDURE insert_emp(IN START INT(10), IN max_num INT(10))
BEGIN
DECLARE i INT DEFAULT 0;
SET autocommit = 0;
REPEAT
SET i = i + 1;
INSERT INTO emp(empno, empname, job, mgr, hiredate, sal, comn, depno)
VALUES (START+i, rand_string(6), 'SALEMAN', 0001, NOW(), 2000, 400, rand_num());
UNTIL i = max_num END REPEAT;
COMMIT;
END $
DELIMITER ;
CALL insert_emp(0, 5000000);4. Write a stored procedure to insert 120 department rows.
/* Procedure to bulk‑insert department data */
DELIMITER $
DROP PROCEDURE IF EXISTS insert_dept;
CREATE PROCEDURE insert_dept(IN START INT(10), IN max_num INT(10))
BEGIN
DECLARE i INT DEFAULT 0;
SET autocommit = 0;
REPEAT
SET i = i + 1;
INSERT INTO dep(depno, depname, memo)
VALUES (START+i, rand_string(10), rand_string(8));
UNTIL i = max_num END REPEAT;
COMMIT;
END $
DELIMITER ;
CALL insert_dept(0, 120);5. After the data is loaded, create a composite index on depno, empname and job:
CREATE INDEX idx_depno_empname_job ON emp(depno, empname, job);2.2 Validation Process
In MySQL, a well‑designed index (usually a B+‑tree) dramatically speeds up row retrieval. The default engine InnoDB is used for all examples.
2.2.1 Violation of the Left‑Most Matching Principle
When a composite index is built on (empname, depno, job), MySQL can only use the index if the query predicates start with the left‑most column. Queries that omit empname (e.g., WHERE depno=7 AND job=8) cannot fully benefit from the index.
2.2.2 Using Functions on Indexed Columns
Applying a function to an indexed column (e.g., WHERE ABS(id)=4990001) prevents the optimizer from using the index, resulting in a full table scan.
2.2.3 Expressions Involving Indexed Columns
Expressions such as WHERE id+1=4990001 also invalidate the index because the column participates in a calculation.
2.2.4 Type Conversion (Implicit or Explicit)
Comparing a string column with a numeric constant forces MySQL to convert the string to a number, which disables the index (e.g., WHERE empname=1).
2.2.5 LIKE with a Leading %
A pattern like WHERE empname LIKE '%abc%' cannot use the index, while WHERE empname LIKE 'abc%' can.
2.2.6 OR Conditions with Non‑Indexed Columns
Using OR where at least one side references a column not covered by the index forces a full scan (e.g., WHERE depno=1 OR mgr=1).
2.2.7 IN Clause
When the IN list selects a small fraction of rows (e.g., 100 IDs out of 5 million), the index is used efficiently; however, if the result set exceeds roughly 30 % of the table, the optimizer may prefer a full scan.
2.2.8 NOT IN / NOT EXISTS
The same 30 % rule applies to NOT IN and NOT EXISTS —large result sets cause index loss.
2.2.9 Inequality Operators (!= or <>)
Using inequality operators forces MySQL to examine all rows, effectively disabling the index.
Summary
Common index‑invalidating scenarios include:
Violating the left‑most matching principle
Applying functions to indexed columns
Using expressions that involve indexed columns
Type conversion on indexed columns
LIKE patterns with a leading %
OR conditions where one side is not indexed
IN/NOT IN clauses that return more than 30 % of rows
Inequality operators (!= or <>)
Further articles will discuss index covering and ways to avoid back‑table lookups.
Architecture & Thinking
🍭 Frontline tech director and chief architect at top-tier companies 🥝 Years of deep experience in internet, e‑commerce, social, and finance sectors 🌾 Committed to publishing high‑quality articles covering core technologies of leading internet firms, application architecture, and AI breakthroughs.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
