6 Critical MySQL Pitfalls and How to Avoid Them
Learn how to prevent six frequent MySQL problems—including index misuse, transaction isolation anomalies, inefficient pagination, charset and collation errors, risky foreign-key cascades, and misconfigured connection pools—through detailed explanations, code examples, and practical mitigation strategies.
Introduction
MySQL is easy to start with, but as data grows, hidden performance and consistency problems appear. This article collects six common MySQL pitfalls, each with concrete examples, analysis, and practical avoidance guidelines.
Pitfall 1 – Index Failure
Why indexes become ineffective
The optimizer may choose a full-table scan when it estimates that using an index would cost more, and some predicate shapes make an index unusable altogether. The patterns below are the most common causes.
Typical scenarios
-- Create a test table with indexes
CREATE TABLE user (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(50),
age INT,
email VARCHAR(100),
created_time DATETIME,
INDEX idx_name (name),
INDEX idx_age (age),
INDEX idx_created_time (created_time)
);
-- 1. Function on indexed column (wrong)
EXPLAIN SELECT * FROM user WHERE DATE(created_time) = '2023-01-01';
-- Correct: range query
EXPLAIN SELECT * FROM user WHERE created_time >= '2023-01-01 00:00:00'
AND created_time < '2023-01-02 00:00:00';
-- 2. Implicit type conversion (wrong)
EXPLAIN SELECT * FROM user WHERE name = 123;
-- Correct: match types
EXPLAIN SELECT * FROM user WHERE name = '123';
-- 3. Leading wildcard LIKE (wrong)
EXPLAIN SELECT * FROM user WHERE name LIKE '%三%';
-- Correct: non‑leading wildcard
EXPLAIN SELECT * FROM user WHERE name LIKE '苏%';
-- 4. OR condition mixing indexed and non-indexed columns (wrong)
EXPLAIN SELECT * FROM user WHERE age = 25 OR email = '[email protected]';
-- Better: give email its own index, then rewrite with UNION so each branch can use an index
ALTER TABLE user ADD INDEX idx_email (email);
EXPLAIN SELECT * FROM user WHERE age = 25
UNION
SELECT * FROM user WHERE email = '[email protected]';
Deep analysis
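Function on indexed column – the index stores created_time values in sorted order, not DATE(created_time) values, so the B-tree cannot be searched for the function result and every row must be evaluated. (MySQL 8.0.13 and later also support functional indexes as an alternative.)
Implicit type conversion – when a string column is compared with a number, MySQL converts each column value to a number before comparing, so the index built on the string values cannot be used for the lookup.
Leading wildcard – a B-tree index is ordered by prefix, so a pattern that starts with % gives it no starting point and forces a full scan.
OR condition – if any branch of the OR has no usable index, MySQL has to scan the whole table anyway and the other index brings no benefit. Splitting the query with UNION helps only when every branch has its own index; in the example above, email must be indexed as well.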
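To make the implicit-conversion point concrete, it helps to compare the two orderings involved. The sketch below is plain Java, not MySQL internals; the class name `OrderingDemo` and the sample values are made up for illustration. It sorts the same values once as strings (roughly the order an index on a VARCHAR column keeps them in) and once as numbers (the order a numeric comparison would need).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustration only: contrasts string ordering with numeric ordering. */
final class OrderingDemo {
    /** Simplified view of the order an index on a string column keeps. */
    static List<String> asStrings(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    /** Order that a numeric comparison such as name = 123 would need. */
    static List<String> asNumbers(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.comparingInt((String s) -> Integer.parseInt(s)));
        return copy;
    }
}
```

Because the two orders disagree, and because several different strings (for example '123' and '0123') compare equal to the number 123, a sorted string index offers no single position to seek to for a numeric literal.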
Avoidance guide
Avoid applying functions to indexed columns; use range predicates instead.
Make sure query literals match column data types.
Prefer patterns without a leading wildcard (e.g., LIKE 'prefix%').
Index every column that appears in an OR condition, then rewrite the condition as separate queries combined with UNION (or UNION ALL when the branches cannot return the same row).
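On the application side, the range-predicate advice usually means computing both bounds before the query is issued. The following is a minimal sketch, assuming the caller works with `java.time` types; the class name `DayRange` is hypothetical and not part of any library.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

/** Half-open range [start, endExclusive) covering one calendar day. */
final class DayRange {
    final LocalDateTime start;
    final LocalDateTime endExclusive;

    private DayRange(LocalDateTime start, LocalDateTime endExclusive) {
        this.start = start;
        this.endExclusive = endExclusive;
    }

    /** Builds the bounds to bind to: created_time >= ? AND created_time < ? */
    static DayRange of(LocalDate day) {
        return new DayRange(day.atStartOfDay(), day.plusDays(1).atStartOfDay());
    }
}
```

The two values can then be bound as parameters, for example via `Timestamp.valueOf(range.start)`, so the indexed column itself is never wrapped in a function.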
Pitfall 2 – Transaction Isolation and Phantom Reads
Why isolation level matters
Different isolation levels trade off consistency, performance, and concurrency. Choosing the wrong level can cause dirty reads, non‑repeatable reads, or phantom reads.
Example
-- Check current isolation level
SELECT @@transaction_isolation;
-- Set REPEATABLE‑READ (default)
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Session A – start transaction and count rows
START TRANSACTION;
SELECT COUNT(*) FROM account WHERE user_id = 1001; -- returns 2
-- Session B – insert a new row and commit
START TRANSACTION;
INSERT INTO account (user_id, balance) VALUES (1001, 500);
COMMIT;
-- Session A continues – the previous SELECT still sees 2 rows
SELECT COUNT(*) FROM account WHERE user_id = 1001; -- still 2
UPDATE account SET balance = balance + 100 WHERE user_id = 1001; -- affects 3 rows
SELECT COUNT(*) FROM account WHERE user_id = 1001; -- now returns 3 (phantom)
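COMMIT;
Deep analysis
Under REPEATABLE READ, a plain SELECT reads from a consistent snapshot established by the transaction's first read, so the row inserted by session B stays invisible. UPDATE and DELETE are current reads: they operate on the latest committed rows, so the UPDATE also modifies session B's row. Once session A has modified that row, it becomes visible to session A's own later SELECTs, which is why the count changes from 2 to 3: a phantom read.
InnoDB's next-key (record plus gap) locks do prevent other transactions from inserting into a scanned range, but only locking reads (SELECT … FOR UPDATE, SELECT … LOCK IN SHARE MODE) and UPDATE / DELETE statements take them; plain snapshot SELECTs take no locks.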
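The sequence of counts (2, 2, 3 rows updated, 3) can be reproduced with a toy model. The sketch below is a deliberate simplification and not how InnoDB is implemented: each row only remembers the id of the transaction that last wrote it, and all names (`SnapshotDemo`, `snapshotCount`, `update`) as well as the transaction ids are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of snapshot reads versus current reads; not InnoDB's real MVCC. */
final class SnapshotDemo {
    /** A committed row tagged with the id of the transaction that last wrote it. */
    static final class Row {
        final int userId;
        long balance;
        long lastWriterTx;

        Row(int userId, long balance, long lastWriterTx) {
            this.userId = userId;
            this.balance = balance;
            this.lastWriterTx = lastWriterTx;
        }
    }

    final List<Row> committedRows = new ArrayList<>();

    /** Snapshot read: sees rows written up to the snapshot, plus the reader's own writes. */
    long snapshotCount(int userId, long snapshotTx, long ownTx) {
        return committedRows.stream()
                .filter(r -> r.userId == userId)
                .filter(r -> r.lastWriterTx <= snapshotTx || r.lastWriterTx == ownTx)
                .count();
    }

    /** Current read: an UPDATE touches every committed row and stamps it as its own. */
    int update(int userId, long delta, long ownTx) {
        int affected = 0;
        for (Row r : committedRows) {
            if (r.userId == userId) {
                r.balance += delta;
                r.lastWriterTx = ownTx;
                affected++;
            }
        }
        return affected;
    }
}
```

In this model, the row inserted by a later transaction is hidden from the snapshot until the reading transaction updates it; from then on it counts as the reader's own write and shows up in subsequent snapshot reads.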
Avoidance guide
Understand the guarantees of each isolation level.
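Use REPEATABLE READ for most workloads; when a transaction must not see phantoms, read the range with SELECT … FOR UPDATE so that it is locked, and switch to SERIALIZABLE only when absolute consistency is required.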
Keep transactions short and avoid long‑running scans that hold gap locks.
Pitfall 3 – Pagination on Large Datasets
Why traditional OFFSET pagination slows down
Using LIMIT offset, size with a large offset forces MySQL to read and discard every skipped row, causing heavy I/O, extra lookups from the secondary index back to the clustered index, and possibly disk-based sorting.
Example
-- Table with 10 million rows
CREATE TABLE `order` (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
user_id INT,
amount DECIMAL(10,2),
status TINYINT,
created_time DATETIME,
INDEX idx_created_time (created_time)
);
-- Traditional pagination (slow when offset = 5 000 000)
EXPLAIN SELECT * FROM `order`
ORDER BY created_time DESC
LIMIT 5000000, 20;
-- Cursor (keyset) pagination – first page
SELECT * FROM `order`
ORDER BY created_time DESC, id DESC
LIMIT 20;
-- Subsequent page – use last row values
SELECT * FROM `order`
WHERE created_time < '2023-06-01 10:00:00'
OR (created_time = '2023-06-01 10:00:00' AND id < 1000000)
ORDER BY created_time DESC, id DESC
LIMIT 20;
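-- Deferred join (when cursor pagination is not possible):
-- the subquery skips the offset using only the index, then the full rows are fetched by primary key
SELECT o.* FROM `order` AS o
JOIN (
    SELECT id FROM `order`
    ORDER BY created_time DESC, id DESC
    LIMIT 5000000, 20
) AS page ON o.id = page.id
ORDER BY o.created_time DESC, o.id DESC;
Deep analysis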
Large offsets cause MySQL to scan and discard millions of rows – massive useless I/O.
If the index does not cover all selected columns, each row requires an extra lookup in the clustered index (a "back-to-table" lookup).
Sorting a huge result set may spill to disk, further degrading performance.
Advantages of cursor (keyset) pagination
Directly jumps to the start position without scanning the preceding rows.
Leverages the index ordering, eliminating extra sort work.
Performance remains stable as the table grows.
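The cursor predicate is easy to get wrong when several rows share the same timestamp, so here is a small in-memory sketch of it. It assumes the same ordering as the SQL above (created_time DESC, id DESC); the class `KeysetPaging`, its `Order` type, and the sample data are hypothetical, and a real implementation would run the predicate in SQL rather than over a list.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** In-memory illustration of the keyset predicate; not a database client. */
final class KeysetPaging {
    static final class Order {
        final long id;
        final LocalDateTime createdTime;

        Order(long id, LocalDateTime createdTime) {
            this.id = id;
            this.createdTime = createdTime;
        }
    }

    /** Same ordering as ORDER BY created_time DESC, id DESC. */
    static final Comparator<Order> NEWEST_FIRST =
            Comparator.comparing((Order o) -> o.createdTime)
                    .thenComparingLong(o -> o.id)
                    .reversed();

    /** Returns up to size rows that come strictly after the cursor (lastTime, lastId). */
    static List<Order> pageAfter(List<Order> table, LocalDateTime lastTime, long lastId, int size) {
        return table.stream()
                .filter(o -> o.createdTime.isBefore(lastTime)
                        || (o.createdTime.isEqual(lastTime) && o.id < lastId))
                .sorted(NEWEST_FIRST)
                .limit(size)
                .collect(Collectors.toList());
    }
}
```

The id tie-breaker is what keeps rows with identical timestamps from being skipped or repeated between pages; a cursor on created_time alone would lose them.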
Avoidance guide
Prefer cursor‑based (keyset) pagination using the last row’s timestamp or primary‑key.
If OFFSET must be used, apply it in a subquery that reads only the index (a deferred join), then fetch the full rows by primary key.
Ensure the ORDER BY columns are indexed (preferably a composite index covering both columns).
Pitfall 4 – Charset and Collation Traps
Why charset matters
MySQL’s utf8 character set stores only up to three bytes per character, which cannot represent 4‑byte Unicode symbols such as emoji. The utf8mb4 charset is the true UTF‑8 implementation. Collations control case‑sensitivity and sorting order.
Example
-- Show current charset variables
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
-- Wrong: using utf8 (max 3 bytes) – emoji insertion fails
CREATE TABLE user_utf8 (
id INT PRIMARY KEY,
name VARCHAR(50) CHARACTER SET utf8
);
INSERT INTO user_utf8 VALUES (1, '张三😊'); -- fails with "Incorrect string value" in strict SQL mode
-- Correct: use utf8mb4 and a suitable collation
CREATE TABLE user_utf8mb4 (
id INT PRIMARY KEY,
name VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
);
INSERT INTO user_utf8mb4 VALUES (1, '张三😊'); -- succeeds
-- Collation impact on case sensitivity
CREATE TABLE product (
id INT PRIMARY KEY,
name VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci
);
SELECT * FROM product WHERE name = 'apple'; -- matches Apple, APPLE
SELECT * FROM product WHERE name = BINARY 'apple'; -- case-sensitive match only
Deep analysis
In MySQL, utf8 (an alias for utf8mb3) is a "pseudo-UTF-8" limited to three bytes per character; it cannot store 4-byte characters. utf8mb4 stores the full Unicode range and should be the default for new schemas.
Collations ending with _ci are case-insensitive, those ending with _cs are case-sensitive, and those ending with _bin compare by binary code value.
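The three-byte limit can be checked from application code before data reaches a legacy utf8 column. This is a minimal sketch using only the Java standard library; the class and method names (`Utf8Check`, `needsUtf8mb4`, `utf8Length`) are hypothetical helpers, not part of any MySQL driver.

```java
import java.nio.charset.StandardCharsets;

/** Helpers for spotting text that a 3-byte utf8 column cannot store. */
final class Utf8Check {
    /** True if the text contains a code point outside the BMP, which needs 4 bytes in UTF-8. */
    static boolean needsUtf8mb4(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    /** Number of bytes the text occupies when encoded as real UTF-8. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

For the sample value used above, the two CJK characters take three bytes each and the emoji takes four, so only the emoji triggers the check.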
Avoidance guide
Define databases, tables, and columns with utf8mb4.
Choose collations that match required sorting and case‑sensitivity.
Keep charset and collation consistent across schema and client connections.
When migrating existing data, convert tables to utf8mb4 carefully (e.g.,
ALTER TABLE … CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;).
Pitfall 5 – Foreign Keys and Cascading Operations
Why foreign keys are a double‑edged sword
Foreign keys guarantee referential integrity but introduce additional locking, lock contention, and can cause accidental massive data loss when ON DELETE/UPDATE CASCADE is used.
Example
-- Parent table
CREATE TABLE department (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(50) NOT NULL
);
-- Child table with cascade rules
CREATE TABLE employee (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(50) NOT NULL,
department_id INT,
FOREIGN KEY (department_id) REFERENCES department(id)
ON DELETE CASCADE
ON UPDATE CASCADE
);
-- Dangerous cascade delete
DELETE FROM department WHERE id = 1; -- deletes all employees of department 1
-- Lock contention example
START TRANSACTION;
DELETE FROM department WHERE id = 1; -- holds lock on the parent row
-- In another session
START TRANSACTION;
INSERT INTO employee (name, department_id) VALUES ('New Employee', 1); -- blocked until the first transaction ends; fails with a foreign-key error if the delete commits
COMMIT;
Deep analysis
Foreign-key checks take shared row locks on the referenced parent rows in addition to the locks on the child rows, which widens the lock scope.
Cascading deletes/updates can remove or modify large amounts of data unintentionally.
Complex foreign‑key graphs increase the risk of deadlocks.
Bulk data loads must respect dependency order; otherwise constraints fail.
Avoidance guide
In high‑concurrency systems, consider enforcing referential integrity at the application layer.
Avoid ON DELETE/UPDATE CASCADE; use explicit business logic or soft‑delete flags.
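Temporarily disable foreign-key checks (SET FOREIGN_KEY_CHECKS=0) during bulk imports, then re-enable them. Re-enabling does not validate rows loaded while the checks were off, so verify the imported data separately.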
Design schemas with minimal cascading relationships and keep foreign‑key graphs shallow.
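As one possible shape for the "explicit business logic or soft-delete" advice, the sketch below refuses to delete a department that is still referenced instead of cascading. It is an in-memory stand-in with invented names (`DepartmentService`, `softDelete`); a real service would run the check and the update inside one database transaction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for the department/employee tables; illustration only. */
final class DepartmentService {
    private final Set<Integer> activeDepartments = new HashSet<>();
    private final Map<Integer, Integer> employeeCountByDepartment = new HashMap<>();

    void addDepartment(int id) {
        activeDepartments.add(id);
    }

    /** Application-level integrity check: employees may only join an active department. */
    void addEmployee(int departmentId) {
        if (!activeDepartments.contains(departmentId)) {
            throw new IllegalArgumentException("unknown or deleted department " + departmentId);
        }
        employeeCountByDepartment.merge(departmentId, 1, Integer::sum);
    }

    /** Soft-deletes a department, but refuses while employees still reference it. */
    boolean softDelete(int id) {
        if (employeeCountByDepartment.getOrDefault(id, 0) > 0) {
            return false; // caller must reassign or remove the employees explicitly
        }
        return activeDepartments.remove(id);
    }
}
```

The point of the design is that removing child data becomes a visible, deliberate step in the code path rather than a side effect of deleting one parent row.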
Pitfall 6 – Connection‑Pool Misconfiguration
Why the pool is critical
Database connections are expensive to create. An ill‑configured pool leads to resource waste, connection exhaustion, and application crashes.
Example (Spring Boot Druid)
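// DruidConfig.java
// Imports assume druid-spring-boot-starter; package names may differ for other starter versions
import com.alibaba.druid.spring.boot.autoconfigure.DruidDataSourceBuilder;
import javax.sql.DataSource;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class DruidConfig {
    @Bean
    @ConfigurationProperties("spring.datasource.druid")
    public DataSource dataSource() {
        return DruidDataSourceBuilder.create().build();
    }
}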
# application.yml (problematic values)
spring:
  datasource:
    druid:
      initial-size: 50 # too many at startup
      max-active: 20 # too few for load
      min-idle: 5
      max-wait: 3000
      validation-query: SELECT 1
      test-on-borrow: true
      test-on-return: false
      test-while-idle: true
      time-between-eviction-runs-millis: 60000
      min-evictable-idle-time-millis: 300000
Deep analysis
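initial-size : A large initial pool consumes database and application resources even when idle; here it also exceeds max-active, which is inconsistent.
max-active : Too low a value caps concurrent connections, so request threads block while waiting for a connection.
min-idle : The minimum number of idle connections kept ready; too low a value can cause latency spikes when traffic ramps up.
max-wait : The maximum time (in milliseconds) a thread waits for a free connection. Too short a timeout makes requests fail during brief spikes; too long a timeout lets blocked threads pile up.
Missing leak detection hides unclosed connections, which eventually exhaust the pool.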
Connection‑leak example
// Bad – connection never closed
public User getUser(int id) throws SQLException {
    Connection conn = dataSource.getConnection(); // leak: never returned to the pool
    // query …
    return user;
}
// Good – try-with-resources closes the ResultSet, statement, and connection
public User getUserCorrect(int id) {
    try (Connection conn = dataSource.getConnection();
         PreparedStatement stmt = conn.prepareStatement("SELECT * FROM user WHERE id = ?")) {
        stmt.setInt(1, id);
        try (ResultSet rs = stmt.executeQuery()) {
            // process result set …
            return user;
        }
    } catch (SQLException e) {
        throw new RuntimeException(e);
    }
}
Avoidance guide
Set initial-size, max-active, and min-idle according to expected concurrency and database capacity.
Enable leak detection (e.g., removeAbandoned=true in Druid) and always use try‑with‑resources or explicit close() calls.
Monitor pool metrics (active, idle, wait time) and configure alerts for abnormal spikes.
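How a single leaked connection eats into max-active, and how max-wait turns exhaustion into an error, can be seen with a toy pool. The sketch below is not Druid and does not open database connections; `ToyPool`, `Lease`, and `borrow` are invented names, and a `Semaphore` stands in for the pool's connection slots.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy pool: a semaphore stands in for max-active connection slots. */
final class ToyPool {
    private final Semaphore permits;

    ToyPool(int maxActive) {
        this.permits = new Semaphore(maxActive);
    }

    /** A borrowed slot; closing it returns the slot to the pool exactly once. */
    final class Lease implements AutoCloseable {
        private boolean closed;

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                permits.release();
            }
        }
    }

    /** Waits up to maxWaitMillis for a free slot, then gives up with an exception. */
    Lease borrow(long maxWaitMillis) {
        try {
            if (!permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("pool exhausted after " + maxWaitMillis + " ms");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
        return new Lease();
    }

    int available() {
        return permits.availablePermits();
    }
}
```

A lease taken inside try-with-resources goes back to the pool automatically, while one that is never closed reduces capacity permanently, which is the behaviour leak detection and pool metrics are meant to surface.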
Java Tech Enthusiast