Why IN and NOT IN Should Be Avoided in SQL Queries
Because IN and NOT IN often bypass indexes on large tables, can yield incorrect results when column names differ or NULL values are present, and may cause severe performance penalties, they should be replaced with EXISTS/NOT EXISTS subqueries or appropriate JOINs, reserving IN only for small literal lists.
The article explains why the SQL keywords IN and NOT IN are often inefficient and error‑prone.
Why?
When both tables contain millions of rows, a query like SELECT * FROM t1 WHERE phone NOT IN (SELECT phone FROM t2) cannot use indexes, leading to minutes‑long execution. Rewriting it with NOT EXISTS reduces the time to seconds.
Using IN can also produce wrong results. An example creates test1(id1) and test2(id2) , inserts data, and mistakenly writes SELECT id1 FROM test1 WHERE id1 IN (SELECT id1 FROM test2) . The query runs without error but returns unexpected rows because the column names differ.
Another pitfall appears when NULL values exist. SELECT id1 FROM test1 WHERE id1 NOT IN (SELECT id2 FROM test2) returns an empty set if test2 contains NULL , since NULL is not comparable to any value.
How?
Replace IN/NOT IN with:
EXISTS or NOT EXISTS subqueries.
JOINs (inner or left join with IS NULL check).
For small, fixed sets (e.g., IN (0,1,2) ) the IN clause remains acceptable.
Java Tech Enthusiast
Sharing computer programming language knowledge, focusing on Java fundamentals, data structures, related tools, Spring Cloud, IntelliJ IDEA... Book giveaways, red‑packet rewards and other perks await!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.