Why IN/NOT IN Can Kill Your SQL Performance and How to Fix It
This article explains why using IN and NOT IN in SQL queries often leads to poor performance and incorrect results, demonstrates common pitfalls with real examples, and shows safer alternatives such as EXISTS, NOT EXISTS, and JOIN to write efficient and reliable database queries.
WHY?
IN and NOT IN are frequently used keywords, but they should be avoided whenever possible because they can cause severe performance degradation and produce incorrect query results.
1. Efficiency loss
Consider two tables t1 and t2, each containing about 1.5 million rows (≈600 MB). The following query runs for many minutes:
select * from t1 where phone not in (select phone from t2)The reason is that NOT IN cannot use the index on phone, so the database performs a full scan. Rewriting the query with NOT EXISTS reduces the execution time to about 20 seconds:
select * from t1 where NOT EXISTS (select phone from t2 where t1.phone = t2.phone)2. Easy to produce wrong results
Using IN can lead to subtle bugs. For example, with tables test1(id1 int) and test2(id2 int):
create table test1 (id1 int);
create table test2 (id2 int);
insert into test1 (id1) values (1),(2),(3);
insert into test2 (id2) values (1),(2);The intended query is to find id1 values that exist in test2:
select id1 from test1 where id1 in (select id2 from test2);This works, but a typo such as using id1 instead of id2 in the sub‑query silently returns the wrong rows (1, 2, 3) without raising an error, as shown in the screenshot:
Another pitfall appears when the sub‑query contains NULL values. Adding a NULL row to test2 and running:
select id1 from test1 where id1 not in (select id2 from test2);produces an empty result set because NULL is not comparable to any non‑null value:
The lesson is that NULL breaks the logic of NOT IN, leading to unexpected empty results.
HOW?
1. Use EXISTS or NOT EXISTS
select * from test1 where EXISTS (select * from test2 where id2 = id1);
select * from test1 where NOT EXISTS (select * from test2 where id2 = id1);2. Use JOIN
select id1 from test1 INNER JOIN test2 ON id2 = id1;
select id1 from test1 LEFT JOIN test2 ON id2 = id1 where id2 IS NULL;These alternatives avoid the index‑bypass problem of IN/NOT IN and handle NULL values correctly.
Author: Hydor Source: cnblogs.com/hydor/p/5391556.html
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
