Why IN/NOT IN Can Kill Your Query Performance and How to Fix It
This article explains why using IN and NOT IN in SQL queries often leads to poor performance and subtle bugs, especially with large tables and NULL values, and demonstrates safer alternatives such as EXISTS, NOT EXISTS, and JOIN with clear code examples.
IN and NOT IN are common SQL keywords, but they can cause serious performance problems and incorrect results.
1. Low efficiency
When two tables, t1 and t2, each contain about 1.5 million rows, the following query runs for many minutes, because in many database engines the optimizer cannot use the index on phone for a NOT IN subquery:
    select * from t1 where phone not in (select phone from t2)
Replacing NOT IN with NOT EXISTS reduces the execution time to about 20 seconds:
    select * from t1 where not exists (select phone from t2 where t1.phone = t2.phone)
2. Easy to produce wrong results
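Using IN can silently return wrong data if the sub‑query references the wrong column. For example, the intended query is:
    select id1 from test1 where id1 in (select id2 from test2)
If id2 is mistyped as id1 in the sub‑query, the query still runs without any error. Since test2 has no column named id1, the name resolves to test1.id1 from the outer query, so the condition holds for every row with a non‑NULL id1 whenever test2 is not empty, and the query returns rows it should not. Qualifying each column with its table name or alias turns this typo into an error.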
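The column mix-up described above can be reproduced with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in database, and the sample rows (1, 2, 3 in test1 and 1, 2 in test2) are assumptions made for illustration, not data from the original article.

```python
import sqlite3

# In-memory database with assumed sample data (not from the article).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table test1 (id1 integer);
    create table test2 (id2 integer);
    insert into test1 values (1), (2), (3);
    insert into test2 values (1), (2);
""")

# Intended query: only ids that also appear in test2.
intended = conn.execute(
    "select id1 from test1 where id1 in (select id2 from test2) order by id1"
).fetchall()

# Typo: test2 has no id1, so the name binds to the outer test1.id1
# and the condition becomes true for every row.
typo = conn.execute(
    "select id1 from test1 where id1 in (select id1 from test2) order by id1"
).fetchall()

# Qualifying the column with an alias makes the same typo fail loudly.
try:
    conn.execute(
        "select t1.id1 from test1 t1 where t1.id1 in (select t2.id1 from test2 t2)"
    )
    qualified_error = None
except sqlite3.OperationalError as exc:
    qualified_error = str(exc)

print(intended)         # [(1,), (2,)]
print(typo)             # [(1,), (2,), (3,)]
print(qualified_error)  # an error message naming the missing column
```

Other engines word the error differently, but the unqualified name should resolve the same way wherever standard scoping rules for correlated sub‑queries apply.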
Another pitfall is NULL handling. Adding a NULL value to test2 makes the following NOT IN query return an empty set. Comparing any value with NULL yields UNKNOWN rather than true, so id1 NOT IN (...) can never evaluate to true once the list contains a NULL:
    select id1 from test1 where id1 not in (select id2 from test2)
Result: no rows are returned, even though id1 = 3 should be present.
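A minimal sketch of the NULL behaviour, again using Python's sqlite3 module as a stand-in engine. The sample rows are assumed for illustration (test1 holds 1, 2, 3; test2 holds 1, 2 and a NULL), and the filtered variant is one possible workaround rather than something the original article prescribes.

```python
import sqlite3

# In-memory database with assumed sample data, including one NULL in test2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table test1 (id1 integer);
    create table test2 (id2 integer);
    insert into test1 values (1), (2), (3);
    insert into test2 values (1), (2), (null);
""")

# NOT IN against a set containing NULL: no row can satisfy the predicate.
not_in = conn.execute(
    "select id1 from test1 where id1 not in (select id2 from test2)"
).fetchall()

# Filtering NULLs out of the sub-query restores the expected row.
not_in_filtered = conn.execute(
    "select id1 from test1 "
    "where id1 not in (select id2 from test2 where id2 is not null)"
).fetchall()

# The predicate itself evaluates to NULL (unknown), not to true.
unknown = conn.execute("select 3 not in (1, 2, null)").fetchone()[0]

print(not_in)           # []
print(not_in_filtered)  # [(3,)]
print(unknown)          # None
```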
Tip: avoid allowing NULLs in columns used for set comparisons.
Safer alternatives
Use EXISTS or NOT EXISTS instead of IN/NOT IN:
    select * from test1 where exists (select * from test2 where test2.id2 = test1.id1)
    select * from test1 where not exists (select * from test2 where test2.id2 = test1.id1)
Or replace the set operation with a JOIN:
    select id1 from test1 inner join test2 on test2.id2 = test1.id1
    select id1 from test1 left join test2 on test2.id2 = test1.id1 where test2.id2 is null
These approaches are usually index‑friendly and avoid the NULL pitfalls.
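The alternatives can be compared side by side in a short script. This sketch assumes sample data that is not in the original article: test2 holds a duplicate value and a NULL, to show how each form reacts. The `rows` helper and the aliases t1 and t2 are hypothetical names introduced here, and sqlite3 again stands in for the real database.

```python
import sqlite3

# In-memory database; test2 deliberately contains a duplicate and a NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table test1 (id1 integer);
    create table test2 (id2 integer);
    insert into test1 values (1), (2), (3);
    insert into test2 values (1), (1), (2), (null);
""")

def rows(sql):
    """Run a query and return the first column of each row as a list."""
    return [r[0] for r in conn.execute(sql)]

# Anti-join, two ways: both ignore the NULL in test2.
not_exists = rows(
    "select t1.id1 from test1 t1 "
    "where not exists (select 1 from test2 t2 where t2.id2 = t1.id1) "
    "order by t1.id1"
)
anti_join = rows(
    "select t1.id1 from test1 t1 "
    "left join test2 t2 on t2.id2 = t1.id1 "
    "where t2.id2 is null order by t1.id1"
)

# Semi-join, two ways: EXISTS returns each test1 row at most once,
# while the inner join returns one row per match.
exists = rows(
    "select t1.id1 from test1 t1 "
    "where exists (select 1 from test2 t2 where t2.id2 = t1.id1) "
    "order by t1.id1"
)
inner_join = rows(
    "select t1.id1 from test1 t1 "
    "inner join test2 t2 on t2.id2 = t1.id1 order by t1.id1"
)

print(not_exists)  # [3]
print(anti_join)   # [3]
print(exists)      # [1, 2]
print(inner_join)  # [1, 1, 2]
```

One difference worth keeping in mind: the inner join repeats a row once per match, so it only matches the EXISTS result when id2 is unique in test2 or when duplicates are removed with DISTINCT.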
Note: IN can still be used safely when the set is a small, constant list such as IN (0, 1, 2).
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
