Why IN and NOT IN Can Kill Your SQL Performance and How to Fix It
This article explains why using IN and NOT IN in SQL queries often leads to poor performance and incorrect results, demonstrates common pitfalls with examples—including NULL handling—and shows safer alternatives such as EXISTS, NOT EXISTS, and JOINs.
Why avoid IN and NOT IN?
Both IN and NOT IN are frequently used keywords, but they should be avoided when possible because they can cause severe performance degradation and subtle logical errors.
Example with two large tables ( t1 and t2) each containing about 1.5 million rows (≈600 MB):
select * from t1 where phone not in (select phone from t2)The query runs for many minutes even though indexes exist on phone in both tables, because NOT IN cannot use the index. Rewriting the same logic with NOT EXISTS reduces the execution time to about 20 seconds.
select * from t1 where not EXISTS (select phone from t2 where t1.phone = t2.phone)Common pitfalls with IN
Consider two small tables test1 (ids 1,2,3) and test2 (ids 1,2). The intended query to find ids present in both tables is:
select id1 from test1 where id1 in (select id2 from test2)This works correctly and returns 1, 2. However, a typo that changes id2 to id1 in the sub‑query produces:
select id1 from test1 where id1 in (select id1 from test2)Even though id1 does not exist in test2, the query silently returns 1, 2, 3 without raising an error, because the IN sub‑query does not validate column existence.
Another dangerous case appears when NULL values are involved. After inserting a NULL into test2: insert into test2 (id2) values (NULL); Running the same NOT IN query yields an empty result set, because any comparison with NULL evaluates to UNKNOWN, so rows that should be returned are filtered out.
How to write safe and efficient queries
Replace IN / NOT IN with EXISTS / NOT EXISTS or appropriate JOIN constructs.
select * from test1 where EXISTS (select * from test2 where id2 = id1) select * from test1 where NOT EXISTS (select * from test2 where id2 = id1)Using joins:
select id1 from test1 inner join test2 on id2 = id1 select id1 from test1 left join test2 on id2 = id1 where id2 IS NULLThese alternatives are index‑friendly and avoid the logical pitfalls of IN / NOT IN. Nevertheless, IN can still be useful for small, fixed sets such as IN (0,1,2).
Architect's Tech Stack
Java backend, microservices, distributed systems, containerized programming, and more.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
