Why IN/NOT IN Slow Down Queries and How to Replace Them
The article explains how using IN and NOT IN in SQL can dramatically hurt performance and produce incorrect results, especially with large tables or NULL values, and demonstrates safer alternatives such as EXISTS, NOT EXISTS, and JOIN with concrete code examples.
Why IN / NOT IN can be problematic
Both IN and NOT IN are common SQL predicates, but they may cause severe performance degradation and produce incorrect results when used with large tables or nullable columns.
1. Performance impact
When the two tables involved each contain about 1.5 million rows (≈600 MB), a query that uses NOT IN often cannot use indexes and may run for many minutes:
select *
from t1
where phone not in (select phone from t2);Replacing the predicate with NOT EXISTS allows the optimizer to use the index on phone and reduces execution time to seconds:
select *
from t1
where NOT EXISTS (
select 1
from t2
where t1.phone = t2.phone
);2. Logical pitfalls
Column‑mismatch silently succeeds . Consider the tables:
create table test1 (id1 int);
create table test2 (id2 int);
insert into test1 (id1) values (1),(2),(3);
insert into test2 (id2) values (1),(2);Correct usage:
select id1
from test1
where id1 in (select id2 from test2);If the subquery mistakenly references id1 (a column that does not exist in test2), SQL Server treats the subquery as a constant list and returns all rows without raising an error:
select id1
from test1
where id1 in (select id1 from test2);NULL values turn NOT IN into an empty result . After inserting a NULL into test2: insert into test2 (id2) values (NULL); the following query returns no rows, because the comparison value NOT IN (NULL, …) evaluates to UNKNOWN for every row:
select id1
from test1
where id1 not in (select id2 from test2);Avoid allowing NULL in columns that participate in set comparisons.
Recommended alternatives
Use EXISTS / NOT EXISTS
These predicates are index‑friendly and handle NULL correctly:
-- Rows that have a matching value in test2
select *
from test1
where EXISTS (
select 1
from test2
where test2.id2 = test1.id1
);
-- Rows that have no matching value in test2
select *
from test1
where NOT EXISTS (
select 1
from test2
where test2.id2 = test1.id1
);Use JOINs
Inner joins return the intersection, while a left join combined with a NULL filter yields the anti‑join (rows missing in the second table):
-- Intersection (equivalent to IN)
select test1.id1
from test1
inner join test2 on test2.id2 = test1.id1;
-- Anti‑join (equivalent to NOT IN, safe with NULLs)
select test1.id1
from test1
left join test2 on test2.id2 = test1.id1
where test2.id2 IS NULL;These constructs are generally preferred for large tables or when nullable columns are involved. IN remains useful for small, fixed literal sets such as IN (0,1,2).
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
