Why IN/NOT IN Slow Down SQL Queries and Safer Alternatives
This article explains why using IN and NOT IN in SQL can lead to poor performance and incorrect results, demonstrates common pitfalls with examples, and shows how to replace them with EXISTS, NOT EXISTS, or JOIN constructs for reliable and faster queries.
Why?
IN and NOT IN are common SQL keywords, but with subqueries over large tables they are best avoided: they can cause severe performance degradation and produce unexpected results.
In a test with two tables t1 and t2, each containing 1.5 million rows (≈600 MB), the following query took an excessively long time:
select * from t1 where phone not in (select phone from t2);
Even though indexes existed on the phone column in both tables, the NOT IN query did not benefit from them in this test and ran extremely slowly. How a planner treats NOT IN depends on the database engine and version, so the exact behavior may differ elsewhere. Rewriting the query with NOT EXISTS reduced execution time to about 20 seconds.
select * from t1
where not EXISTS (select phone from t2 where t1.phone = t2.phone);
Common Pitfalls with IN
Consider two simple tables:
create table test1 (id1 int);
create table test2 (id2 int);
insert into test1 (id1) values (1),(2),(3);
insert into test2 (id2) values (1),(2);
Querying IDs that exist in test2 using IN works as expected:
select id1 from test1 where id1 in (select id2 from test2);
The result is correct: the query returns 1 and 2.
If the column name is mistyped:
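select id1 from test1 where id1 in (select id1 from test2);
SQL does not raise an error. Because test2 has no id1 column, the name resolves to test1.id1 from the outer query, so the subquery becomes correlated and each id1 is compared with itself. The query therefore returns every row of test1 (1, 2, and 3), which can be misleading.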
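The silent name resolution described above can be reproduced with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in engine (the article does not name its database), and the aliases a and b are introduced here purely for illustration. It suggests that qualifying every column with a table alias turns the typo into an error instead of a wrong result.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table test1 (id1 int);
    create table test2 (id2 int);
    insert into test1 (id1) values (1),(2),(3);
    insert into test2 (id2) values (1),(2);
""")

# Unqualified: id1 is not a column of test2, so it silently binds to test1.id1.
unqualified = conn.execute(
    "select id1 from test1 where id1 in (select id1 from test2) order by id1"
).fetchall()

# Qualified with aliases: the same typo is now rejected by the engine.
try:
    conn.execute(
        "select a.id1 from test1 a where a.id1 in (select b.id1 from test2 b)"
    )
    typo_error = None
except sqlite3.OperationalError as exc:
    typo_error = str(exc)

print(unqualified)   # every row of test1, not just the matching ones
print(typo_error)    # engine-specific message about the unknown column
```

Other engines word the error differently, but a qualified reference to a column that does not exist in the aliased table should be rejected rather than resolved against the outer query.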
Another dangerous case occurs when test2 contains a NULL value:
insert into test2 (id2) values (NULL);
The query
select id1 from test1 where id1 not in (select id2 from test2);
returns an empty set. Rows 1 and 2 are excluded because they match a value in test2. For row 3, the comparison with NULL evaluates to UNKNOWN rather than TRUE, so the NOT IN predicate is never TRUE for any row.
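The NULL behavior can be checked directly. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine (an assumption, since the article does not name its database). The helper ids and the aliases t1 and t2 are hypothetical names introduced for this example. It compares NOT IN against NOT EXISTS and against a NOT IN whose subquery filters out NULLs.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table test1 (id1 int);
    create table test2 (id2 int);
    insert into test1 (id1) values (1),(2),(3);
    insert into test2 (id2) values (1),(2),(NULL);
""")

def ids(sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]

# NOT IN against a set containing NULL: no row can satisfy the predicate.
not_in = ids("select id1 from test1 where id1 not in (select id2 from test2)")

# NOT EXISTS only asks whether a matching row exists, so NULL is harmless.
not_exists = ids(
    "select id1 from test1 t1 "
    "where not exists (select 1 from test2 t2 where t2.id2 = t1.id1)"
)

# Filtering NULLs out of the subquery also restores the expected answer.
filtered = ids(
    "select id1 from test1 "
    "where id1 not in (select id2 from test2 where id2 is not null)"
)

# A bare comparison with NULL yields NULL (UNKNOWN), seen in Python as None.
unknown = conn.execute("select 3 <> NULL").fetchone()[0]

print(not_in, not_exists, filtered, unknown)
```

If NOT IN has to stay, excluding NULLs inside the subquery (or declaring the column NOT NULL) keeps the result in line with NOT EXISTS.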
How to Avoid These Issues
1. Replace IN/NOT IN with EXISTS/NOT EXISTS:
select * from test1 where EXISTS (select * from test2 where id2 = id1);
select * from test1 where NOT EXISTS (select * from test2 where id2 = id1);
2. Use JOINs instead:
select id1 from test1
inner join test2 on id2 = id1;
select id1 from test1
left join test2 on id2 = id1
where id2 IS NULL;
These approaches are index‑friendly and produce correct results even when NULL values are present.
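One difference between the rewrites is worth checking before swapping one for another: an inner join returns a row once per match, so it only mirrors IN or EXISTS when the joined column is unique. The sketch below uses Python's built-in sqlite3 module as a stand-in engine (an assumption), with assumed sample data in which test2 holds a duplicate 1 and a NULL. The helper ids and the aliases t1 and t2 are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table test1 (id1 int);
    create table test2 (id2 int);
    insert into test1 (id1) values (1),(2),(3);
    -- Assumed data: 1 appears twice and a NULL is present.
    insert into test2 (id2) values (1),(1),(2),(NULL);
""")

def ids(sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]

# Semi-join via EXISTS: each test1 row appears at most once.
exists_rows = ids(
    "select t1.id1 from test1 t1 "
    "where exists (select 1 from test2 t2 where t2.id2 = t1.id1) "
    "order by t1.id1"
)

# Inner join: a test1 row is repeated for every matching test2 row.
join_rows = ids(
    "select t1.id1 from test1 t1 "
    "inner join test2 t2 on t2.id2 = t1.id1 "
    "order by t1.id1"
)

# Anti-join via LEFT JOIN ... IS NULL and via NOT EXISTS: both ignore the NULL.
anti_join_rows = ids(
    "select t1.id1 from test1 t1 "
    "left join test2 t2 on t2.id2 = t1.id1 "
    "where t2.id2 is null order by t1.id1"
)
not_exists_rows = ids(
    "select t1.id1 from test1 t1 "
    "where not exists (select 1 from test2 t2 where t2.id2 = t1.id1) "
    "order by t1.id1"
)

print(exists_rows, join_rows, anti_join_rows, not_exists_rows)
```

When the joined column may contain duplicates, EXISTS (or a join combined with DISTINCT) is the closer replacement for IN. The two anti-join forms agree here even with a NULL in test2.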
Note: IN/NOT IN are still acceptable for small, fixed sets (e.g., IN (0,1,2)), but for large tables or when NULL may appear, prefer EXISTS or JOIN.
dbaplus Community
