Why IN and NOT IN Slow Down Your Queries – Pitfalls and Faster Alternatives
This article explains why the SQL IN and NOT IN operators often cause performance bottlenecks and subtle bugs, demonstrates real‑world examples with large tables, shows how NULL values can produce wrong results, and recommends using EXISTS, NOT EXISTS or JOINs for reliable and efficient queries.
Why Avoid IN/NOT IN?
IN and NOT IN are common keywords, but they should be used sparingly because they lead to low efficiency and easy mistakes.
Performance Issue Example
Two tables t1 and t2 each contain about 1.5 million rows (≈600 MB). The following query using NOT IN took more than ten minutes, even though both phone columns are indexed:
select * from t1 where phone not in (select phone from t2)Replacing it with NOT EXISTS reduced the execution time to about 20 seconds.
select * from t1 where not exists (select phone from t2 where t1.phone = t2.phone)Easy Mistake with IN
Creating two simple tables:
create table test1 (id1 int);
create table test2 (id2 int);
insert into test1 (id1) values (1),(2),(3);
insert into test2 (id2) values (1),(2);Correct query to find ids in test1 that exist in test2:
select id1 from test1 where id1 in (select id2 from test2);If the inner column is mistakenly written as id1 instead of id2, the query runs without error but returns the wrong result (1, 2, 3).
select id1 from test1 where id1 in (select id1 from test2);Because the subquery returns the same values, the condition is always true.
NULL Pitfall with NOT IN
After inserting a NULL into test2: insert into test2 (id2) values (NULL); The query to find ids in test1 that are not in test2 returns an empty set:
select id1 from test1 where id1 not in (select id2 from test2);This happens because NULL does not compare equal to any value, causing the NOT IN predicate to evaluate to UNKNOWN for all rows.
How to Fix It
1. Use EXISTS or NOT EXISTS instead of IN / NOT IN:
select * from test1 where exists (select * from test2 where test2.id2 = test1.id1);
select * from test1 where not exists (select * from test2 where test2.id2 = test1.id1);2. Use JOIN (inner or left) to achieve the same logic:
select id1 from test1 inner join test2 on test2.id2 = test1.id1;
select id1 from test1 left join test2 on test2.id2 = test1.id1 where test2.id2 is null;These approaches are reliable and usually faster.
Note: IN can still be useful for small, fixed value sets such as IN (0,1,2).
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java Backend Technology
Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
