Using IN vs EXISTS in SQL: When a Small Table Drives a Large Table
This article explains why a small table should drive a large table in SQL queries, demonstrates data preparation, compares IN and EXISTS performance when the small table is either the inner or outer query, and concludes with practical guidelines for choosing between IN and EXISTS.
The article starts by explaining the principle of "small table driving large table" in SQL: when a small loop is placed on the outer side of a nested loop, the database connection is opened only a few times (e.g., 5 times for 5 outer iterations), reducing resource consumption compared to placing the large loop outside.
Data preparation involves inserting 100 rows into tb_dept_bigdata (the small table) and 5,000 rows into tb_emp_bigdata (the large table), representing 100 departments and 5,000 employees.
Case 1: B table (small) < A table (large) – use IN.
select * from tb_emp_bigdata A where A.deptno in (select B.deptno from tb_dept_bigdata B)The execution time for the IN query is shown in the accompanying chart (small table in subquery). Converting the same logic to EXISTS:
select * from tb_emp_bigdata A where exists (select 1 from tb_dept_bigdata B where B.deptno = A.deptno)The EXISTS query’s execution time is also displayed; with the current modest data size the difference is minor, but IN is slightly faster when the subquery returns the smaller set.
Case 2: A table (small) < B table (large) – use EXISTS.
select * from tb_dept_bigdata A where A.deptno in (select B.deptno from tb_emp_bigdata B)The IN query’s execution time is shown. Converting to EXISTS:
select * from tb_dept_bigdata A where exists (select 1 from tb_emp_bigdata B where B.deptno = A.deptno)The EXISTS version performs better when the subquery yields a larger result set, confirming the guideline.
Conclusion
When the subquery (the table after IN) is small, use IN; when the subquery is large, use EXISTS. A simple mnemonic is "IN small, EXISTS large." The article also provides a generic EXISTS pattern:
select ..... from table where exists (subquery);This pattern can be understood as placing the main query’s rows into the subquery for a boolean condition check, retaining rows only when the condition evaluates to true.
Selected Java Interview Questions
A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.