I do not prefer to use "Not In" with my SQL statements. I usually re-write "NOT IN" with "OUTER JOIN" or "NOT EXISTS". Many people have asked me the reason for not using "Not In". With this post, I am trying to explain when NOT IN can be little tricky
There are mainly following two concerns of not using "Not In"
1. NOT IN can be very expensive in terms of performance.
2. NOT IN can give us unexpected results with NULLs, if the developer is not careful.
So without further delays, let's dive into the example with my favorite tables EMP and DEPT.
Here is the dataset, as you can see here EMP table does not contain any record for (DEPTNO 40, DNAME OPERATIONS).
Now let us run a NOT IN SQL to get the DEPT which is NOT IN EMP table.
Something is wrong here. It should have returned (DEPTNO 40, DNAME OPERATIONS) record.
Let me rewrite the SQL with OUTER JOIN.
Outer Join seems to be working fine. Let us run another SQL to perform the same operation with NOT EXISTS
NOT EXISTS also worked.
The issue is NOT IN gets little tricky when there is NULL in data. NULL is not "equal" or not "not equal" to any value. NULL is simply NULL and signifies empty or no presence of any value in the column. This can be simply delat with NVL.
Please do post your comments, if you find this post of any value. Enjoy writing SQLs.
Related Posts:
- Query Optimization - Not In
- Oracle: using OUTER JOIN in place of NOT IN
- Query Optimization - Handling NULL Values
- Why prefer COALESCE over NVL
- Oracle: Outer Joins - Old and New Syntax
- Query optimization tips for Oracle
- Order By and Null values in Oracle Database
- Why my distinct query is not using index?
There are mainly following two concerns of not using "Not In"
1. NOT IN can be very expensive in terms of performance.
2. NOT IN can give us unexpected results with NULLs, if the developer is not careful.
So without further delays, let's dive into the example with my favorite tables EMP and DEPT.
Here is the dataset, as you can see here EMP table does not contain any record for (DEPTNO 40, DNAME OPERATIONS).
nimish@garg> select d.deptno, d.dname, count(*)
2 from emp e, dept d
3 where e.deptno = d.deptno
4 group by d.deptno, d.dname;
DEPTNO DNAME COUNT(*)
---------- -------------- ----------
10 ACCOUNTING 2
20 RESEARCH 5
30 SALES 6
nimish@garg> select * from dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Now let us run a NOT IN SQL to get the DEPT which is NOT IN EMP table.
nimish@garg> select * from dept
2 where deptno not in (select deptno from emp);
no rows selected
Something is wrong here. It should have returned (DEPTNO 40, DNAME OPERATIONS) record.
Let me rewrite the SQL with OUTER JOIN.
nimish@garg> select d.* from dept d, emp e
2 where d.deptno = e.deptno (+)
3 and e.deptno is null;
DEPTNO DNAME LOC
---------- -------------- -------------
40 OPERATIONS BOSTON
Outer Join seems to be working fine. Let us run another SQL to perform the same operation with NOT EXISTS
nimish@garg> select * from dept d where not exists
2 (select 1 from emp e where e.deptno = d.deptno);
DEPTNO DNAME LOC
---------- -------------- -------------
40 OPERATIONS BOSTON
NOT EXISTS also worked.
The issue is NOT IN gets little tricky when there is NULL in data. NULL is not "equal" or not "not equal" to any value. NULL is simply NULL and signifies empty or no presence of any value in the column. This can be simply delat with NVL.
nimish@garg> select * from dept
2 where deptno not in (select nvl(deptno,-999) from emp);
DEPTNO DNAME LOC
---------- -------------- -------------
40 OPERATIONS BOSTON
Please do post your comments, if you find this post of any value. Enjoy writing SQLs.
Related Posts:
- Query Optimization - Not In
- Oracle: using OUTER JOIN in place of NOT IN
- Query Optimization - Handling NULL Values
- Why prefer COALESCE over NVL
- Oracle: Outer Joins - Old and New Syntax
- Query optimization tips for Oracle
- Order By and Null values in Oracle Database
- Why my distinct query is not using index?
nice explanation.
ReplyDeleteThanks for the interesting article.
ReplyDeleteIn fairness, I haven't seen this join syntax in about ten years now :)
Good point - null values quite often are "forgotten". easy to trap yourself...
ReplyDelete"NOT IN" over NULLs is not that tricky if you understand and always keep in mind Null-Aware-Anti-Join.
ReplyDeleteI agree,Nvl and is not null supposed to be avoided as these are not optimal solutions. What if I have any value which correspond to zero. Not exists is safe solution.
ReplyDeleteTrue ;; in and exists operator bevave differently than = operator for null values . When performing group by that also looks different
ReplyDelete