NOT IN clause with NULL is tricky in Oracle Database

I do not prefer to use "Not In" with my SQL statements. I usually re-write "NOT IN" with "OUTER JOIN" or "NOT EXISTS". Many people have asked me the reason for not using "Not In". With this post, I am trying to explain when NOT IN can be little tricky

There are mainly following two concerns of not using "Not In"
1. NOT IN can be very expensive in terms of performance.
2. NOT IN can give us unexpected results with NULLs, if the developer is not careful.

So without further delays, let's dive into the example with my favorite tables EMP and DEPT.

Here is the dataset, as you can see here EMP table does not contain any record for (DEPTNO 40, DNAME OPERATIONS).
nimish@garg> select d.deptno, d.dname, count(*)
  2  from emp e, dept d
  3  where e.deptno = d.deptno
  4  group by d.deptno, d.dname;

    DEPTNO DNAME            COUNT(*)
---------- -------------- ----------
        10 ACCOUNTING              2
        20 RESEARCH                5
        30 SALES                   6

nimish@garg> select * from dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

Now let us run a NOT IN SQL to get the DEPT which is NOT IN EMP table.
nimish@garg> select * from dept
  2  where deptno not in (select deptno from emp);

no rows selected

Something is wrong here. It should have returned (DEPTNO 40, DNAME OPERATIONS) record.

Let me rewrite the SQL with OUTER JOIN.
nimish@garg> select d.* from dept d, emp e
  2  where d.deptno = e.deptno (+)
  3  and e.deptno is null;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        40 OPERATIONS     BOSTON

Outer Join seems to be working fine. Let us run another SQL to perform the same operation with NOT EXISTS

nimish@garg> select * from dept d where not exists
  2  (select 1 from emp e where e.deptno = d.deptno);

    DEPTNO DNAME          LOC
---------- -------------- -------------
        40 OPERATIONS     BOSTON


NOT EXISTS also worked.
The issue is NOT IN gets little tricky when there is NULL in data. NULL is not "equal" or not "not equal" to any value. NULL is simply NULL and signifies empty or no presence of any value in the column. This can be simply delat with NVL.

nimish@garg> select * from dept
  2  where deptno not in (select nvl(deptno,-999) from emp);

    DEPTNO DNAME          LOC
---------- -------------- -------------
        40 OPERATIONS     BOSTON

Please do post your comments, if you find this post of any value. Enjoy writing SQLs.

Related Posts:
- Query Optimization - Not In
- Oracle: using OUTER JOIN in place of NOT IN
- Query Optimization - Handling NULL Values
- Why prefer COALESCE over NVL
- Oracle: Outer Joins - Old and New Syntax
- Query optimization tips for Oracle
- Order By and Null values in Oracle Database
- Why my distinct query is not using index?

6 comments:

  1. Thanks for the interesting article.
    In fairness, I haven't seen this join syntax in about ten years now :)

    ReplyDelete
  2. Good point - null values quite often are "forgotten". easy to trap yourself...

    ReplyDelete
  3. "NOT IN" over NULLs is not that tricky if you understand and always keep in mind Null-Aware-Anti-Join.

    ReplyDelete
  4. I agree,Nvl and is not null supposed to be avoided as these are not optimal solutions. What if I have any value which correspond to zero. Not exists is safe solution.

    ReplyDelete
  5. True ;; in and exists operator bevave differently than = operator for null values . When performing group by that also looks different

    ReplyDelete