Image
{{'2016-06-04T12:09:00.0000000' | utcToLocalDate }}
Simon Sabin

When are things equal in this world and when are they not. The mystery of NULLs

I was interested reading a post on the MVP group about DISTINCT FROM by Erland who was championing the IS DISTINCT FROM clause but had found a work around using INTERSECT.

That led me to I dig into ansi and non ansi_null comparisons and what operators honour the setting and what ignore it.

 

To drive this I was doing a full join to find different rows between two tables. I always thought that setting ansi nulls would solve the problem of comparisons on A.Column = B.column i.e. you could just say

A.Column <> B.Column

rather than having to do something like

ISNULL(A.Column,<some default obscure value>) <> ISNULL(B.<the same obscure value>)

to ensure that NULL values on both side are treated as the same. However I find that is not the case, I found that what you want to do is Not (A.Column=B.Column), if you do this the optimiser changes this to A.Column <> B.Column which is not the same as Not (A.Column=B.Column), because if either are NULL then the result in both cases is unknown, which isn’t “True” and thus False. Subtlty of unknowns existing in an Boolean expression. Check out the table below. by effective result I mean if you use the expression in a CASE WHEN clause or a WHERE clause, it only really cares about TRUE thats what causes values to use the THEN clause, everything else uses the ELSE clause.

 

A B Result Effective Result NOT Result Effective Result
1 1 True True False False
2 1 False False True True
NULL NULL Unknown False Unknown False
1 NULL Unknown False Unknown False

With ANSI NULLS OFF

A B Result Effective Result NOT Result Effective Result
1 1 True True False False
2 1 False False True True
NULL NULL True False False True
1 NULL False True True False

That’s what one would like for ANSI NULLS but you don’t get that. It appears that you only get a behaviour change when you are comparing an expression with one or more variables on the left of right.

However if you are comparing columns directly it doesn’t honour the ANSI NULL setting unless you do use a case expression with like this

CASE <Expression> WHEN <result> THEN …..

This sort of makes the ANSI null setting a bit pointless and given the note in BOL  (https://msdn.microsoft.com/en-us/library/ms188048.aspx) about it

image

and if you dig a bit into that BOL page you will see that the behaviour needing an variable or a literal for ANSI NULLS to work is also in there

 

SET ANSI_NULLS ON affects a comparison only if one of the operands of the comparison is either a variable that is NULL or a literal NULL. If

both sides of the comparison are columns or compound expressions, the setting does not affect the comparison.

Summary

 

In summary NULLS suck, and whilst you can avoid them for your tables, when comparing tables using FULL JOINS or LEFT JOINS you can’t and in that sense if you use the INTERSECT method (where two columns with NULL are considered equal, go figure) then you will get the behaviour you want.

 

Paul’s blog goes into this in similar detail and into the different query plan operators

http://sqlblog.com/blogs/paul_white/archive/2011/06/22/undocumented-query-plans-equality-comparisons.aspx

 

The following script highlights the differences between the different comparisons.

 

       declare @null int =null

       set ansi_nulls off

 

       select case when A.Id <>B.Id then 0 else 1 end [A.Id <>B.Id]

               , isnull((select 0 where A.Id <>B.Id ),1) [A.Id <>B.IdSQ]

    

               , case when A.Id =B.Id then 1 else 0 end [A.Id =B.Id]

              , isnull((select 1 where A.Id =B.Id ),0) [A.Id =B.IdSQ]

   

              , case a.id when b.id then 1 else 0 end [a.id when b.id]

     

               , case when a.c <> b.c then 0 else 1 end [a.c <> b.c]

               , isnull((select 0 where A.c <>B.c ),1) [A.c <>B.cSQ]

    

               , case when a.c = b.c then 1 else 0 end [a.c = b.c]

               , isnull((select 1 where a.c = b.c ),0) [a.c = b.cSQ]

 

               , case a.c when b.c then 1 else 0 end [a.c when b.c]

    

               , case when a.c <> @null then 0 else 1 end [a.c <> @null]

               , isnull((select 0 where A.c <>@null ),1) [A.c <> @nullSQ]

    

               , case when a.c = @null then 1 else 0 end [a.c = @null]

               , isnull((select 1 where a.c = @null ),0) [a.c = @nullSQ]

 

               , case a.c when @null then 1 else 0 end [a.c when @null]

               , case when exists (select a.id, a.c intersect select b.id, b.c) then 1 else 0 end [intersect]

               , case when null = null then 1 else 0 end

               , case null when null then 1 else 0 end

               , A.Id, B.Id

               , A.c, B.c

       from (values(1,1),(2,null),(3,null),(4,1)) A(Id,c)

       full join (values(2,null),(3,1),(4,1),(5,null))B(id,c) on A.Id = B.Id

 

       go

       declare @null int =null

       set ansi_nulls on

       select case when A.Id <>B.Id then 0 else 1 end [A.Id <>B.Id]

               , isnull((select 0 where A.Id <>B.Id ),1) [A.Id <>B.IdSQ]

    

               , case when A.Id =B.Id then 1 else 0 end [A.Id =B.Id]

              , isnull((select 1 where A.Id =B.Id ),0) [A.Id =B.IdSQ]

   

              , case a.id when b.id then 1 else 0 end [a.id when b.id]

     

               , case when a.c <> b.c then 0 else 1 end [a.c <> b.c]

               , isnull((select 0 where A.c <>B.c ),1) [A.c <>B.cSQ]

    

               , case when a.c = b.c then 1 else 0 end [a.c = b.c]

               , isnull((select 1 where a.c = b.c ),0) [a.c = b.cSQ]

 

               , case a.c when b.c then 1 else 0 end [a.c when b.c]

    

               , case when a.c <> @null then 0 else 1 end [a.c <> @null]

               , isnull((select 0 where A.c <>@null ),1) [A.c <> @nullSQ]

    

               , case when a.c = @null then 1 else 0 end [a.c = @null]

               , isnull((select 1 where a.c = @null ),0) [a.c = @nullSQ]

 

               , case a.c when @null then 1 else 0 end [a.c when @null]

               , case when exists (select a.id, a.c intersect select b.id, b.c) then 1 else 0 end [intersect]

               , case when null = null then 1 else 0 end

               , case null when null then 1 else 0 end

               , A.Id, B.Id

               , A.c, B.c

       from (values(1,1),(2,null),(3,null),(4,1)) A(Id,c)

       full join (values(2,null),(3,1),(4,1),(5,null))B(id,c) on A.Id = B.Id

 

http://sqlblog.com/blogs/paul_white/archive/2011/06/22/undocumented-query-plans-equality-comparisons.aspx

comments powered by Disqus