Nullable trilogy part II: a == b -> a>=b && a<=b

A question that often comes up when we discuss Nullable<T> is about the anti-symmetric property. This property states that if a==b then a>=b and a<=b. If a and b are both null, the property does not hold in the current design, because the result of >= and <= is always false when either operand is null. This may seem surprising, but it is easy to understand once you ask whether null can be ordered.

We decided that treating null as less than (or greater than) every other element in the domain would be an arbitrary decision, and so null cannot be ordered. If it cannot be ordered, then the relational operators return false for null, since these operators can be defined as:

a<b -> ordered(a,b) && a<b

a<=b -> ordered(a,b) && (a<b || a==b)
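The definitions above can be checked against the built-in operators. The following is a minimal sketch, not the actual implementation: the class NullOrderingDemo and the helpers Ordered and LessOrEqual are hypothetical names introduced here, and Ordered encodes the assumption that two values are ordered only when both are non-null.

```csharp
using System;

static class NullOrderingDemo
{
    // Hypothetical helper: two nullable values are "ordered" only when both have a value.
    public static bool Ordered(int? a, int? b) => a.HasValue && b.HasValue;

    // a <= b spelled out as ordered(a,b) && (a < b || a == b).
    // The && short-circuits, so .Value is never read on a null operand.
    public static bool LessOrEqual(int? a, int? b) =>
        Ordered(a, b) && (a.Value < b.Value || a.Value == b.Value);

    static void Main()
    {
        int? a = null, b = null;
        Console.WriteLine(a == b);                        // True
        Console.WriteLine(a >= b);                        // False
        Console.WriteLine(a <= b);                        // False
        Console.WriteLine(LessOrEqual(a, b) == (a <= b)); // True
        Console.WriteLine(LessOrEqual(3, 3));             // True
    }
}
```

The hand-written LessOrEqual agrees with the built-in <= operator on these inputs, including the case where both operands are null.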

It might be argued that for practical reasons it is convenient to have the <= and >= operators return true in case null is on both sides, but this choice has some practical drawbacks. Consider the following code:

void ProcessTransactions(int?[] transactions, int? maxValue) {
    foreach (int? t in transactions)
        if (t < maxValue) ProcessTransaction(t);
}

This code does what the programmer expects, even when maxValue is null (in which case it won’t process any element). Now let’s suppose that the programmer changes his mind and wants to include maxValue in the processing. He will then change the ‘<’ operator to be ‘<=’.

Under the current design he will obtain what he expects, even when maxValue is null. If null <= null evaluated to true instead, this function would suddenly start processing all the null transactions, which is not what the programmer intended. The problem is that people tend to think of the ‘>=’ and ‘<=’ operators as a way to include the limit value in the set of things to be processed; very rarely do they intend to process the null value as well. Another way to think about it: deciding that null cannot be ordered implies that >= has to return false, both to be conceptually consistent and to prevent these quite subtle bugs from occurring.
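To make the difference concrete, here is a small sketch that compares the two behaviours. TransactionDemo, SelectCurrent and SelectAlternative are hypothetical names used only for illustration. They collect the selected transactions in a list instead of calling ProcessTransaction, and SelectAlternative simulates the rejected design by explicitly treating null <= null as true.

```csharp
using System;
using System.Collections.Generic;

static class TransactionDemo
{
    // Current design: t <= maxValue is false whenever either side is null.
    public static List<int?> SelectCurrent(int?[] transactions, int? maxValue)
    {
        var selected = new List<int?>();
        foreach (int? t in transactions)
            if (t <= maxValue) selected.Add(t);
        return selected;
    }

    // Hypothetical alternative: simulate a design in which null <= null is true.
    public static List<int?> SelectAlternative(int?[] transactions, int? maxValue)
    {
        var selected = new List<int?>();
        foreach (int? t in transactions)
            if (t <= maxValue || (t == null && maxValue == null)) selected.Add(t);
        return selected;
    }

    static void Main()
    {
        int?[] transactions = { 5, null, 10, null };
        Console.WriteLine(SelectCurrent(transactions, 10).Count);       // 2
        Console.WriteLine(SelectCurrent(transactions, null).Count);     // 0
        Console.WriteLine(SelectAlternative(transactions, null).Count); // 2
    }
}
```

With a null limit, the current design selects nothing, while the simulated alternative silently picks up both null transactions.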


Nullable trilogy Part I: why not just SQL?

This is the first of a weekly three-part series of posts about Nullable<T>. In these posts I want to describe the reasons behind three design choices:
1. Why not just use SQL semantics for null?
2. Why doesn’t null == null imply null >= null and null <= null?
3. Why, inside a generic class with a type parameter t, does the expression t == null return false when t is a nullable type whose value is null?

Let’s start from the first question as the answer is shorter. We’ll get to the other two in the coming weeks.

The first question is why we did not adopt the SQL semantics for relational operators. The SQL semantics are commonly referred to as three-value logic, where null == null returns null. Introducing such logic in the C# language would be problematic. The main reason is that the language already contains the concept of null for reference types, and it follows the traditional two-value logic of programming languages, where null == null returns true.
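The contrast between the two logics can be sketched in a few lines. ThreeValuedDemo and SqlEquals are hypothetical names: SqlEquals is only an illustration of what SQL-style equality would look like if written by hand, returning a bool? whose null value stands for "unknown". It is not an operator the language provides.

```csharp
using System;

static class ThreeValuedDemo
{
    // Hypothetical SQL-style equality: the result is unknown (null) if either operand is null.
    public static bool? SqlEquals(int? a, int? b) =>
        (a.HasValue && b.HasValue) ? a.Value == b.Value : (bool?)null;

    static void Main()
    {
        // Two-value logic, as already used for reference types.
        string s1 = null, s2 = null;
        Console.WriteLine(s1 == s2);                       // True

        // Three-value logic: comparing two nulls yields neither true nor false.
        Console.WriteLine(SqlEquals(null, null).HasValue); // False
        Console.WriteLine(SqlEquals(1, 1));                // True
    }
}
```

Every caller of SqlEquals has to handle a third outcome, which hints at how mixing the two styles in one code base could become hard to follow.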

Given that we cannot change this definition, adding three-value logic just for some types would be confusing. We would need, for example, to create a new NullableString class to be able to apply three-value logic operators to it. More generally, the presence of two-value and three-value logic operators in the same code would make that code quite difficult to write, read and maintain.