Compiler trivia: const, operators and being nice to the compiler

This is a question that came up on our internal alias. I thought it might be generally interesting to illustrate how the compiler picks operators.

Here is the original issue. This code compiles fine:

UInt64 vUInt641 = UInt64.MaxValue;
const int vInt2 = 1432765098;
int res = (int)(vUInt641 - vInt2);

But this code generates a compile error:

UInt64 vUInt641 = UInt64.MaxValue;
int vInt2 = 1432765098;
int res = (int)(vUInt641 - vInt2);

(line 3): error CS0019: Operator '-' cannot be applied to operands of type 'ulong' and 'int'

The only difference between the two pieces of code is the presence of the const keyword in the first one. Let’s first analyze the second case. The error is generated because there is no ‘-’ operator defined between a ulong and an int, and there is no implicit conversion from int to ulong or from ulong to int. The compiler has to give up and produce an error.

In the first case the variable is marked as const, which means that the compiler knows its value at compile time. It sees that the value is positive and can therefore safely be converted to a ulong. The compiler converts it and then invokes the “-(ulong, ulong)” operator.
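The two cases can be put side by side in a small sketch. The class and method names here (ConstConversionDemo, SubtractConst, SubtractVariable) are made up for illustration, and the explicit checked cast is just one possible way to make the non-const version compile, not something the original question used:

```csharp
using System;

public static class ConstConversionDemo
{
    // A compile-time constant whose value is non-negative, so it fits in a ulong.
    const int PositiveConst = 1432765098;

    public static ulong SubtractConst(ulong value)
    {
        // The constant is implicitly converted to ulong,
        // so the -(ulong, ulong) operator is selected.
        return value - PositiveConst;
    }

    public static ulong SubtractVariable(ulong value, int amount)
    {
        // 'value - amount' alone is rejected by the compiler, because the
        // value of 'amount' is not known at compile time.
        // An explicit cast states the intent; 'checked' makes a negative
        // amount throw OverflowException instead of silently wrapping.
        return value - checked((ulong)amount);
    }

    public static void Main()
    {
        ulong viaConst = SubtractConst(ulong.MaxValue);
        ulong viaVariable = SubtractVariable(ulong.MaxValue, 1432765098);
        Console.WriteLine(viaConst == viaVariable); // True

        // A negative constant gets no such help, since -1 does not fit in a ulong:
        // const int negativeConst = -1;
        // ulong bad = ulong.MaxValue - negativeConst; // does not compile
    }
}
```

The explicit cast is the price of not telling the compiler the value up front: you have to state the conversion yourself and decide what happens when the value is negative.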

A bizarre way to think of it is this: because you have been nice to the compiler by telling it that you are not going to modify this value, the compiler is nice to you in return and uses that information to help you out in this case …

Remember: always be nice to the compiler. The more you tell it, the more it tells you …


Nullable trilogy part III: Nullable as type parameter in a generic class

Another commonly asked question relates to the behavior of Nullable<T> when used as a type parameter to instantiate a generic class. It might be surprising that comparing such a parameter to null always gives false as a result. As it turns out, this is not related to Nullable<T>; it is a result of how generics are implemented. There is a tendency to think about generics in a manner very similar to C++ templates. Unfortunately, this view is not correct.

Generics have a runtime representation and, as such, the compiler needs to generate IL that works whatever type is used at runtime to instantiate the generic class. This means that the compiler cannot call user-defined operators, because at compile time it has no knowledge about them: it doesn’t know at that point which type will be used to instantiate the generic class in the future, so it cannot know that it has to generate code to call these operators.
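To make this concrete, here is a minimal sketch with a user-defined ‘==’ operator. The Celsius type and the GenericEquals helper are hypothetical names invented for this illustration:

```csharp
using System;

// Hypothetical type with a user-defined '==' that compares by value.
public sealed class Celsius
{
    public readonly double Degrees;

    public Celsius(double degrees) { Degrees = degrees; }

    public static bool operator ==(Celsius a, Celsius b)
    {
        if (object.ReferenceEquals(a, b)) return true;
        if (object.ReferenceEquals(a, null) || object.ReferenceEquals(b, null)) return false;
        return a.Degrees == b.Degrees;
    }

    public static bool operator !=(Celsius a, Celsius b) { return !(a == b); }

    public override bool Equals(object obj) { return this == (obj as Celsius); }

    public override int GetHashCode() { return Degrees.GetHashCode(); }
}

public static class GenericOperatorDemo
{
    // Inside the generic method the compiler knows nothing about Celsius,
    // so 'a == b' is compiled as a plain reference comparison.
    public static bool GenericEquals<T>(T a, T b) where T : class
    {
        return a == b;
    }

    public static void Main()
    {
        Celsius first = new Celsius(21.5);
        Celsius second = new Celsius(21.5);

        Console.WriteLine(first == second);               // True: user-defined operator
        Console.WriteLine(GenericEquals(first, second));  // False: reference comparison
        Console.WriteLine(GenericEquals(first, first));   // True: same object
    }
}
```

The same expression, ‘a == b’, means two different things depending on whether the compiler can see the concrete type at the point where the expression is compiled.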

In the same vein, the compiler doesn’t know that a generic class will be instantiated with a Nullable<T> parameter, and so it cannot produce IL to lift operators (e.g. the equality operator ‘==’). The result is that when the programmer writes code like ‘t == null’, the compiler generates IL to call the ‘standard’ ‘==’ operator, which in the case of Nullable<T> returns false because the runtime type of t is a struct.

A similar behavior is observable with the following code using strings:

string s1 = "Bob John";
string s2 = "Bob";
Console.WriteLine(Equals(s1, s2 + " John"));

static bool Equals<T>(T a, T b) where T : class {
    return a == b;
}

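This code prints False, because the reference-comparing ‘==’ operator for reference types gets invoked rather than the ‘==’ operator defined by string. The two strings have the same contents, but s2 + " John" is built at runtime and is therefore a different object from s1.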
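If value equality is what you want inside generic code, one common option is to go through the type’s Equals implementation instead of ‘==’, for example via EqualityComparer<T>.Default. The sketch below contrasts the two; SameReference and SameValue are names chosen for this example only:

```csharp
using System;
using System.Collections.Generic;

public static class GenericEqualityDemo
{
    // '==' on a type parameter constrained to 'class' is a reference comparison.
    public static bool SameReference<T>(T a, T b) where T : class
    {
        return a == b;
    }

    // EqualityComparer<T>.Default dispatches at runtime to the type's own
    // Equals implementation, so strings are compared by contents.
    public static bool SameValue<T>(T a, T b)
    {
        return EqualityComparer<T>.Default.Equals(a, b);
    }

    public static void Main()
    {
        string s1 = "Bob John";
        string s2 = "Bob";
        string joined = s2 + " John"; // built at runtime: a distinct object from s1

        Console.WriteLine(SameReference(s1, joined)); // False
        Console.WriteLine(SameValue(s1, joined));     // True
        Console.WriteLine(s1 == joined);              // True: string's own '==' is used here
    }
}
```

This works because Equals is a virtual call resolved at runtime, whereas an operator is chosen once, at compile time, when the generic method itself is compiled.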

A case could be made that the compiler should generate IL to check the runtime type of the generic parameter and call the correct operator for at least some well-known types. This solution wouldn’t work for user-defined operators, as the compiler doesn’t know at compile time the set of types that could be used to instantiate a generic class. In general we don’t like to keep this sort of ‘list of special things’ in the codebase. More importantly, the solution would impose a fairly significant performance penalty at each use of the operator. This feeling of ‘hacking’ things and the significant performance problem convinced us not to implement this solution.