Consider this code:
static void FillUsingAsNullable()
{
  int?[] arr = new int?[1 << 24];
  var sw = System.Diagnostics.Stopwatch.StartNew();
  for (int i = 0; i < arr.Length; ++i)
    arr[i] = GetObject() as int?;
  Console.WriteLine("{0:N0}", sw.ElapsedTicks);
}
static void FillUsingOwnCode()
{
  int?[] arr = new int?[1 << 24];
  var sw = System.Diagnostics.Stopwatch.StartNew();
  for (int i = 0; i < arr.Length; ++i)
  {
    object temporary = GetObject();
    arr[i] = temporary is int ? (int?)temporary : null;
  }
  Console.WriteLine("{0:N0}", sw.ElapsedTicks);
}
static object GetObject()
{
//Uncomment only one:
  //return new object();
  //return 42;
  //return null;
}
As far as I can see, the methods FillUsingAsNullable and FillUsingOwnCode should be equivalent.
But it looks like the "own code" version is clearly faster.
There are 2 platform targets ("x86" or "x64"), 2 build configurations ("Debug" or "Release" with optimizations), and 3 possible return values in the GetObject method. As far as I can see, in all of these 2*2*3 == 12 cases, the "own code" version is significantly faster than the "as nullable" version.
The question: Is "as" with Nullable<> unnecessarily slow, or am I missing something here (quite likely)?
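To make the "should be equivalent" claim concrete, here is a minimal sketch that compares the results (not the speed) of the two expressions for the three return values listed in GetObject. The class name AsNullableEquivalence and the helper names ViaAs and ViaOwnCode are made up for this illustration and are not part of the code above.

```csharp
using System;

static class AsNullableEquivalence
{
    // Same expression as in FillUsingAsNullable.
    public static int? ViaAs(object o)
    {
        return o as int?;
    }

    // Same expression as in FillUsingOwnCode.
    public static int? ViaOwnCode(object o)
    {
        return o is int ? (int?)o : null;
    }

    static void Main()
    {
        // The three candidate return values from GetObject.
        object[] inputs = { new object(), 42, null };
        foreach (object o in inputs)
        {
            // Lifted equality: two null int? values compare as equal.
            Console.WriteLine(ViaAs(o) == ViaOwnCode(o));
        }
    }
}
```

Both helpers give null for new object() and for null, and 42 for the boxed int, so any difference between the two Fill methods is one of speed only.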
Related thread: Performance surprise with “as” and nullable types.
The generated IL is different, but not fundamentally so. If the JIT were good (it is not, and that is no news), both versions could compile to the exact same x86 code.
I compiled this with VS2010 Release AnyCPU.
as version:
L_0015: call object ConsoleApplication3.Program::GetObject()
L_001a: stloc.3 
L_001b: ldloc.0 
L_001c: ldloc.2 
L_001d: ldelema [mscorlib]System.Nullable`1<int32>
L_0022: ldloc.3 
L_0023: isinst [mscorlib]System.Nullable`1<int32>
L_0028: unbox.any [mscorlib]System.Nullable`1<int32>
L_002d: stobj [mscorlib]System.Nullable`1<int32>
?: version:
L_0015: call object ConsoleApplication3.Program::GetObject()
L_001a: stloc.3 
L_001b: ldloc.0 
L_001c: ldloc.2 
L_001d: ldelema [mscorlib]System.Nullable`1<int32>
L_0022: ldloc.3 
L_0023: isinst int32
L_0028: brtrue.s L_0036 //**branch here**
L_002a: ldloca.s nullable
L_002c: initobj [mscorlib]System.Nullable`1<int32>
L_0032: ldloc.s nullable
L_0034: br.s L_003c
L_0036: ldloc.3 
L_0037: unbox.any [mscorlib]System.Nullable`1<int32>
L_003c: stobj [mscorlib]System.Nullable`1<int32>
The descriptions of the opcodes are on MSDN. Understanding this IL is not difficult and anyone can do it. It is a little time-consuming for the inexperienced eye, though.
The main difference is that the version with the branch in the source code also has a branch in the generated IL, while the as version goes straight from isinst to unbox.any. It is just a little less elegant. The C# compiler could have optimized this out if it wanted to, but the policy of the team is to let the JIT worry about optimizations. That would work fine if the JIT were getting the necessary investment.
You could analyze this further by looking at the x86 emitted by the JIT. You'll find an obvious difference but it will be an unspectacular discovery. I will not invest the time to do that.
I modified the as version to use a temporary as well to have a fair comparison:
            var temporary = GetObject();
            arr[i] = temporary as int?;
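For reference, a minimal sketch of how the two loops could be timed side by side once both use a temporary. The method name FillUsingAsNullableWithTemporary, the length parameter, the choice of returning 42 from GetObject, and the warm-up calls in Main are all assumptions made for this illustration; they are not what was actually compiled above.

```csharp
using System;
using System.Diagnostics;

static class FairComparison
{
    // Assumption: one of the three return values from the question is picked here.
    static object GetObject()
    {
        return 42;
    }

    // Hypothetical name: the "as" loop with the same temporary local as the "?:" loop.
    public static long FillUsingAsNullableWithTemporary(int length)
    {
        int?[] arr = new int?[length];
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < arr.Length; ++i)
        {
            object temporary = GetObject();
            arr[i] = temporary as int?;
        }
        return sw.ElapsedTicks;
    }

    // The "?:" loop from the question, returning ticks instead of printing them.
    public static long FillUsingOwnCode(int length)
    {
        int?[] arr = new int?[length];
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < arr.Length; ++i)
        {
            object temporary = GetObject();
            arr[i] = temporary is int ? (int?)temporary : null;
        }
        return sw.ElapsedTicks;
    }

    static void Main()
    {
        // Assumption: a short warm-up run keeps JIT compilation out of the timed runs.
        FillUsingAsNullableWithTemporary(1);
        FillUsingOwnCode(1);

        Console.WriteLine("as: {0:N0}", FillUsingAsNullableWithTemporary(1 << 24));
        Console.WriteLine("?:  {0:N0}", FillUsingOwnCode(1 << 24));
    }
}
```

Returning the tick count instead of printing it inside the method keeps the Console call out of the measured region and makes the two numbers easy to compare.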