This benchmark appears to show that calling a virtual method directly on an object reference is faster than calling it through a reference to the interface that the object implements.
In other words:
interface IFoo {
    void Bar();
}
class Foo : IFoo {
    public virtual void Bar() {}
}
void Benchmark() {
    Foo f = new Foo();
    IFoo f2 = f;
    f.Bar(); // This is faster.
    f2.Bar();    
}
Coming from the C++ world, I would have expected that both of these calls would be implemented identically (as a simple virtual table lookup) and have the same performance. How does C# implement virtual calls and what is this "extra" work that apparently gets done when calling through an interface?
OK, the answers and comments I have received so far imply that a virtual call through an interface needs two pointer dereferences, versus just one for a virtual call through an object reference.
Could somebody please explain why that is necessary? What is the structure of the virtual table in C#? Is it "flat" (as is typical for C++) or not? What design tradeoffs in the C# language led to this? I'm not saying this is a "bad" design, I'm simply curious as to why it was necessary.
In a nutshell, I'd like to understand what my tool does under the hood so I can use it more effectively. And I would appreciate it if I didn't get any more "you shouldn't know that" or "use another language" types of answers.
Just to make it clear that we are not dealing with some compiler or JIT optimization that removes the dynamic dispatch: I modified the benchmark mentioned in the original question to instantiate one class or the other randomly at run time. Since the instantiation happens after compilation and after assembly loading/JITing, there is no way to avoid dynamic dispatch in either case:
using System;
using System.Diagnostics;

interface IFoo {
    void Bar();
}
class Foo : IFoo {
    public virtual void Bar() {
    }
}
class Foo2 : Foo {
    public override void Bar() {
    }
}
class Program {
    static Foo GetFoo() {
        if ((new Random()).Next(2) % 2 == 0)
            return new Foo();
        return new Foo2();
    }
    static void Main(string[] args) {
        var f = GetFoo();
        IFoo f2 = f;
        Console.WriteLine(f.GetType());
        // JIT warm-up
        f.Bar();
        f2.Bar();
        int N = 10000000;
        Stopwatch sw = new Stopwatch();
        sw.Start();
        for (int i = 0; i < N; i++) {
            f.Bar();
        }
        sw.Stop();
        Console.WriteLine("Direct call: {0:F2}", sw.Elapsed.TotalMilliseconds);
        sw.Reset();
        sw.Start();
        for (int i = 0; i < N; i++) {
            f2.Bar();
        }
        sw.Stop();
        Console.WriteLine("Through interface: {0:F2}", sw.Elapsed.TotalMilliseconds);
        // Results (milliseconds):
        // Direct call: 24.19
        // Through interface: 40.18
    }
}
If anyone is interested, here is how my Visual C++ 2010 lays out an instance of a class that multiply-inherits other classes:
Code:
#include <iostream>

class IA {
public:
    virtual void a() = 0;
};
class IB {
public:
    virtual void b() = 0;
};
class C : public IA, public IB {
public:
    virtual void a() override {
        std::cout << "a" << std::endl;
    }
    virtual void b() override {
        std::cout << "b" << std::endl;
    }
};
Debugger:
c   {...}   C
    IA  {...}   IA
        __vfptr 0x00157754 const C::`vftable'{for `IA'} *
            [0] 0x00151163 C::a(void)   *
    IB  {...}   IB
        __vfptr 0x00157748 const C::`vftable'{for `IB'} *
            [0] 0x0015121c C::b(void)   *
Multiple virtual table pointers are clearly visible, and sizeof(C) == 8 (in a 32-bit build).
The following code...
C c;
std::cout << static_cast<IA*>(&c) << std::endl;
std::cout << static_cast<IB*>(&c) << std::endl;
...prints...
0027F778
0027F77C
...indicating that pointers to different interfaces within the same object actually point to different parts of that object (i.e. they contain different physical addresses).
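To make the pointer adjustment explicit, here is a minimal sketch that reuses the same IA / IB / C shape (with empty method bodies) and measures where each base subobject sits inside the complete object. The helpers byte_offset and print_layout are hypothetical names introduced only for this illustration. The exact offsets depend on the compiler and ABI, so none are shown here; on typical implementations one base sits at offset 0 and the other one pointer further in.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// Same shape as the classes above, with empty method bodies.
class IA {
public:
    virtual void a() = 0;
};
class IB {
public:
    virtual void b() = 0;
};
class C : public IA, public IB {
public:
    void a() override {}
    void b() override {}
};

// Hypothetical helper: distance in bytes from the start of the complete
// object to one of its base subobjects.
std::ptrdiff_t byte_offset(const void* whole, const void* part) {
    return static_cast<const char*>(part) - static_cast<const char*>(whole);
}

// Hypothetical helper: prints where each base subobject lives inside c.
// The printed numbers are implementation-specific.
void print_layout(const C& c) {
    const IA* ia = &c;  // implicit upcast, may adjust the address
    const IB* ib = &c;  // implicit upcast, may adjust the address
    std::cout << "sizeof(C):    " << sizeof(C) << '\n'
              << "offset of IA: " << byte_offset(&c, ia) << '\n'
              << "offset of IB: " << byte_offset(&c, ib) << '\n';
}
```

Calling print_layout on a stack-allocated C from main prints the layout for your own compiler. Casting either base pointer back with static_cast<C*> undoes the adjustment and yields the address of the complete object again, which is why the two interface pointers can differ while still referring to the same object.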
I think the article "Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects" will answer your questions. In particular, see the section "Interface Vtable Map and Interface Map" and the following section on virtual dispatch.
It's probably possible for the JIT compiler to figure things out and optimize the code for your simple case, but not in the general case. For example, if you had:
IFoo f2 = GetAFoo();
and GetAFoo is defined as returning an IFoo, then the JIT compiler wouldn't be able to optimize the call, because it cannot know the concrete type behind the interface reference.