This is a variation of the code from this tweet, just shorter and harmless to beginners. We have this code:
typedef int (*Function)();
static Function DoSmth;

static int Return7()
{
    return 7;
}

void NeverCalled()
{
    DoSmth = Return7;
}

int main()
{
    return DoSmth();
}
You see that NeverCalled() is never called in the code, don't you? Here's what Compiler Explorer shows when clang 3.8 is selected with
-Os -std=c++11 -Wall
The emitted code is:
NeverCalled():
retq
main:
movl $7, %eax
retq
as if NeverCalled() was actually called before DoSmth() and set the DoSmth function pointer to Return7() function.
If the function pointer assignment is removed from NeverCalled(), as here:
void NeverCalled() {}
then the emitted code is:
NeverCalled():
retq
main:
ud2
The latter is quite expected. The compiler knows that the function pointer is surely null, and calling a function through a null function pointer is undefined behavior.
The former code is not really expected. Somehow the compiler decided to have Return7() called, although it is not called directly anywhere and the function pointer assignment is inside a function that is never called.
Yes, I know that the C++ Standard allows a compiler facing code with undefined behavior to do this. But how does it do this?
How does clang happen to emit this specific machine code?
NeverCalled is a misnomer. Any global function is potentially called (by a constructor of a global object in a different translation unit, for example).
Incidentally, this is the only way this TU can possibly be incorporated in a program that doesn't have UB. In this case, main returns 7.
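To make that scenario concrete, here is a minimal sketch of a program in which NeverCalled() really does run before the call through DoSmth. The names Initializer, g_initializer and CallDoSmth are hypothetical and do not appear in the question. In a real program the global object would more likely live in a different translation unit, and CallDoSmth stands in for the body of main.

```cpp
#include <cassert>

typedef int (*Function)();
static Function DoSmth;  // static storage duration: zero-initialized to a null pointer

static int Return7()
{
    return 7;
}

void NeverCalled()
{
    DoSmth = Return7;
}

// Hypothetical: a global object whose constructor calls NeverCalled().
// In a real program this could sit in another translation unit.
struct Initializer
{
    Initializer() { NeverCalled(); }
};
static Initializer g_initializer;  // constructed during dynamic initialization

// Hypothetical stand-in for the body of main() in the question.
int CallDoSmth()
{
    assert(DoSmth != nullptr);  // holds because the constructor above has already run
    return DoSmth();
}
```

With such an initializer present the program has no undefined behaviour, and the result of the call is 7, which is exactly what the optimized main returns.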
Make NeverCalled static, and main will compile to empty code.
The path by which clang does this is probably something along the lines of:
1. DoSmth is a static, so is zero initialised. Since it is a pointer (to function), that has the effect of initialisation to the NULL pointer (or nullptr);
2. main() does return DoSmth(), so clang reasons that DoSmth cannot be NULL, since that would cause return DoSmth() to exhibit undefined behaviour;
3. the only code that can change DoSmth is the assignment DoSmth = Return7 in NeverCalled();
4. since that is the only way for DoSmth to be non-NULL, and it has reasoned that DoSmth is not NULL, clang assumes NeverCalled() must have been called somehow;
5. therefore DoSmth must be equal to the address of Return7;
6. since DoSmth == Return7, clang converts the return DoSmth() into return Return7();
7. Return7() is in the same compilation unit, so clang inlines it.
The specifics of how clang does this internally are anyone's guess. However, various steps of code optimisation probably result in a reasoning chain something like the above.
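The last steps of that chain can be pictured as source-level rewrites of main. This is only an illustration under that assumed reasoning: the optimizer works on an intermediate representation rather than on C++ source, and the function names MainAfterPointerResolved and MainAfterInlining are hypothetical.

```cpp
static int Return7()
{
    return 7;
}

// Hypothetical name. Once the optimizer assumes DoSmth == Return7,
// the indirect call `return DoSmth();` can become a direct call.
int MainAfterPointerResolved()
{
    return Return7();
}

// Hypothetical name. Return7() is visible in the same translation unit,
// so the direct call can be inlined, which corresponds to the emitted
// `movl $7, %eax` followed by `retq`.
int MainAfterInlining()
{
    return 7;
}
```

Both versions return the same value; the second is simply what remains once no call is left to make.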
The point is that your code, as it stands, has undefined behaviour. One cute feature of undefined behaviour is that a compiler is permitted (as distinct from required) to reason that your code actually has well-defined behaviour. In turn, that permits the compiler to reason that some code which ensures the behaviour is well-defined has been magically executed.