I have a cpp file containing only the following:
void f(int* const x)
{
  (*x)*= 2;
}
I compile with:
g++ -S -masm=intel -O3 -fno-exceptions -fno-asynchronous-unwind-tables f.cpp
This results in f.s containing:
    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 12
    .intel_syntax noprefix
    .globl  __Z1fPi
    .p2align    4, 0x90
__Z1fPi:                                ## @_Z1fPi
## BB#0:
    push    rbp
    mov rbp, rsp
    shl dword ptr [rdi]
    pop rbp
    ret
.subsections_via_symbols
If I remove the push, mov, and pop instructions and assemble (I'm on a Mac, using Clang), the resulting object file is 4 bytes smaller. Linking and executing results in the same behaviour and an executable of the same size.
This suggests that those instructions are superfluous - why does the compiler bother putting them in? Is this simply an optimization that is left to the linker?
Clang (clang/clang++) is both a native compiler and a cross compiler that supports multiple targets. On OS X, g++ is typically an alias for clang++, and the default target is usually a variant of x86_64-apple-darwin for 64-bit code and i386-apple-darwin for 32-bit code. The code you are seeing has this form:
push    rbp
mov rbp, rsp
[snip]
pop rbp
ret
These instructions set up and tear down a stack frame, using rbp as the frame pointer. By default clang++ keeps frame pointers for the Apple Darwin targets, even with optimization enabled. This differs from Linux targets such as x86_64-linux-gnu and i386-linux-gnu, where they are omitted in optimized builds. Frame pointers come in handy for some profiling and unwind libraries and can aid debugging on OS X, which I believe is why they are turned on by default.
You can explicitly omit frame pointers with clang++ using the option -fomit-frame-pointer. If you use the build command:
g++ -S -masm=intel -O3 -fno-exceptions -fno-asynchronous-unwind-tables \
    -fomit-frame-pointer f.cpp 
The output would be something similar to:
    shl     dword ptr [rdi]
    ret
If you use different targets with clang++, you will find that the behavior differs. This is an x86-64 Linux target where we don't explicitly omit the frame pointer:
clang++ -target x86_64-linux-gnu -S -masm=intel -O3 -fno-exceptions \
    -fno-asynchronous-unwind-tables f.cpp 
Which generates:
    shl     dword ptr [rdi]
    ret
This is your original x86-64 Apple Darwin target:
clang++ -target x86_64-apple-darwin -S -masm=intel -O3 -fno-exceptions \
    -fno-asynchronous-unwind-tables f.cpp 
Which generates:
    push    rbp
    mov     rbp, rsp
    shl     dword ptr [rdi]
    pop     rbp
    ret
And then the x86-64 Apple target with frame pointers omitted:
clang++ -target x86_64-apple-darwin -S -masm=intel -O3 -fno-exceptions \
    -fno-asynchronous-unwind-tables -fomit-frame-pointer f.cpp 
Which generates:
    shl     dword ptr [rdi]
    ret
You can do a comparison of these targets on Godbolt. The first column of generated code is similar to the question - Apple target with implicit frame pointers. The second is Apple target without frame pointers and the third is an x86-64 Linux target.