
Why would the optimizer ignore the if statement when the function has a [-Wreturn-type] warning?

OS: Ubuntu 22.04

compiler: gcc 11.2 x86_64

Here's a simple example:


#include <cstdlib>

int func(int val) {
    if (val == 1) {
    } else {
        abort();
    }
}

int main(int argc, char* argv[]) {
    func(argc);
}

When I compile it without any optimization and run it, it works fine.

But when I compile it with g++ tmp.cpp -O3, func ignores the input value and just calls abort.

Of course I can fix it by adding a return statement at the end of func, but still, why?

Here's the output of objdump -d a.out for the optimized function func:


0000000000001060 <_Z4funci>:
    1060:   f3 0f 1e fa             endbr64 
    1064:   50                      push   %rax
    1065:   58                      pop    %rax
    1066:   50                      push   %rax
    1067:   e8 e4 ff ff ff          call   1050 <abort@plt>

陈泽霖 asked Oct 26 '25 18:10


1 Answer

First of all, flowing off the end of a non-void function without a return statement is undefined behavior in C++ (main is the one exception: it implicitly returns 0). See "Why does flowing off the end of a non-void function without returning a value not produce a compiler error?" In general, compilers exploit undefined behavior for optimization: they are allowed to assume it never happens. See also:

  • Benefit of endless-loops without side effects in C++ being undefined behavior compared to C?
  • Is undefined behavior worth it?
  • Does undefined behavior really help modern compilers to optimize generated code?

Intuitively, the compiler can see that the if (val == 1) branch has undefined behavior, because control flows off the end of the function there. It may therefore treat that branch as unreachable and call abort() unconditionally, as if the condition were always false.
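
This reasoning can be modeled at source level. The sketch below is an illustration under stated assumptions, not what either compiler literally does: __builtin_unreachable() (a GCC and Clang builtin) stands in for flowing off the end, the name func_model is hypothetical, and the else branch returns -1 instead of calling abort() so that the result can be observed.

```cpp
// Hypothetical model of func from the question (func_model is not in the
// original code). __builtin_unreachable() is a GCC/Clang builtin marking a
// point the optimizer may assume is never reached, much like flowing off
// the end of a non-void function.
int func_model(int val) {
    if (val == 1) {
        __builtin_unreachable();  // stands in for the missing return
    } else {
        return -1;  // assumption: replaces abort() so the result is observable
    }
}
```

With optimizations enabled, a compiler is permitted to drop the comparison and compile this to an unconditional return of -1, for the same reason func collapses to an unconditional abort(). Calling func_model(1) is undefined behavior, so only other arguments are meaningful.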

More concretely, LLVM starts out with the following IR:

entry:
  ; ...
  br i1 %cmp, label %if.then, label %if.else

if.then:
  br label %if.end
if.else:
  call void @abort()
  unreachable
if.end:
  unreachable

Depending on the value of val, control flow branches either into if.then (which immediately goes to if.end), or to if.else (which calls a [[noreturn]] function).

In a SimplifyCFG (Simplify Control Flow Graph) pass, this structure is simplified to:

  %1 = xor i1 %cmp, true
  call void @llvm.assume(i1 %1)
  call void @abort()
  unreachable

The compiler now assumes that %cmp ^ true (that is, !(val == 1)) is always true, which means val == 1 must be false. There is no longer a branch; control flows directly into abort(). Intuitively, if statements are altered so that an unreachable instruction is never reached. Concretely, a br that jumps to a basic block whose only instruction is unreachable is eliminated.

After further optimization passes, the whole function essentially becomes:

  tail call void @abort()
  unreachable

See the whole optimization pipeline at https://godbolt.org/z/bnvh7KrG4

On a debug build, with no optimizations, the compiler doesn't simplify the if statement. The code works "as intended". Both GCC and clang output something like:

        je      .L2
        call    abort
.L2:
        ud2

Flowing off the end of the function becomes a ud2 instruction (generated from unreachable). Executing it raises an invalid-opcode exception, which terminates the program (with SIGILL on Linux).

This answer is specific to clang, whereas you're using GCC. However, the optimizations these compilers perform are fairly similar, especially in trivial cases like this one.
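
For completeness, here is a minimal sketch of the fix mentioned in the question: give the val == 1 path a return statement. The name func_fixed and the return value 0 are assumptions, since the question does not say what func is meant to return.

```cpp
#include <cstdlib>

// Hypothetical repaired func: the val == 1 path now returns a value, so no
// path flows off the end and the optimizer must keep the comparison.
int func_fixed(int val) {
    if (val == 1) {
        return 0;  // assumed value; the question does not specify one
    }
    std::abort();  // declared [[noreturn]], so no return is needed after it
}
```

Compiling with -Werror=return-type, which both GCC and Clang accept, turns the original warning into an error and catches this class of bug at build time.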

Jan Schultke answered Oct 28 '25 07:10