I was reading this question and its accepted answer. I read the comments, but I couldn't figure out the reason for the optimization produced.
Why does branching occur in assembly code when using the following code?
x >= start && x <= end
EDIT:
For clarity, I want to understand the reason for the optimization produced by the accepted answer. As I understand it, it has to do with the branching present in the assembly code produced by the compiler. I want to understand why there is a branch in the produced code.
Note that the linked question has a fundamentally different expression:
x >= start && x++ <= end
It is fundamentally different because the second sub-expression here has a side effect. I'll explain.
Note that && is a short-circuit operator. This means that if x >= start evaluates to false, the machine can branch over evaluation of x <= end.
More precisely, when the compiler emits instructions for x >= start && x <= end, it can emit instructions to branch over x <= end when x >= start evaluates to false.
However, I stress the word "can" in the above statements. The reason is that x <= end has no side effects, so the observable result is the same whether or not the compiler branches over evaluating it.
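To make the "can" concrete, here is a minimal sketch in C. The function names and the choice of unsigned operands are my own assumptions, not taken from the linked question. Both functions return the same result for every input; the second one simply evaluates both comparisons unconditionally, which is a transformation the compiler is allowed to perform on the first one because neither comparison has a side effect.

```c
#include <stdbool.h>

/* Short-circuit form: the compiler may branch over the second
   comparison, but it is not required to. */
static bool in_range_short_circuit(unsigned x, unsigned start, unsigned end)
{
    return x >= start && x <= end;
}

/* Both comparisons are always evaluated. Each comparison yields 0 or 1,
   so bitwise & gives the same answer as && here. */
static bool in_range_both_evaluated(unsigned x, unsigned start, unsigned end)
{
    return (x >= start) & (x <= end);
}
```

Whether either version actually compiles to a branch depends on the compiler, target, and optimization flags, so it is worth checking the generated assembly rather than assuming.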
But if the second expression does have side effects, the compiler must branch over it. Since && is a short-circuit operator, in a && b, if b has any side effects, they must not be observed when a evaluates to false; this is a requirement of short-circuiting in C and most (if not all) C-like languages.
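The requirement can be observed directly. The following is a small sketch with hypothetical names (calls, counted_le, in_range_counted) where the right-hand operand has a visible side effect: it bumps a counter every time it is evaluated.

```c
#include <stdbool.h>

static int calls = 0;  /* counts evaluations of the right-hand operand */

/* Right-hand operand with a side effect: it increments the counter. */
static bool counted_le(int x, int end)
{
    calls++;
    return x <= end;
}

static bool in_range_counted(int x, int start, int end)
{
    /* counted_le must not run when x >= start is false. */
    return x >= start && counted_le(x, end);
}
```

Calling in_range_counted(0, 1, 10) leaves calls unchanged, because the left operand is already false; calling in_range_counted(5, 1, 10) increments it once.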
So, in particular, when you look at
#define POINT_IN_RANGE_AND_INCREMENT(p, range) \
    (p <= range.end && p++ >= range.start)
note that the second expression p++ >= range.start has a side effect. Namely, (post-)incrementing p by 1. But that side effect can only be observed if p <= range.end evaluates to true. Thus, the compiler must branch over evaluation of p++ >= range.start if p <= range.end evaluates to false.
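Written out by hand, the control flow the compiler has to preserve looks roughly like the function below. This is only a sketch: the struct range layout, the unsigned type, and the helper name check_and_advance are assumptions of mine, and the macro is repeated (with extra parentheses around its arguments) only so the two can be compared side by side.

```c
/* Hypothetical layout; the real definition is not shown in the question. */
struct range { unsigned start, end; };

#define POINT_IN_RANGE_AND_INCREMENT(p, range) \
    ((p) <= (range).end && (p)++ >= (range).start)

/* Explicit equivalent of the macro for an unsigned value behind a pointer. */
static int check_and_advance(unsigned *p, struct range r)
{
    if (!(*p <= r.end))
        return 0;              /* first test failed: *p must stay untouched */
    unsigned old = (*p)++;     /* side effect happens only on this path */
    return old >= r.start;     /* post-increment compares the old value */
}
```

The early return is the branch: it is the only way to skip the increment when the first comparison fails.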
This results in a branch because, to evaluate that expression, the machine relies on the fact that if p <= range.end evaluates to false, the entire expression is false, and therefore it must not evaluate p++ >= range.start, because that sub-expression has a side effect. Thus, it must branch over evaluating the second part of the expression. So in the assembly:
Ltmp1301:
ldr r1, [sp, #172] @ 4-byte Reload
ldr r1, [r1]
cmp r0, r1
bls LBB44_32
mov r6, r0 @ if the result of the compare is false
b LBB44_33 @ branch over evaluating the second expression
@ to avoid the side effects of p++
LBB44_32:
ldr r1, [sp, #188] @ 4-byte Reload
adds r6, r0, #1
Ltmp1302:
ldr r1, [r1]
cmp r0, r1
bhs LBB44_36
Deep indebtedness to Oli Charlesworth for insightful comments.