I have the following Java code:
public int sign(int a) {
    if (a < 0) return -1;
    else if (a > 0) return 1;
    else return 0;
}
which, when compiled, generates the following bytecode:
public int sign(int);
Code:
0: iload_1
1: ifge 6
4: iconst_m1
5: ireturn
6: iload_1
7: ifle 12
10: iconst_1
11: ireturn
12: iconst_0
13: ireturn
I want to know how the byte offset (the first column) is calculated. In particular, why do the ifge and ifle instructions take up 3 bytes when all the other instructions are single-byte instructions?
As already pointed out in the comments, the ifge and ifle instructions carry an additional operand: a branch offset.
The Java Virtual Machine Instruction Set specification for ifge and ifle contains the relevant hint here:
Format
if<cond> branchbyte1 branchbyte2
This indicates that the opcode is followed by two additional bytes, the "branch bytes". They are combined into a single signed 16-bit value, (branchbyte1 << 8) | branchbyte2, which gives the offset: how far, relative to the address of the branch instruction itself, the instruction pointer should "jump" when the condition is satisfied.
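To make the first column concrete, here is a minimal sketch that walks the 14 bytes of the sign method and reproduces the offsets. The byte array is hand-assembled from the listing in the question using the opcode values from the JVM specification. The names OffsetWalker, lengthOf, offsets and branchTarget are made up for this illustration, and the length table only covers the opcodes that occur in this one method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any JDK tool.
final class OffsetWalker {
    // Opcode values from the JVM specification (only those used by sign()).
    static final int ICONST_M1 = 0x02, ICONST_0 = 0x03, ICONST_1 = 0x04;
    static final int ILOAD_1 = 0x1b, IFGE = 0x9c, IFLE = 0x9e, IRETURN = 0xac;

    // Body of sign(int), hand-assembled from the javap listing in the question.
    static final byte[] SIGN = {
        (byte) ILOAD_1,             //  0: iload_1
        (byte) IFGE, 0x00, 0x05,    //  1: ifge 6   (stored relative offset: +5)
        (byte) ICONST_M1,           //  4: iconst_m1
        (byte) IRETURN,             //  5: ireturn
        (byte) ILOAD_1,             //  6: iload_1
        (byte) IFLE, 0x00, 0x05,    //  7: ifle 12  (stored relative offset: +5)
        (byte) ICONST_1,            // 10: iconst_1
        (byte) IRETURN,             // 11: ireturn
        (byte) ICONST_0,            // 12: iconst_0
        (byte) IRETURN              // 13: ireturn
    };

    // Total instruction length in bytes: opcode plus operand bytes.
    static int lengthOf(int opcode) {
        switch (opcode) {
            case IFGE:
            case IFLE:
                return 3; // opcode + branchbyte1 + branchbyte2
            case ICONST_M1:
            case ICONST_0:
            case ICONST_1:
            case ILOAD_1:
            case IRETURN:
                return 1; // opcode only
            default:
                throw new IllegalArgumentException("opcode not covered by this sketch: " + opcode);
        }
    }

    // Start offset of every instruction, i.e. the first column of javap -c.
    static List<Integer> offsets(byte[] code) {
        List<Integer> result = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            result.add(pc);
            pc += lengthOf(code[pc] & 0xff);
        }
        return result;
    }

    // Absolute branch target: address of the branch plus the signed 16-bit offset.
    static int branchTarget(byte[] code, int pc) {
        short relative = (short) (((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff));
        return pc + relative;
    }

    public static void main(String[] args) {
        for (int pc : offsets(SIGN)) {
            int opcode = SIGN[pc] & 0xff;
            boolean branch = opcode == IFGE || opcode == IFLE;
            System.out.println(pc + ": opcode 0x" + Integer.toHexString(opcode)
                    + (branch ? " -> " + branchTarget(SIGN, pc) : ""));
        }
    }
}
```

The offsets come out as 0, 1, 4, 5, 6, 7, 10, 11, 12, 13. The gaps after offsets 1 and 7 are exactly the two branch bytes of ifge and ifle.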
Edit:
The comments made me curious: the offset is defined to be a signed 16-bit value, which limits jumps to the range -32768 to +32767. This does not cover the whole range of a possible method, which may contain up to 65535 bytes according to the code_length in the class file.
So I created a test class, to see what happens. This class looks like this:
class FarJump
{
    public static void main(String args[])
    {
        call(0, 1);
    }
    public static void call(int x, int y)
    {
        if (x < y)
        {
            y++;
            y++;
            ... (10921 times) ...
            y++;
            y++;
        }
        System.out.println(y);
    }
}
Each of the y++ lines will be translated into an iinc instruction, consisting of 3 bytes. So the resulting byte code is:
public static void call(int, int);
Code:
0: iload_0
1: iload_1
2: if_icmpge 32768
5: iinc 1, 1
8: iinc 1, 1
...(10921 times) ...
32762: iinc 1, 1
32765: iinc 1, 1
32768: getstatic #3 // Field java/lang/System.out:Ljava/io/PrintStream;
32771: iload_1
32772: invokevirtual #4 // Method java/io/PrintStream.println:(I)V
32775: return
One can see that it still uses an if_icmpge instruction, with a target of 32768. (Edit: javap prints the absolute target address. The relative offset stored in the instruction is 32766. Also see this question.)
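The arithmetic behind those numbers can be sketched as follows. The constants are read off the listing: the branch sits at offset 2, the first iinc at offset 5, and the offsets 5 through 32765 imply 10921 iinc instructions. The class and method names are hypothetical, and the second call assumes the layout stays the same when one more iinc is added, which is only a way to see where the limit lies.

```java
// Hypothetical helper for illustration only.
final class BranchRange {
    static final int IINC_LENGTH = 3;   // iinc opcode + index byte + const byte
    static final int BRANCH_AT = 2;     // offset of if_icmpge in the listing
    static final int FIRST_IINC_AT = 5; // offset of the first iinc in the listing

    // Relative offset the branch would have to store to skip iincCount iinc instructions.
    static int relativeOffset(int iincCount) {
        int target = FIRST_IINC_AT + IINC_LENGTH * iincCount;
        return target - BRANCH_AT;
    }

    // True if the offset can be stored in the two branch bytes (signed 16 bit).
    static boolean fitsInShort(int relativeOffset) {
        return relativeOffset >= Short.MIN_VALUE && relativeOffset <= Short.MAX_VALUE;
    }

    public static void main(String[] args) {
        int current = relativeOffset(10921); // 5 + 3 * 10921 - 2 = 32766
        int oneMore = relativeOffset(10922); // 5 + 3 * 10922 - 2 = 32769
        System.out.println(current + " fits: " + fitsInShort(current));
        System.out.println(oneMore + " fits: " + fitsInShort(oneMore));
    }
}
```

With 10921 increments the relative offset is 32766, one below Short.MAX_VALUE. Because each iinc is 3 bytes long, the next possible value is 32769, which no longer fits into two branch bytes.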
Adding a single additional y++ to the original code suddenly changes the compiled code to
public static void call(int, int);
Code:
0: iload_0
1: iload_1
2: if_icmplt 10
5: goto_w 32781
10: iinc 1, 1
13: iinc 1, 1
....
32770: iinc 1, 1
32773: iinc 1, 1
32776: goto_w 32781
32781: getstatic #3 // Field java/lang/System.out:Ljava/io/PrintStream;
32784: iload_1
32785: invokevirtual #4 // Method java/io/PrintStream.println:(I)V
32788: return
So the compiler reverses the condition from if_icmpge to if_icmplt and handles the far jump with a goto_w instruction, which contains four branch bytes and can thus cover (more than) a full method range.
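As a rough illustration of why the wide form is needed, the sketch below decodes the four branch bytes of the goto_w at offset 5. The byte values 00 00 80 08 are not taken from a class file dump; they are derived from the listing (target 32781 minus instruction address 5 gives 32776, which is 0x8008). The class and method names are made up for this example.

```java
// Hypothetical helper for illustration only.
final class WideBranch {
    // goto_w offset: four unsigned branch bytes combined into a signed 32-bit value.
    static int wideOffset(int b1, int b2, int b3, int b4) {
        return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
    }

    // Absolute target: address of the branch instruction plus its relative offset.
    static int target(int pc, int relativeOffset) {
        return pc + relativeOffset;
    }

    public static void main(String[] args) {
        int relative = wideOffset(0x00, 0x00, 0x80, 0x08);
        System.out.println("relative: " + relative);             // 32776
        System.out.println("target:   " + target(5, relative));  // 32781

        // The same value squeezed into two branch bytes would wrap around
        // and point backwards instead of forwards.
        System.out.println("as short: " + (short) relative);     // -32760
    }
}
```

The short if_icmplt in front of it only has to jump from offset 2 to offset 10, a relative offset of 8, so its two branch bytes are sufficient.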