Even though it's not its main purpose, I've always thought that the final keyword (in some situations and VM implementations) could help the JIT.
It might be an urban legend, but I never imagined that making a field final could negatively affect performance.
Until I ran into some code like this:
   private static final int THRESHOLD = 10_000_000;
   private static int [] myArray = new int [THRESHOLD];
   public static void main(String... args) {
      final long begin = System.currentTimeMillis();
      //Playing with myArray
      int index1,index2;
      for(index1 = THRESHOLD - 1; index1 > 1; index1--)
          myArray[index1] = 42;             //Array initial data
      for(index1 = THRESHOLD - 1; index1 > 1; index1--) {
                                            //Filling the array
          for(index2 = index1 << 1; index2 < THRESHOLD; index2 += index1)
              myArray[index2] += 32;
      }
      long result = 0;
      for(index1 = THRESHOLD - 1; index1 > 1; index1-=100)
          result += myArray[index1];
      //Stop playing, let's see how long it took
      System.out.println(result);
      System.out.println((System.currentTimeMillis())-begin+"ms");
   }
Let's have a look at:
private static int [] myArray = new int [THRESHOLD];
Under Windows 7 64-bit, based on 10 successive runs, I get the following results:
THRESHOLD = 10^7, 1.7.0u09 client VM (Oracle):
myArray is not final.
myArray is final.
THRESHOLD = 3x10^7, 1.7.0u09 client VM (Oracle):
myArray is not final.
myArray is final.
THRESHOLD = 3x10^7, 1.7.0u01 client VM (Oracle):
myArray is not final.
myArray is final.
That's more than 15% difference!
Remark: I used the bytecode produced by JDK 1.7.0u09's javac for all my tests. The bytecode produced is exactly the same for both versions except for the myArray declaration, which was expected.
So why is the version with a static final myArray slower than the one with a non-final static myArray?
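To make the comparison concrete, here is a minimal sketch that puts both declarations side by side and runs the same loops on each. It is not how the timings above were taken (those came from separate runs of the program). The class name, the field names and the reduced SIZE are assumptions for illustration, and timing both variants in one JVM can be skewed by run order and warm-up.

```java
public class FinalFieldVariants {
    // Smaller than the question's THRESHOLD (assumption, to keep the run short).
    private static final int SIZE = 1_000_000;

    private static final int[] FINAL_ARRAY = new int[SIZE]; // variant with final
    private static int[] plainArray = new int[SIZE];        // variant without final

    // Same loops as in the question, going through the final field.
    public static long runOnFinalField() {
        for (int i = SIZE - 1; i > 1; i--) {
            FINAL_ARRAY[i] = 42;
        }
        for (int i = SIZE - 1; i > 1; i--) {
            for (int j = i << 1; j < SIZE; j += i) {
                FINAL_ARRAY[j] += 32;
            }
        }
        long result = 0;
        for (int i = SIZE - 1; i > 1; i -= 100) {
            result += FINAL_ARRAY[i];
        }
        return result;
    }

    // Identical loops, going through the non-final field.
    public static long runOnPlainField() {
        for (int i = SIZE - 1; i > 1; i--) {
            plainArray[i] = 42;
        }
        for (int i = SIZE - 1; i > 1; i--) {
            for (int j = i << 1; j < SIZE; j += i) {
                plainArray[j] += 32;
            }
        }
        long result = 0;
        for (int i = SIZE - 1; i > 1; i -= 100) {
            result += plainArray[i];
        }
        return result;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long withFinal = runOnFinalField();
        long t1 = System.nanoTime();
        long withoutFinal = runOnPlainField();
        long t2 = System.nanoTime();
        // Printing happens only after both measurements are taken.
        System.out.println("final:     " + (t1 - t0) / 1_000_000 + " ms, checksum " + withFinal);
        System.out.println("non-final: " + (t2 - t1) / 1_000_000 + " ms, checksum " + withoutFinal);
    }
}
```

Both methods should return the same checksum, which is a quick sanity check that the two variants really do the same work.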
EDIT (using Aubin's version of my snippet):
It appears that the difference between the version with the final keyword and the one without lies only in the first iteration. Somehow, the version with final is always slower than its counterpart on the first iteration; subsequent iterations have similar timings.
For example, with THRESHOLD = 10^8 and running with the 1.7.0u09 client VM, the first computation takes approximately 35s while the second 'only' takes 30s.
Obviously the VM performed an optimization. Was that the JIT in action, and why didn't it kick in earlier (for example by compiling the inner loop of the nested loop, since that part was the hotspot)?
Note that my remarks are still valid with the 1.7.0u01 client VM. With that version (and maybe earlier releases), the code with final myArray runs slower than the one without this keyword: 2671ms vs 2331ms on the basis of 200 iterations.
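One way to look at the first-iteration effect is to time each run separately instead of timing the whole program. Below is a rough sketch, assuming a hypothetical FirstRunTimer class and a reduced array size; it only illustrates the measurement, it does not explain the slowdown. HotSpot's -XX:+PrintCompilation flag can be added on the command line to see when methods get compiled.

```java
public class FirstRunTimer {
    // Reduced size (assumption) so the sketch finishes quickly; the question uses up to 10^8.
    private static final int SIZE = 1_000_000;
    private static final int[] data = new int[SIZE];

    // Runs the task 'runs' times and returns the elapsed milliseconds of each run.
    public static long[] timeEachRun(Runnable task, int runs) {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return elapsedMs;
    }

    // Same nested loops as in the question, with no I/O inside the timed part.
    public static void workload() {
        for (int i = SIZE - 1; i > 1; i--) {
            data[i] = 42;
        }
        for (int i = SIZE - 1; i > 1; i--) {
            for (int j = i << 1; j < SIZE; j += i) {
                data[j] += 32;
            }
        }
    }

    public static void main(String[] args) {
        long[] elapsed = timeEachRun(new Runnable() {
            @Override
            public void run() {
                workload();
            }
        }, 5);
        // Run 0 includes interpretation and compilation; later runs are closer to steady state.
        for (int i = 0; i < elapsed.length; i++) {
            System.out.println("run " + i + ": " + elapsed[i] + " ms");
        }
    }
}
```

Comparing run 0 against the later runs for both declarations of the array shows whether the gap is confined to the first run.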
IMHO, the time spent in System.out.println( result ) should not be included, because I/O is highly variable and time-consuming.
I think the influence of println() is much bigger than the influence of final.
I propose writing the performance test as follows:
public class Perf {
   private static final int   THRESHOLD = 10_000_000;
   private static final int[] myArray   = new int[THRESHOLD];
   private static /* */ long  min = Long.MAX_VALUE;
   private static /* */ long  max = 0L;
   private static /* */ long  sum = 0L;
   private static void perf( int iteration ) {
      final long begin = System.currentTimeMillis();
      int index1, index2;
      for( index1 = THRESHOLD - 1; index1 > 1; index1-- ) {
         myArray[ index1 ] = 42;
      }
      for( index1 = THRESHOLD - 1; index1 > 1; index1-- ) {
         for( index2 = index1 << 1; index2 < THRESHOLD; index2 += index1 ) {
            myArray[ index2 ] += 32;
         }
      }
      long result = 0;
      for( index1 = THRESHOLD - 1; index1 > 1; index1 -= 100 ) {
         result += myArray[ index1 ];
      }
      if( iteration > 0 ) {
         long delta = System.currentTimeMillis() - begin;
         sum += delta;
         min = Math.min(  min,  delta );
         max = Math.max(  max,  delta );
         System.out.println( iteration + ": " + result );
      }
   }
   public static void main( String[] args ) {
      final int iterations = 1000;
      for( int iteration = 0; iteration < iterations; ++iteration ) {
         perf( iteration );
      }
      long average = sum / ( iterations - 1 ); // the first iteration is a warm-up and is ignored
      System.out.println( "Min    : " + min     + " ms" );
      System.out.println( "Average: " + average + " ms" );
      System.out.println( "Max    : " + max     + " ms" );
   }
}
And the results with only 10 iterations are:
With final:
Min    : 7645 ms
Average: 7659 ms
Max    : 7926 ms
Without final:
Min    : 7629 ms
Average: 7780 ms
Max    : 7957 ms
I suggest that readers run this test and post their results to compare.