
Pushing duplicate variables to stack (2)

The examples on the previous page related to instance variables. The same techniques can be applied to class variables and local variables. Here's some sample code:

   public class test {
       static int k = 0 ;

       public void f(int i, int j) {
       }

       public void g() {
           int k1, k2, k3, k4 ;

           k1 = k2 = k3 = k4 = 0 ;
           f(test.k, test.k);
           f(k4, k4);
       }
   }

The bytecode generated for the method calls is:

       aload_0
       getstatic test/k I
       getstatic test/k I
       invokevirtual test/f(II)V
       aload_0
       iload 4
       iload 4
       invokevirtual test/f(II)V

As before, we can replace the second load of a variable with a dup. Notice that I used the local variable k4 rather than one of the others. Loading k1, for example, uses the single-byte iload_1 instruction, and since dup is also a single byte, no saving in the number of bytes generated is possible. Indeed, replacing the iload_1 with a dup slightly increases the execution time.
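The replacement described above can be sketched as a small peephole pass. This is only an illustration under assumptions: the class DupPeephole and its methods are hypothetical names, instructions are modelled as plain strings in the same notation as the listing, and none of it is taken from jopt itself. Wide getstatic fetches (long and double) are skipped here because the timings on this page show them getting slower.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical peephole pass; an illustration, not jopt's implementation.
final class DupPeephole {

    // True for loads worth replacing: local loads with an explicit index
    // (e.g. "iload 4") and getstatic of a one-slot type. Single-byte forms
    // such as "iload_1" are left alone because dup is no shorter.
    static boolean isReplaceableLoad(String insn) {
        if (insn.matches("[ilfda]load \\d+")) {
            return true;
        }
        return insn.startsWith("getstatic ")
                && !insn.endsWith(" J") && !insn.endsWith(" D");
    }

    // Long and double locals occupy two stack slots, so they need dup2.
    static String dupFor(String insn) {
        boolean wide = insn.startsWith("lload ") || insn.startsWith("dload ");
        return wide ? "dup2" : "dup";
    }

    // Replaces a load that exactly repeats the previous instruction.
    static List<String> optimise(List<String> code) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            String insn = code.get(i).trim();
            boolean repeat = i > 0
                    && insn.equals(code.get(i - 1).trim())
                    && isReplaceableLoad(insn);
            out.add(repeat ? dupFor(insn) : insn);
        }
        return out;
    }
}
```

Fed the eight instructions from the listing above, this sketch turns the second getstatic test/k I and the second iload 4 into dup and leaves everything else untouched.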

As with instance variables there are a number of other cases to consider:

Here are the results of some timing tests:

       Instruction       Bytes    Unoptimised       Optimised
                         saved     time (ms)        time (ms)

       getstatic I         2          521              459 
       getstatic F         2          733              462 
       getstatic D         2          573              796 
       getstatic J         2          584              782 

       iload_1             0          368              405 
       iload 4             1          437              415 
       fload 18            1          638              413 
       aload 10            1          430              413 
       lload 24            1          669              641 
       dload 22            1          668              656 

I haven't reported the times for static byte, char, short, boolean, object reference or array reference variables, as they're identical to those for integers.

The interesting thing to note about getstatic is that replacing a duplicate fetch of a long or double variable with a dup2 actually reduces performance. The optimisation rules for these cases have therefore been placed in the rules.s file, so they're only applied if the -s flag is passed to jopt. (The -s flag optimises for code size, possibly at the expense of performance.)
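The split between speed-safe rules and size-only rules might be pictured as follows. This is a sketch under assumptions: DupRuleSelector, replacementFor and the optimiseForSize parameter are hypothetical names standing in for the effect of the -s flag, and the real rules live in jopt's rule files rather than in Java code.

```java
import java.util.Optional;

// Hypothetical rule selector; the names are illustrative, not jopt's API.
final class DupRuleSelector {

    // Returns the instruction that replaces a repeated getstatic of a field
    // with the given descriptor, or empty when the rule should not fire.
    static Optional<String> replacementFor(String descriptor, boolean optimiseForSize) {
        boolean wide = descriptor.equals("J") || descriptor.equals("D");
        if (!wide) {
            // One-slot values: dup is both smaller and faster in the timings.
            return Optional.of("dup");
        }
        // Long and double: dup2 saves two bytes but was measured slower,
        // so it is applied only when optimising for size.
        return optimiseForSize ? Optional.of("dup2") : Optional.empty();
    }
}
```

With optimiseForSize set to false, a repeated getstatic of a double is left as it is; with it set to true, the second fetch becomes a dup2.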

As discussed above, the efficient single-byte form of the iload instruction can't be improved upon by replacing it with a dup. For all types, though, replacing the less efficient two-byte form with a dup (or a dup2 for long and double) gives a slight increase in performance.

Also of interest is the fact that the performance improvement for float variables is considerably larger than for other types: getstatic F drops from 733 ms to 462 ms and fload 18 from 638 ms to 413 ms.
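To put the comparison on a common scale, the times in the table can be turned into percentage reductions. The helper below is a hypothetical convenience (the class and method names are not from the original), and the only data it uses are figures copied from the table above.

```java
// Hypothetical helper for comparing rows of the timing table.
final class DupTimings {

    // Percentage reduction in run time; negative means the "optimised"
    // version was slower. Both arguments are times in milliseconds.
    static double percentFaster(int unoptimisedMs, int optimisedMs) {
        return 100.0 * (unoptimisedMs - optimisedMs) / unoptimisedMs;
    }

    public static void main(String[] args) {
        // Rows taken directly from the timing table.
        System.out.printf("getstatic I  %.1f%%%n", percentFaster(521, 459));
        System.out.printf("getstatic F  %.1f%%%n", percentFaster(733, 462));
        System.out.printf("getstatic D  %.1f%%%n", percentFaster(573, 796));
        System.out.printf("iload 4      %.1f%%%n", percentFaster(437, 415));
        System.out.printf("fload 18     %.1f%%%n", percentFaster(638, 413));
    }
}
```

On these figures the float rows improve by roughly 35 to 37 percent, the int rows by roughly 5 to 12 percent, and the getstatic D row comes out negative.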

