Index Previous Next

Pushing duplicate variables to stack

Removing duplicate pushes to the stack need not be limited to constants. If we find instances where the same variable is pushed to the stack we can eliminate them too. Consider the following code, which uses an instance variable as a method argument:

   public class test {
       int k = 0 ;

       public void f(int i, int j) {
       }

       public void g() {
           f(k, k);
       }
   }

Compiling and disassembling this we find that the method call has been rendered as:

   aload_0
   aload_0
   getfield test/k I
   aload_0
   getfield test/k I
   invokevirtual test/f(II)V

The peephole optimiser can detect the duplicated aload/getfield pair and replace the second occurrence with a dup. Since the aload_0 is a one byte instruction and getfield takes three bytes, this results in a net saving of three bytes.

In this case the object owning the instance variable is 'this', which is available to the method f() as its first local variable. Hence the aload_0 instruction. Only the first four local variables can be accessed using this efficient form of aload. If the reference to the object owning the instance variable were not in one of the first four local variables a two byte aload would need to be used. In extreme cases, with more than 256 local variables, there is a three byte aload.

For example:

   public class test {
       int k = 0 ;

       public void f(int i, int j) {
       }

       public void g() {
           test a, b, c, d ;

           a = b = c = d = new test() ;
           f(d.k, d.k);
       }
   }

In this case the reference to the object owning the instance variable being pushed on the stack is in the fourth local variable, resulting in the slightly different assembly language code for the method call:

   aload_0
   aload 4
   getfield test/k I
   aload 4
   getfield test/k I
   invokevirtual test/f(II)V

All of the possible cases can be handled by one optimisation rule:

       aload%1
       getfield %2 I
       aload%1
       getfield %2 I
   =
       aload%1
       getfield %2 I
       dup

However, the examples above only used integer variables, where the last operand of getfield is I. There are a further seven case to handle: byte, character, float, object, short, boolean and array variables. In these cases the second operands are B, C, F, L...;, S, Z, and [... respectively.

Moreover, there are two additional types, long and double (operands J and D), which require slightly different treatment. Since longs and doubles consume two words on the stack we have to use the dup2 instruction to replace the second aload/getfield pair.

Once again I have tested the results of the optimisations by measuring the time taken to perform a million calls to the method g(). The time for the same number of method calls without any arguments has been subtracted.

       Variable          Bytes    Unoptimised       Optimised
                         saved     time (ms)        time (ms)

       int  (aload_0)      3         1272              812 
       int  (aload 15)     4         1380              871 
       long (aload_0)      3         1349              842 
       long (aload 15)     4         1973              921 

Clearly, in all cases there are savings in the number of bytes and in execution time.


Index Previous Next