
Tools

To start optimising bytecodes we need to have some tools. Fortunately there are a number of useful tools freely available on the Internet.

Firstly, to convert a Java class file into human-readable text we need a disassembler. The D-Java disassembler, by Shawn Silverman, is available at:

   http://www.cat.nyu.edu/meyer/jvm/djava/

It's written in C and has a number of options to control the format of the output. I compiled it using gcc 2.7.2 under Linux 2.0.33 without any problems, though there are a few bugs which you should fix first.

Secondly, we need an optimiser. I found copt, a generic peephole optimiser by Chris Fraser, at:

   ftp://ftp.cs.princeton.edu/pub/packages/lcc/contrib/copt.shar

This is a small part of a much larger package, a retargetable C compiler, described in the book 'A Retargetable C Compiler: Design and Implementation' (Addison-Wesley, 1995, ISBN 0-8053-1670-1). Check out this site for further details.

My own contribution to this enterprise is a set of rules which are used by the generic optimiser to rewrite the assembly language source file. These rules are contained in three small text files: rules, rules.s and rules.f. (Actually, rules.f is currently empty because I haven't found any suitable optimisations to put in it.)
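To give a feel for what a peephole rule does, here is a minimal sketch in Java. It is not copt and does not use copt's rule-file syntax; the class name PeepholeSketch, the Rule type and the two example rules are made up for illustration. Each rule replaces a short run of adjacent instructions with a shorter equivalent, and the rules are reapplied until nothing matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: a toy peephole pass over lines of assembler text.
class PeepholeSketch {

    /** A hypothetical rule: a run of instructions and what replaces it. */
    static final class Rule {
        final List<String> pattern;
        final List<String> replacement;

        Rule(List<String> pattern, List<String> replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    /** Two example rules, each removing a sequence that has no net effect. */
    static List<Rule> demoRules() {
        List<String> nothing = Collections.emptyList();
        return Arrays.asList(
            new Rule(Arrays.asList("iconst_0", "iadd"), nothing), // x + 0 is x
            new Rule(Arrays.asList("dup", "pop"), nothing));      // copy, then discard
    }

    /** True if the rule's pattern matches the code starting at index pos. */
    static boolean matchesAt(List<String> code, int pos, Rule rule) {
        if (pos + rule.pattern.size() > code.size()) {
            return false;
        }
        for (int i = 0; i < rule.pattern.size(); i++) {
            if (!code.get(pos + i).trim().equals(rule.pattern.get(i))) {
                return false;
            }
        }
        return true;
    }

    /** Applies the rules repeatedly until no rule matches anywhere. */
    static List<String> optimise(List<String> input, List<Rule> rules) {
        List<String> code = new ArrayList<>(input);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int pos = 0; pos < code.size() && !changed; pos++) {
                for (Rule rule : rules) {
                    if (matchesAt(code, pos, rule)) {
                        // Cut out the matched window and splice in the replacement.
                        code.subList(pos, pos + rule.pattern.size()).clear();
                        code.addAll(pos, rule.replacement);
                        changed = true; // rescan from the top on the next pass
                        break;
                    }
                }
            }
        }
        return code;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
            "iload_1", "dup", "iconst_0", "iadd", "pop", "ireturn");
        System.out.println(optimise(code, demoRules()));
    }
}
```

Removing the iconst_0/iadd pair brings dup and pop together, so the second rule only fires on a later pass; that is why the rules are applied until nothing changes. Because matching works on adjacent lines, a label between two instructions stops a rule from firing, which is what you want when other code may jump to that label. Both rules here shrink the code, so the loop always terminates.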

There's also a small shell script, jopt, which runs the optimiser and massages its output slightly. jopt takes two arguments, the input and output files, and accepts one of two optional flags.

Finally, we need an assembler to turn our optimised source file back into a class file. My choice here is the Jasmin assembler, by Jon Meyer. Jasmin is written in Java and is described in the book 'Java Virtual Machine' (O'Reilly, 1996, ISBN 1-56592-194-1). The home page for Jasmin, from which it can be downloaded, is:

   http://mrl.nyu.edu/meyer/jvm/jasmin.html

The D-Java disassembler can output source code in the format required as input by Jasmin, so these tools make a good combination.

I had trouble with the wrapper script which runs the assembler. It uses /bin/csh, which was sometimes dumping core on my machine. I replaced the csh script with a Bourne shell equivalent.

Also, Jasmin didn't support the Unicode escape sequence in string literals, so I added this method to Scanner.java:

    // Reads the four hex digits of a Unicode escape and returns the
    // character code they encode, or 0 if they are not valid hex.
    int readHex() throws java.io.IOException {
        char d[] = new char[4];

        for (int k = 0; k < d.length; k++) {
            advance();              // step to the next input character
            d[k] = (char)next_char;
        }
        try {
            return Integer.parseInt(new String(d), 16);
        }
        catch (NumberFormatException nfe) {
            return 0;
        }
    }

The readHex method is called from the switch statement which handles backslashes in quoted strings. I took the opportunity to add support for escaped backslashes as well:

    case '\\': next_char = '\\'; break;
    case 'u':
        next_char = readHex() ;
        break;
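For anyone who wants to try the escape handling outside Jasmin, here is a minimal self-contained sketch of the same idea. It is not Jasmin's Scanner code: the class name EscapeSketch and the decode method are invented for illustration, and it works on a whole String rather than a character stream. Like the method above, it yields character 0 when the four digits are not valid hex.

```java
// Illustrative only: decodes escaped backslashes and four-digit
// Unicode escapes in a string, leaving other escapes untouched.
class EscapeSketch {

    static String decode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i++);
            if (c != '\\' || i >= s.length()) {
                out.append(c);          // ordinary character or trailing backslash
                continue;
            }
            char e = s.charAt(i++);     // the character after the backslash
            switch (e) {
                case '\\':
                    out.append('\\');
                    break;
                case 'u':
                    out.append((char) readHex(s, i));
                    i = Math.min(i + 4, s.length()); // skip the hex digits
                    break;
                default:
                    out.append('\\').append(e);      // pass other escapes through
            }
        }
        return out.toString();
    }

    /** Parses four hex digits starting at pos; returns 0 if that fails. */
    static int readHex(String s, int pos) {
        if (pos + 4 > s.length()) {
            return 0;
        }
        try {
            return Integer.parseInt(s.substring(pos, pos + 4), 16);
        } catch (NumberFormatException nfe) {
            return 0;
        }
    }

    public static void main(String[] args) {
        // The doubled backslashes are Java source escapes; decode sees one each.
        System.out.println(decode("\\u0041BC and a\\\\b"));
    }
}
```

Returning 0 for bad digits mirrors the behaviour of the method above; a stricter scanner might prefer to report an error instead.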

Finally, it does no harm to increase the size of the chars array in the same file. If you intend to have a go at optimising the JDK classes.zip file, you'll need to give it 35,000 elements so that it can handle some of the LocaleElements and ByteToChar classes.

