ONJava.com -- The Independent Source for Enterprise Java

Analyze Your Classes

by Vikram Goyal
10/22/2003

Most of us never need to go beyond the basics of coding and compiling our classes. The Java Virtual Machine (JVM) is a highly efficient engine that executes our classes, and for the most part we are happy with the way it runs. However, to extend and enhance the JVM, for example to improve runtime performance, we need to take a deeper look inside this engine and at the structure of the class files that it loads and executes. The Byte Code Engineering Library (BCEL), from the Apache Jakarta stable, helps the average developer analyze and manipulate the structure of class files.

This article introduces BCEL. I will start with the Java class file format, which you need to understand before you can manipulate it. I will follow that with the basics of BCEL, its Application Programming Interface (API), and some examples of how to use it. Finally, I will round the article off with pointers to where you can get more information and help.

Java Class File Format

When the Java compiler compiles your source code, it creates machine- and operating-system-independent byte code that gets stored in a class file. This file is binary and contains the instructions and data of your class for the JVM to execute. The file defines your class to the JVM according to the Java class file format, which means that all class files have the same predefined structure.

Each class file contains the definition of a single class or interface. This file, as I said earlier, consists of binary data represented as a stream of 8-bit bytes. This means that a 16-bit, 32-bit, or 64-bit value is read as 2, 4, or 8 consecutive 8-bit bytes, respectively, with the high-order byte first.
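As a minimal sketch of how multi-byte values are assembled from a byte stream, the following Java snippet builds unsigned 2-byte and 4-byte values by hand and then reads the same bytes with java.io.DataInputStream, which uses the same high-byte-first order. The class name MultiByteRead, the helper names u2 and u4, and the sample bytes are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class MultiByteRead {
    // Assemble an unsigned 16-bit value from two consecutive bytes, high byte first.
    static int u2(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    // Assemble an unsigned 32-bit value from four consecutive bytes, high byte first.
    // The result is held in a long so that it never appears negative.
    static long u4(byte[] data, int offset) {
        return ((long) u2(data, offset) << 16) | u2(data, offset + 2);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x12, 0x34, 0x56, 0x78, 0x00, 0x2E}; // illustrative bytes
        System.out.println(Long.toHexString(u4(data, 0))); // 12345678
        System.out.println(u2(data, 4));                   // 46

        // DataInputStream performs the same assembly for us.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(Integer.toHexString(in.readInt())); // 12345678
        System.out.println(in.readUnsignedShort());            // 46
    }
}
```

In practice DataInputStream is usually the more convenient choice; the hand-written helpers are there only to show what happens to the individual bytes.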

Each class file, to be a valid Java class file, must contain the following elements that completely describe your class and form the right Java class file structure. The list below enumerates these elements and their valid data types. Following this list is a description of each of these elements.

  1. Magic Number: Unsigned 4 bytes.
  2. Minor Version Number: Unsigned 2 bytes.
  3. Major Version Number: Unsigned 2 bytes.
  4. Constant Pool Count: Unsigned 2 bytes.
  5. Constant Pool: A table of structures.
  6. Access Flags: Unsigned 2 bytes.
  7. This Class: Unsigned 2 bytes.
  8. Super Class: Unsigned 2 bytes.
  9. Interfaces Count: Unsigned 2 bytes.
  10. Interfaces: An array of Unsigned 2 bytes.
  11. Fields Count: Unsigned 2 bytes.
  12. Fields: An array of type field_info, which in itself is a structure, described later.
  13. Methods Count: Unsigned 2 bytes.
  14. Methods: An array of type method_info, which in itself is a structure, described later.
  15. Attributes Count: Unsigned 2 bytes.
  16. Attributes: An array of type attribute_info, which in itself is a structure, described later.

These elements must be in the order specified and all of these elements must be present.

1. Magic Number

A magic number is an identifier that marks each class file as a Java class file. Magic numbers are not unique to Java; other file formats, such as GIF and JPEG, have their own magic numbers that identify files of their type. So what is a Java class file's magic number? Well, expressed in hex, it spells CAFE BABE! If a class file does not start with this number, the JVM refuses to load it and throws a ClassFormatError, typically reported as a bad magic number.
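The check itself is simple. Here is a small sketch that tests whether a byte array starts with the class file magic number; the class name MagicCheck and the method name looksLikeClassFile are hypothetical, and a real loader performs many more checks than this one.

```java
import java.nio.charset.StandardCharsets;

class MagicCheck {
    static final int CLASS_MAGIC = 0xCAFEBABE;

    // Returns true if the first four bytes spell CAFEBABE.
    static boolean looksLikeClassFile(byte[] data) {
        if (data.length < 4) {
            return false;
        }
        int magic = ((data[0] & 0xFF) << 24)
                  | ((data[1] & 0xFF) << 16)
                  | ((data[2] & 0xFF) << 8)
                  |  (data[3] & 0xFF);
        return magic == CLASS_MAGIC;
    }

    public static void main(String[] args) {
        byte[] classStart = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 46};
        byte[] gifStart = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikeClassFile(classStart)); // true
        System.out.println(looksLikeClassFile(gifStart));   // false
    }
}
```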

2. Minor and Major Version Numbers

Together with the magic number, you can think of the Minor and Major version numbers as the header of the Java class file. The version numbers are not just for informational reasons, though. A JVM can run a class file only if the version number is within a specified range. This range is specified as:

MinMajor.0 <= Version <= MaxMajor.MaxMinor

For example, a JVM that implements version 1.2 of the Java 2 platform accepts class file versions from 45.0 through 46.0.
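A version check of this kind can be sketched as a comparison of (major, minor) pairs. The class name VersionRange, the method names, and the bounds used in main are illustrative only; they are not the limits of any particular JVM release.

```java
class VersionRange {
    // Pack major and minor into one number so that versions compare in their natural order.
    static long pack(int major, int minor) {
        return ((long) major << 16) | minor;
    }

    // True if minMajor.0 <= major.minor <= maxMajor.maxMinor.
    static boolean isSupported(int major, int minor, int minMajor, int maxMajor, int maxMinor) {
        long version = pack(major, minor);
        return version >= pack(minMajor, 0) && version <= pack(maxMajor, maxMinor);
    }

    public static void main(String[] args) {
        // Illustrative bounds: accept 45.0 through 45.3.
        System.out.println(isSupported(45, 3, 45, 45, 3)); // true
        System.out.println(isSupported(45, 4, 45, 45, 3)); // false
        System.out.println(isSupported(44, 0, 45, 45, 3)); // false
    }
}
```

Note that the minor version is written before the major version in the file, so a reader has to read both before it can compare them.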

3. Constant Pool Count and Constant Pool

The Constant Pool holds the various String Constants, class and interface names, field names, and other Constants that are used within the Java class being represented. The Constant Pool Count equals the number of entries in the Constant Pool plus one, so valid indices run from 1 to the count minus one. Each Constant is represented by a specialized data structure relevant to its type, but every Constant starts with a tag, an unsigned byte that identifies the structure and type of the Constant and allows the rest of the entry to be read accordingly. For example, if the tag byte of an entry has a value of 8, the entry is of type CONSTANT_String_info. As you can imagine, the Constant Pool grows to be quite large, because it contains not only the String Constants that appear in your code, but also the symbolic references to class, interface, method, and field names, which are stored as strings and resolved at runtime.
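To make the tag-driven layout concrete, here is a rough sketch that reads the Constant Pool Count and then walks the pool, recording each entry's tag and skipping its payload. The class name ConstantPoolTags and the method readTags are hypothetical. The sketch assumes only the tag values 1 and 3 through 12 from the original class file format and throws on anything else. It also relies on the rule that Long and Double constants occupy two pool slots. The pool in main is synthetic data written for the example, not taken from a real class file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

class ConstantPoolTags {
    // Returns an array indexed by constant pool index; slot 0 is unused.
    static int[] readTags(DataInputStream in) throws IOException {
        int count = in.readUnsignedShort(); // number of entries plus one
        int[] tags = new int[count];
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            tags[i] = tag;
            switch (tag) {
                case 1: // Utf8: 2-byte length, then that many bytes
                    in.readFully(new byte[in.readUnsignedShort()]);
                    break;
                case 3: case 4:                     // Integer, Float: 4 bytes
                case 9: case 10: case 11: case 12:  // member refs, NameAndType: two 2-byte indices
                    in.readFully(new byte[4]);
                    break;
                case 5: case 6: // Long, Double: 8 bytes, and the next slot stays empty
                    in.readFully(new byte[8]);
                    i++;
                    break;
                case 7: case 8: // Class, String: one 2-byte index
                    in.readFully(new byte[2]);
                    break;
                default:
                    throw new IOException("Unhandled tag " + tag + " at index " + i);
            }
        }
        return tags;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeShort(6);                        // count: 5 slots plus one
        out.writeByte(1); out.writeUTF("Hello");  // #1 Utf8
        out.writeByte(8); out.writeShort(1);      // #2 String pointing at #1
        out.writeByte(5); out.writeLong(42L);     // #3 Long, also occupies #4
        out.writeByte(7); out.writeShort(1);      // #5 Class pointing at #1
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        System.out.println(Arrays.toString(readTags(in))); // [0, 1, 8, 5, 0, 7]
    }
}
```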

4. Access Flags

The Access Flags are bit mask flags that inform the JVM of the visibility and access rights of this class or interface. These flags include:

  • ACC_PUBLIC: This has a value of 0x0001 and means that this class or interface is public and may be accessed from outside of its package.

  • ACC_FINAL: This has a value of 0x0010 and means that this class or interface is final and may not be subclassed.

  • ACC_SUPER: This has a value of 0x0020 and exists for backward compatibility with code compiled by older compilers; it affects how the invokespecial instruction treats superclass methods. Newer compilers set this flag.

  • ACC_INTERFACE: This has a value of 0x0200 and indicates that this file represents an interface and not a class. If this flag is not set, it means that this file is a class. If this flag is set, it means that the flag ACC_ABSTRACT must also be set and that ACC_FINAL must not be set.

  • ACC_ABSTRACT: This has a value of 0x0400 and indicates that this class is abstract and cannot be instantiated. Again, if this flag is set, ACC_FINAL must not be set.
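Since the flags are a bit mask, decoding them is a matter of testing individual bits. The sketch below uses the flag values listed above; the class name ClassAccessFlags and the methods describe and isConsistent are hypothetical helpers, and isConsistent checks only the rules stated in this list.

```java
import java.util.ArrayList;
import java.util.List;

class ClassAccessFlags {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    // Lists the names of the flags that are set in the given bit mask.
    static List<String> describe(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & ACC_PUBLIC) != 0) names.add("ACC_PUBLIC");
        if ((flags & ACC_FINAL) != 0) names.add("ACC_FINAL");
        if ((flags & ACC_SUPER) != 0) names.add("ACC_SUPER");
        if ((flags & ACC_INTERFACE) != 0) names.add("ACC_INTERFACE");
        if ((flags & ACC_ABSTRACT) != 0) names.add("ACC_ABSTRACT");
        return names;
    }

    // An interface must be abstract, and nothing abstract may also be final.
    static boolean isConsistent(int flags) {
        boolean isInterface = (flags & ACC_INTERFACE) != 0;
        boolean isAbstract = (flags & ACC_ABSTRACT) != 0;
        boolean isFinal = (flags & ACC_FINAL) != 0;
        if (isInterface && !isAbstract) return false;
        return !(isAbstract && isFinal);
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0021));     // [ACC_PUBLIC, ACC_SUPER]
        System.out.println(describe(0x0601));     // [ACC_PUBLIC, ACC_INTERFACE, ACC_ABSTRACT]
        System.out.println(isConsistent(0x0601)); // true
        System.out.println(isConsistent(0x0410)); // false: abstract and final
    }
}
```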

5. This Class and Super Class

These are indices into the Constant Pool table; Super Class may instead have a value of 0. Each index points to a structure of the type CONSTANT_Class_info (an unsigned 1-byte tag followed by an unsigned 2-byte Constant Pool index for the name) that represents the name of this class or of its superclass, respectively. If the Super Class index is 0, this class file must represent the class java.lang.Object, the only class without a superclass.

6. Interfaces Count and Interfaces

The Interfaces array contains indices into the Constant Pool, where each referenced entry is of the type CONSTANT_Class_info. The Interfaces Count specifies the number of interfaces that this class directly implements (or, for an interface, directly extends).

7. Fields Count and Fields

The Fields array contains items of the type field_info, each of which completely describes one field. The Fields Count specifies the number of items in this array. The fields represented are both class and instance variables, but not fields inherited from a superclass. The field_info structure consists of:

  • Access Flags: Similar to the bit mask access flags defined earlier, but with additional flags, for example to mark volatile or transient fields.

  • Name Index: An index to the Constant Pool where its name is defined as a UTF8 String.

  • Descriptor Index: Again, an index to the Constant Pool where the field descriptor is defined as a UTF8 String.

  • Attributes Count and Attributes: Attributes represent extra information about a field, for example that it is deprecated or that it has a constant value.
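Reading one field_info entry follows directly from this list. The sketch below reads the three 2-byte members and the Attributes Count, then skips each attribute, assuming the general attribute layout of a 2-byte name index, a 4-byte length, and that many bytes of data. The class names FieldInfoReader and FieldInfo are hypothetical, and the bytes in main are synthetic values with made-up Constant Pool indices.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class FieldInfoReader {
    // Hypothetical holder for the fixed-size members of one field_info entry.
    static final class FieldInfo {
        final int accessFlags;
        final int nameIndex;
        final int descriptorIndex;
        final int attributesCount;

        FieldInfo(int accessFlags, int nameIndex, int descriptorIndex, int attributesCount) {
            this.accessFlags = accessFlags;
            this.nameIndex = nameIndex;
            this.descriptorIndex = descriptorIndex;
            this.attributesCount = attributesCount;
        }
    }

    static FieldInfo read(DataInputStream in) throws IOException {
        int accessFlags = in.readUnsignedShort();
        int nameIndex = in.readUnsignedShort();
        int descriptorIndex = in.readUnsignedShort();
        int attributesCount = in.readUnsignedShort();
        for (int i = 0; i < attributesCount; i++) {
            in.readUnsignedShort();         // attribute name index, ignored here
            int length = in.readInt();      // attribute length, assumed to fit in an int
            in.readFully(new byte[length]); // skip the attribute data
        }
        return new FieldInfo(accessFlags, nameIndex, descriptorIndex, attributesCount);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeShort(0x0019); // access flags: public, static, final
        out.writeShort(5);      // name index (made up)
        out.writeShort(6);      // descriptor index (made up)
        out.writeShort(1);      // one attribute follows
        out.writeShort(7);      // attribute name index (made up)
        out.writeInt(2);        // attribute length
        out.writeShort(8);      // attribute data

        FieldInfo f = read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        System.out.println(Integer.toHexString(f.accessFlags) + " " + f.nameIndex + " "
                + f.descriptorIndex + " " + f.attributesCount); // 19 5 6 1
    }
}
```

The name and descriptor are only indices here; to turn them into strings you would look up the corresponding UTF8 entries in the Constant Pool.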

8. Methods Count and Methods

Similar to Fields, the Methods array contains items of the type method_info, each of which completely describes one method. The Methods Count specifies the number of items in the Methods array. As with Fields, methods inherited from a superclass or superinterface are not included. If a method is native or abstract, no JVM instructions are supplied for it. The method_info structure consists of:

  • Access Flags: Similar to the bit mask access flags defined earlier, but with additional flags, for example to mark synchronized and native methods.

  • Name Index: An index to the Constant Pool where the method name is defined as a UTF8 String.

  • Descriptor Index: An index to the Constant Pool where the method is described as a UTF8 String.

  • Attributes Count and Attributes: Attributes represent extra information about a method, such as the exceptions it declares or whether it is deprecated.

9. Attributes Count and Attributes

As with Fields and Methods, these are a set of class-level attributes that represent extra information about a class. The attributes defined for classes include the SourceFile attribute and the Deprecated attribute.

Since the symbolic references to classes, fields, and methods are coded as String constants, the Constant Pool, which contains these String constants, is the biggest portion of a class file. This makes it an easy target for APIs like BCEL to analyze and manipulate.
