Most of us never need to go beyond the basics of coding and compiling our classes. The Java Virtual Machine (JVM) is a highly efficient engine that executes our classes and, for the most part, we are happy with the way it runs. However, to extend and enhance the JVM, for example to improve runtime performance, we need to take a deeper look inside this engine and at the structure of the class files that it loads and executes. The Byte Code Engineering Library (BCEL) from the Apache Jakarta stable helps the average developer analyze and manipulate the structure of class files.
This article gives an introduction to this API. I will start with an introduction to the Java class file format, which is important to understand in order to manipulate it. I will follow it up with the basics of BCEL, the BCEL Application Programming Interface (API), and some examples of how to use it. Finally, I will round the article off with pointers to where you can get more information and help.
When the Java compiler compiles your source code, it creates machine- and operating-system-independent byte code that gets stored in a class file. This file is binary and contains the instruction set and data of your class for the JVM to execute. This file accurately defines your class to the JVM, according to the Java class file format. In other words, all class files share a predefined structure.
Each class file contains the definition of a single class or interface. This file, as I said earlier, consists of binary data represented as a stream of 8-bit bytes. This means that if you have a data type that is 16-bit, 32-bit, or 64-bit, it will be read in chunks of 2, 4, or 8 consecutive 8-bit bytes, respectively.
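To make the byte-chunk reading concrete, here is a minimal sketch in plain Java (no BCEL required) that assembles 16-bit and 32-bit values from consecutive 8-bit bytes. The class file format stores multi-byte values with the high byte first. The names `BigEndianReader`, `readU2`, and `readU4` are made up for this illustration and are not part of any library.

```java
/** Illustrative helper (hypothetical name): assembles multi-byte values from 8-bit bytes. */
final class BigEndianReader {

    /** Reads an unsigned 16-bit value from two consecutive bytes, high byte first. */
    static int readU2(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    /** Reads an unsigned 32-bit value from four consecutive bytes, high byte first. */
    static long readU4(byte[] data, int offset) {
        return ((long) readU2(data, offset) << 16) | readU2(data, offset + 2);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x2E};
        System.out.println(Long.toHexString(readU4(data, 0))); // prints cafebabe
        System.out.println(readU2(data, 4));                   // prints 46
    }
}
```

The `& 0xFF` masks matter because Java's `byte` is signed; without them, bytes above 0x7F would be sign-extended and corrupt the result.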
Each class file, to be a valid Java class file, must contain the following elements that completely describe your class and form the right Java class file structure. The list below enumerates these elements and their valid data types. Following this list is a description of each of these elements.
field_info, which in itself is a structure, described later.
method_info, which in itself is a structure, described later.
attribute_info, which in itself is a structure, described later.
These elements must be in the order specified and all of these elements must be present.
A magic number identifies each class file as a Java class file. Magic numbers are not unique to Java; other file formats, such as GIF and JPEG, start with their own magic numbers so that tools can recognize the file type. So what is a Java class file's magic number? Well, expressed in hex, it spells CAFE BABE! If a Java class file does not start with this number, the class loader rejects it with a ClassFormatError (reported as "Bad Magic Number").
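As a quick illustration, the magic number check can be sketched in a few lines of plain Java. This is not how the JVM implements it; the class name `MagicNumberCheck` and its method are hypothetical, and the sample byte arrays are hand-written stand-ins for real files.

```java
import java.nio.ByteBuffer;

/** Illustrative check (hypothetical class name) for the class file magic number. */
final class MagicNumberCheck {

    static final int JAVA_MAGIC = 0xCAFEBABE;

    /** Returns true if the data starts with the four bytes CA FE BA BE. */
    static boolean hasJavaMagic(byte[] fileStart) {
        // ByteBuffer reads multi-byte values high byte first by default.
        return fileStart.length >= 4 && ByteBuffer.wrap(fileStart).getInt() == JAVA_MAGIC;
    }

    public static void main(String[] args) {
        byte[] classLike = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        byte[] gifLike = {'G', 'I', 'F', '8'}; // GIF files start with their own signature
        System.out.println(hasJavaMagic(classLike)); // prints true
        System.out.println(hasJavaMagic(gifLike));   // prints false
    }
}
```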
Together with the magic number, you can think of the Minor and Major version numbers as the header of the Java class file. The version numbers are not just for informational reasons, though. A JVM can run a class file only if the version number is within a specified range. This range is specified as:
Major.0 <= Version <= Major.Minor
For example, the version for the current JVM (1.2) can be anywhere between 45.0 and 45.6.
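The header layout and the range rule can be sketched as follows. This is an illustration rather than the JVM's actual check: `ClassVersion`, `fromHeader`, and `isSupportedBy` are hypothetical names, and the eight header bytes are hand-written. In the file itself, the minor version is stored before the major version.

```java
import java.nio.ByteBuffer;

/** Illustrative sketch (hypothetical names) of the version check described above. */
final class ClassVersion {
    final int major;
    final int minor;

    ClassVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    /** Reads the version from the first eight bytes of a class file. */
    static ClassVersion fromHeader(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header); // high byte first by default
        buf.getInt();                             // skip the 4-byte magic number
        int minor = buf.getShort() & 0xFFFF;      // the minor version is stored first
        int major = buf.getShort() & 0xFFFF;      // the major version follows it
        return new ClassVersion(major, minor);
    }

    /** Applies Major.0 <= Version <= Major.Minor for a JVM that accepts up to jvmMajor.jvmMinor. */
    boolean isSupportedBy(int jvmMajor, int jvmMinor) {
        // minor is unsigned, so the lower bound Major.0 always holds
        return major == jvmMajor && minor <= jvmMinor;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 3, 0, 45};
        ClassVersion v = fromHeader(header);
        System.out.println(v.major + "." + v.minor); // prints 45.3
        System.out.println(v.isSupportedBy(45, 6));  // prints true for the 45.0 to 45.6 range
    }
}
```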
The Constant Pool represents the various String Constants, class and interface names, field names, and other
Constants that are within the Java Class being represented. The Constant Pool
Count is used to identify the number of entries in the Constant Pool, and equals
the number of entries in the Constant Pool plus one. Each Constant is represented
using a specialized data structure relevant to the type of the Constant. However,
all of the Constants have a tag that identifies that structure and type of the
Constant. This tag is an unsigned byte. Thus, each entry in the Constant Pool
begins with this unsigned byte that allows the rest of the entry to be read
accordingly. For example, if the tag byte of an entry has a value of 8, the entry is of type
CONSTANT_String_info. As you can imagine, the Constant Pool grows to be quite large, because not only
does it contain the various String Constants, but also the symbolic references
to class, interface, method, and field names which, at runtime, are resolved
using String Constants and hence end up in the Constant Pool.
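The tag-driven layout can be illustrated with a small walker written in plain Java. Treat it as a sketch under stated assumptions: the names are hypothetical, only the classic tags 1 and 3 through 12 are handled, the entry sizes follow the JVM specification, and Utf8 entries are decoded as ordinary UTF-8 although class files use a slightly modified encoding. The sample pool is hand-built, not taken from a real class.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative constant pool walker (hypothetical names), limited to the classic tag set. */
final class ConstantPoolWalker {

    /** Expects the buffer to start at the Constant Pool Count; returns one line per entry. */
    static List<String> walk(ByteBuffer buf) {
        List<String> entries = new ArrayList<>();
        int count = buf.getShort() & 0xFFFF;     // number of entries plus one
        for (int index = 1; index < count; index++) {
            int tag = buf.get() & 0xFF;          // unsigned byte that selects the structure
            switch (tag) {
                case 1: {                        // Utf8: 2-byte length, then the bytes
                    byte[] text = new byte[buf.getShort() & 0xFFFF];
                    buf.get(text);
                    entries.add("#" + index + " Utf8 " + new String(text, StandardCharsets.UTF_8));
                    break;
                }
                case 3: entries.add("#" + index + " Integer " + buf.getInt()); break;
                case 4: entries.add("#" + index + " Float " + buf.getFloat()); break;
                case 5:                          // long and double occupy two pool slots
                    entries.add("#" + index + " Long " + buf.getLong());
                    index++;
                    break;
                case 6:
                    entries.add("#" + index + " Double " + buf.getDouble());
                    index++;
                    break;
                case 7: entries.add("#" + index + " Class -> #" + (buf.getShort() & 0xFFFF)); break;
                case 8: entries.add("#" + index + " String -> #" + (buf.getShort() & 0xFFFF)); break;
                case 9: case 10: case 11: case 12: // member references and NameAndType: two indices
                    entries.add("#" + index + " tag " + tag + " -> #" + (buf.getShort() & 0xFFFF)
                            + ", #" + (buf.getShort() & 0xFFFF));
                    break;
                default:
                    throw new IllegalArgumentException("Tag not handled in this sketch: " + tag);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Hand-built pool: count 3, a Utf8 entry "Hi", and a String entry pointing at it.
        byte[] pool = {0, 3, 1, 0, 2, 'H', 'i', 8, 0, 1};
        walk(ByteBuffer.wrap(pool)).forEach(System.out::println);
        // prints "#1 Utf8 Hi" and then "#2 String -> #1"
    }
}
```

Note how the String entry holds only an index: the characters themselves live in a separate Utf8 entry, which is one reason the pool grows quickly.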
The Access Flags are bit mask flags that define the access rights of this class or interface. They inform the JVM of its visibility and of how it may be used. These flags include:
ACC_PUBLIC: This has a value of 0x0001 and means that this class or interface is public and may be accessed from outside of its package.
ACC_FINAL: This has a value of 0x0010 and means that this class or interface is final and may not be subclassed.
ACC_SUPER: This has a value of 0x0020 and exists for backward compatibility. All newer compilers set this flag.
ACC_INTERFACE: This has a value of 0x0200 and indicates that this file represents an interface and not a class. If this flag is not set, it means that this file is a class. If this flag is set, it means that the flag ACC_ABSTRACT must also be set and that ACC_FINAL must not be set.
ACC_ABSTRACT: This has a value of 0x0400 and indicates that this class is abstract and cannot be instantiated. Again, if this flag is set, ACC_FINAL must not be set.
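Because the flags are single bits, they can be tested and combined with bitwise operators. The sketch below decodes a flag word and checks the combination rules listed above; the class and method names are hypothetical, while the constant values are the ones given in the list.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative decoder (hypothetical names) for the class-level access flags listed above. */
final class ClassAccessFlags {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    /** Lists the names of all flags set in the given bit mask. */
    static List<String> describe(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & ACC_PUBLIC) != 0) names.add("ACC_PUBLIC");
        if ((flags & ACC_FINAL) != 0) names.add("ACC_FINAL");
        if ((flags & ACC_SUPER) != 0) names.add("ACC_SUPER");
        if ((flags & ACC_INTERFACE) != 0) names.add("ACC_INTERFACE");
        if ((flags & ACC_ABSTRACT) != 0) names.add("ACC_ABSTRACT");
        return names;
    }

    /** Checks the rules above: an interface must be abstract, and abstract excludes final. */
    static boolean isConsistent(int flags) {
        boolean isInterface = (flags & ACC_INTERFACE) != 0;
        boolean isAbstract = (flags & ACC_ABSTRACT) != 0;
        boolean isFinal = (flags & ACC_FINAL) != 0;
        if (isInterface && !isAbstract) {
            return false;
        }
        return !(isAbstract && isFinal);
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0021));     // prints [ACC_PUBLIC, ACC_SUPER]
        System.out.println(isConsistent(0x0601)); // public abstract interface: prints true
        System.out.println(isConsistent(0x0410)); // abstract and final: prints false
    }
}
```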
The This Class and Super Class items each hold a valid index into the Constant Pool table or, in the case of Super Class, possibly a value of 0. The index
points to an entry within the Constant Pool that is of the type CONSTANT_Class_info
(an unsigned 1-byte tag followed by an unsigned 2-byte index to the Constant Pool entry
holding the name) and that represents the name of this class or of the super class,
respectively. If the Super Class index is 0,
then this class file must represent the class java.lang.Object, the only class without a superclass.
The Interfaces array contains indices to Constant Pool, where each entry is of the type CONSTANT_Class_info. The Interfaces Count specifies the count of the implemented interfaces.
The Fields array contains items of the type field_info, described later, which completely describe a field. The Fields Count specifies the number of items in this array. The fields represented
are both class and instance variables, but not superclass-inherited fields.
The field_info structure is of the type:
Access Flags: Similar to bit mask access flags defined earlier, but contain other flags to indicate VOLATILE or TRANSIENT fields.
Name Index: An index to the Constant Pool where its name is defined as a UTF8 String.
Descriptor Index: Again, an index to the Constant Pool where the field descriptor is defined as a UTF8 String.
Attributes Count and Attributes: Attributes represent extra information about a Field, like deprecated and constant fields.
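The field descriptor mentioned above is a compact type encoding. The article does not spell out its syntax, so the sketch below relies on the encoding defined in the JVM specification: one letter per primitive type, `L<class name>;` for object types with `/` in place of `.`, and one leading `[` per array dimension. The class and method names are hypothetical.

```java
/** Illustrative decoder (hypothetical names) for field descriptors such as "[Ljava/lang/String;". */
final class FieldDescriptors {

    /** Converts a field descriptor into the type name you would write in Java source. */
    static String toJavaType(String descriptor) {
        int dims = 0;
        while (descriptor.charAt(dims) == '[') { // each leading '[' adds one array dimension
            dims++;
        }
        String element = descriptor.substring(dims);
        String base;
        switch (element.charAt(0)) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'L': // object type: strip the leading 'L' and the trailing ';'
                base = element.substring(1, element.length() - 1).replace('/', '.');
                break;
            default:
                throw new IllegalArgumentException("Not a field descriptor: " + descriptor);
        }
        StringBuilder javaType = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            javaType.append("[]");
        }
        return javaType.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaType("I"));                  // prints int
        System.out.println(toJavaType("Ljava/lang/String;")); // prints java.lang.String
        System.out.println(toJavaType("[[J"));                // prints long[][]
    }
}
```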
Similar to Fields, the Methods array contains items of type method_info, described below, which completely describe a method. The Methods Count specifies the number of items in the Methods array. As with
Fields, no methods inherited from a superclass or superinterfaces are included. If the
method is native or abstract, the JVM instructions are not supplied. The method_info structure is of the type:
Access Flags: Similar to bit mask access flags defined earlier, but contain other flags to indicate SYNCHRONIZED and NATIVE methods.
Name Index: An index to the Constant Pool where the method name is defined as a UTF8 String.
Descriptor Index: An index to the Constant Pool where the method is described as a UTF8 String.
Attributes Count and Attributes: Attributes represent extra information about a method, like exceptions and deprecated attributes.
As with Fields and Methods, the class-level Attributes are a set of attributes that represent extra information about a Class. The only attributes defined for Classes are the SourceFile attribute and the Deprecated attribute.
Since the symbolic references to classes, fields, and methods are encoded as String constants, the Constant Pool, which contains these String constants, is the biggest portion of a class file. This makes it an easy target for APIs like BCEL to analyze and manipulate.
BCEL was formerly known as JavaClass. It was incorporated as an Apache Jakarta project in October 2001. The original JavaClass was written by Markus Dahm. The main site is hosted at jakarta.apache.org, from which you can access binaries and source code.
At the heart of BCEL is the JavaClass. A JavaClass represents a Java Class
file as described above, with all of the elements. There is a one-to-one mapping
between the elements of a Java class as described in the JVM specification and
the JavaClass. BCEL thus allows you to read a normal class file in your program
and treat it like any other object. The properties of this object are the Java
class file elements. Furthermore, a JavaClass, which has been created on the
fly within your program, represents an actual class file. If serialized, you
will be able to run this class file in a JVM, as you would do a normally compiled
source file.
BCEL allows you, at a micro level, to model the instructions contained within the Java class file. This way, you can navigate and manipulate this instruction set programmatically, allowing you to introduce enhancements and improvements in the runtime behavior of your class. However, this is not the only way to introduce such enhancements. Better compilers and source code optimization can also do the same trick. Furthermore, it is easier to manipulate source code than it is to manipulate raw bytes. Having said all this, direct byte code manipulation has the advantage of being faster than any enhancement that you can make via compiler or source-code manipulation. This comes at the price of extra complexity. BCEL alleviates this complexity to a certain degree by allowing you to manipulate class files through ordinary Java objects in your own source code.
Another feature of BCEL is what is called load-time reflection. As opposed to run-time reflection, which is implemented by using the Reflection API built into the Java language, load-time reflection refers to the ability to modify the byte code instruction set at the time it is loaded. This involves writing a custom classloader, which, instead of passing the byte code directly to the JVM, first passes it through your runtime system written using the BCEL API. Your system can then access meta-level objects created at load time and manipulate them. This process can even create these objects without source code present. The result then continues along the normal path: it is passed to the byte code verifier and then executed in the JVM.
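The shape of such a custom classloader can be sketched with the JDK alone. This is a minimal illustration, not BCEL's own loader: the class name `TransformingClassLoader`, the prefix parameter, and the `transform` hook are assumptions made for this example. The hook returns the bytes unchanged; it marks the spot where BCEL-based code would rewrite them.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Illustrative load-time hook (hypothetical class name). Classes whose names
 * start with the given prefix are read as raw bytes, passed through
 * transform(), and only then defined in the JVM.
 */
class TransformingClassLoader extends ClassLoader {
    private final String prefix;

    TransformingClassLoader(ClassLoader parent, String prefix) {
        super(parent);
        this.prefix = prefix;
    }

    /** Where a BCEL-based system would rewrite the byte code; this sketch returns it unchanged. */
    protected byte[] transform(String className, byte[] classBytes) {
        return classBytes;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.startsWith(prefix)) {
            return super.loadClass(name, resolve); // everything else loads the usual way
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                byte[] modified = transform(name, readClassBytes(name));
                loaded = defineClass(name, modified, 0, modified.length);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    /** Reads the unmodified class file through the parent loader's resources. */
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

To try it, construct the loader with the system class loader as parent and a prefix that covers only your own packages, then call `loadClass` with one of your class names. The prefix must not cover `java.` packages, because the JVM refuses to let an application class loader define those.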
The BCEL API is roughly divided into two parts:
Static API
This is the part of the API that deals with mapping the data
structures and binary components described in the JVM specification. You would
use this part if you were analyzing existing classes without access to the
source code.
The main class in this part is called JavaClass, which represents
a Java class and includes all the data structures, constant pool, fields,
methods, and commands contained in a class file. It supports the Visitor
design pattern, which allows developers to write their own visitor code
to traverse and analyze the contents of a class file. The JavaClass
itself derives from AccessFlags class, which is the class that
is extended by all classes that have access flags. This thus applies to
not only JavaClass, but also to the FieldOrMethod class,
the super class for Field and Method as well.
The Constant Pool is represented by the ConstantPool class.
It contains an array of type Constant that represents the
different constant types in the constant pool of a parsed class file. For
example, it may contain a ConstantInteger, which represents an
int constant. You can access the constants by index by calling
the method getConstant(int index). Note that this internal
array may contain null references. This happens after
double or long constants, which,
per the JVM specification, occupy two consecutive entries, so the slot following each of them is unused.
Another interesting class in the API is the Repository class.
This class is used to read existing class files into the system and for
resolving class interdependencies.
Generic API
This part of the API deals with creating or transforming class
files dynamically. It allows you to create a class file from scratch or read
an existing class file and dynamically modify it.
The central class in this API is the ClassGen class. This
class allows you to create a new class file and to add methods, attributes
and fields to it dynamically. You can load an existing class file in by
passing in a JavaClass that represents a file loaded into
memory as described in the Static API. This class also contains methods
to search this class for particular methods and fields, to replace existing
methods and fields, and remove existing methods and fields. You can also directly
use the MethodGen and the FieldGen classes for
generating methods and fields, respectively.
Corresponding to the ClassGen class is the ConstantPoolGen
class. This class allows you to add different types of constants and retrieve
the ConstantPool once you are done adding the constants by calling
getFinalConstantPool(). Constants are added using methods like addString(String str),
addInteger(int n), etc. These methods return the index at which
the constant was added. If you are not done adding constants to the pool
and yet want to access the ConstantPool in the state that it is in, you can
call getConstantPool(). This class also allows you to look up
existing entries in the pool with corresponding lookupString(String
str) and lookupInteger(int n) methods.
The BCEL API contains a set of utility classes that allow you to get started
with the API without worrying too much about the semantics or getting involved
in the complexities. These include Class2HTML, a utility to transform class
files into HTML, JavaWrapper, a utility that acts as a wrapper to modify and
generate classes as they are requested using its own class loader, and BCELifier,
which takes a JavaClass object and generates BCEL Java source code to build
that class.
The following examples start with the utility classes and follow up with simple examples of using the static and dynamic API.
Important: Make sure that before you run these examples, you have set the CLASSPATH to include bcel.jar.
The easiest way to start with BCEL is to use the BCELifier, because
it generates the BCEL source code used to generate the class file itself and
is a very handy way of learning how BCEL works. It generates the code that
you would have to write yourself if you were to write the BCEL code for generating
the class file dynamically.
The following is a simple HelloWorld source file that I will use for this
example.
public class HelloWorld {
    public static void main(String args[]) {
        System.err.println("Hello World through BCEL!");
    }
}
Compile this class and produce the HelloWorld.class file.
Next, run the following command (all on one line) in the directory where you compiled HelloWorld.class:
java org.apache.bcel.util.BCELifier HelloWorld.class > HelloWorldCreator.java
Because the BCELifier class writes its result to standard output, I have
redirected the output to the resulting source file. Note that BCELifier names
the generated class "ClassFileName" + "Creator". Hence, the BCELified
HelloWorld.class is saved as HelloWorldCreator.java.
Compile and run HelloWorldCreator.java. You will see the output on the console as: "Hello World through BCEL!".
Open HelloWorldCreator.java and examine it. You will see that creating such a simple class is quite a complex process, even though BCEL abstracts most of the functionality of the Java class file.
Class2HTML is the utility that traverses a class file and creates five HTML files. These HTML files completely describe the class file by dividing it into constant pool, attributes, byte code, and methods. The fifth file combines all of these into one easy-to-use HTML file.
Run Class2HTML on HelloWorld.class as shown below:
java org.apache.bcel.util.Class2HTML HelloWorld.class
This will create five files in the current directory. Open HelloWorld.html in your browser to see the class contents as illustrated in Figure 1:

Figure 1. HelloWorld.html, as generated by Class2HTML
I have marked the frames into the corresponding class file elements. This is a quick and easy way to map out a class file with the Class2HTML utility.
Using the static API, let's implement a simple class viewer. This is a simple example that exercises the JavaClass class and is similar to how Class2HTML operates.
import org.apache.bcel.Repository;
import org.apache.bcel.classfile.Code;
import org.apache.bcel.classfile.Method;
import org.apache.bcel.classfile.JavaClass;

public class ClassViewer {
    private JavaClass clazz;

    // The throws clause covers BCEL releases in which
    // Repository.lookupClass() throws instead of returning null.
    public ClassViewer(String clazz) throws ClassNotFoundException {
        this.clazz = Repository.lookupClass(clazz);
    }

    public static void main(String args[]) throws ClassNotFoundException {
        if (args.length != 1)
            throw new IllegalArgumentException(
                "One and only one class at a time!");
        ClassViewer viewer = new ClassViewer(args[0]);
        viewer.start();
    }

    private void start() {
        if (this.clazz != null) {
            // first print the structure
            // of the class file
            System.err.println(clazz);
            // next print the methods
            Method[] methods = clazz.getMethods();
            for (int i = 0; i < methods.length; i++) {
                System.err.println(methods[i]);
                // now print the actual
                // byte code for each method
                // (null for abstract and native methods)
                Code code = methods[i].getCode();
                if (code != null)
                    System.err.println(code);
            }
        } else
            throw new RuntimeException(
                "Class file is null!");
    }
}
The first thing that this example does is to look up the class that is to
be mapped by requesting that the Repository load it. This is done
by the Repository.lookupClass(String classname) method. The repository loads this class as a JavaClass
that contains all of the information in the Java Class file format. From then on, it is a simple matter of printing the class file structure using the toString conversion on the JavaClass file and the methods and code.
We have already seen the code required to create a dynamic class when we visited the BCELifier example. HelloWorldCreator.java creates a dynamic class on the fly. Let us see how we can modify this class dynamically by adding a new method.
import org.apache.bcel.*;
import org.apache.bcel.generic.*;
import org.apache.bcel.classfile.*;

public class ClassModifier implements Constants {
    private JavaClass clazz;
    private ClassGen classGen;
    private ConstantPoolGen cp;

    // The throws clause covers BCEL releases in which
    // Repository.lookupClass() throws instead of returning null.
    public ClassModifier(String clazz) throws ClassNotFoundException {
        this.clazz = Repository.lookupClass(clazz);
        this.classGen = new ClassGen(this.clazz);
        this.cp = this.classGen.getConstantPool();
    }

    public static void main(String args[]) throws ClassNotFoundException {
        if (args.length != 1)
            throw new IllegalArgumentException(
                "One and only one class at a time!");
        ClassModifier modifier = new ClassModifier(args[0]);
        modifier.start();
    }

    private void start() {
        if (this.clazz != null) {
            // print the methods BEFORE adding the new one
            Method[] methods =
                classGen.getJavaClass().getMethods();
            System.err.println(
                "++++ Before adding new method ++++");
            for (int i = 0; i < methods.length; i++) {
                System.err.println(methods[i]);
            }
            // an empty instruction list: the new method has no body yet
            InstructionList il = new InstructionList();
            classGen.addMethod(
                new MethodGen(ACC_PUBLIC | ACC_STATIC,
                    Type.VOID,
                    Type.NO_ARGS,
                    new String[] { },
                    "newMethod",
                    clazz.getClassName(),
                    il,
                    cp).getMethod());
            // print the methods AFTER adding the new one
            methods = classGen.getJavaClass().getMethods();
            System.err.println(
                "\n++++ After adding new method ++++");
            for (int i = 0; i < methods.length; i++) {
                System.err.println(methods[i]);
            }
        } else
            throw new RuntimeException("Class file is null!");
    }
}
This class loads a class file and represents it in the memory using
JavaClass. This part is similar to what I did in the static ClassViewer example. Having
created this representation in memory of the input class, the code above
creates an instance of the ClassGen class, using this representation as
the base:
this.classGen = new ClassGen(this.clazz);
this.cp = this.classGen.getConstantPool();
A new method is then added to this instance of classGen by using the
MethodGen constructor. As you can see, this new method has the access flags of public
and static and is called newMethod. The rest of the code simply
prints a list of methods in the class before and after the method is added.
Run this code using HelloWorld.class as the input class file. You will see the following output:
++++ Before adding new method ++++
public void <init>()
public static void main(String[] arg0)
++++ After adding new method ++++
public void <init>()
public static void main(String[] arg0)
public static void newMethod()
As you can see, the newMethod is added to our class file dynamically without
having to touch the source code.
This has been a superficial treatment of the dynamic part of the API. The idea is to improve performance and add enhancements by being able to dynamically modify class files and not just add trivial methods.
Vikram Goyal is the author of Pro Java ME MMAPI.
Copyright © 2007 O'Reilly Media, Inc.