ONJava.com -- The Independent Source for Enterprise Java

Introduction to the ASM 2.0 Bytecode Framework

Here are a few things to notice:

  • All descriptors, string literals, and any other constants used in class structures are stored in a Constant Pool at the beginning of the class file and are referenced from all other structures by index.
  • Each class must contain headers (including class name, super class, interfaces, etc.) and the Constant Pool. Other elements, such as the list of fields, list of methods, and all attributes, are optional and may or may not be present.
  • Each Field section includes field info such as name, access flags (public, private, etc.), descriptor and field Attributes.
  • Each Method section contains similar header info, plus the maximum stack depth and the maximum number of local variables, which are used to verify bytecode. For non-abstract and non-native methods, there is a table of method instructions (the method body), an exception table, and code attributes. Besides these, there can be other method attributes.
  • Each Attribute for Class, Field, Method, and Method Code has its own name, which is also documented in the Class File format section of the JVM specification. These attributes represent various pieces of information about bytecode, such as the source file name, inner classes, signature (used to store generics info), line number and local variable tables, and annotations. The JVM specification also allows the definition of custom attributes that will be ignored by the standard VM, but may contain additional information. Note that Java 5 annotations practically made these custom attributes obsolete, because annotation semantics allow you to express pretty much anything.
  • The Method Code table contains a list of instructions for the Java Virtual Machine. Some of these instructions (as well as the exception, line number, and local variable tables) use offsets within the code table, and all of these offsets may need to be adjusted when instructions are inserted into or removed from the method code table.
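To make the index-based referencing in the list above more concrete, here is a minimal sketch that uses only the JDK, not ASM. It reads the fixed header, walks the constant pool (the JVM specification's name for the table of constants at the start of a class file), and follows the this_class index to the class name. The names ConstantPoolPeek and readClassName are invented for this sketch, the entry sizes are taken from the class file format chapter of the JVM specification, and the main method assumes Java 9 or later; treat it as a reading aid rather than a complete parser.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ConstantPoolPeek {

    /** Walks the header and constant pool, then resolves this_class to its internal name. */
    static String readClassName(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {           // u4 magic
                throw new IOException("not a class file");
            }
            in.readUnsignedShort();                     // u2 minor_version
            in.readUnsignedShort();                     // u2 major_version
            int count = in.readUnsignedShort();         // u2 constant_pool_count
            String[] utf8 = new String[count];          // Utf8 entries, by pool index
            int[] classNameIndex = new int[count];      // Class entries -> Utf8 index
            for (int i = 1; i < count; i++) {           // index 0 is never used
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1:                             // Utf8: u2 length + bytes
                        utf8[i] = in.readUTF();
                        break;
                    case 7:                             // Class: u2 name_index
                        classNameIndex[i] = in.readUnsignedShort();
                        break;
                    case 8: case 16: case 19: case 20:  // entries with a 2-byte payload
                        in.readFully(new byte[2]);
                        break;
                    case 15:                            // MethodHandle: 3-byte payload
                        in.readFully(new byte[3]);
                        break;
                    case 3: case 4: case 9: case 10:
                    case 11: case 12: case 17: case 18: // entries with a 4-byte payload
                        in.readFully(new byte[4]);
                        break;
                    case 5: case 6:                     // Long/Double: 8 bytes, two slots
                        in.readFully(new byte[8]);
                        i++;
                        break;
                    default:
                        throw new IOException("unknown constant tag " + tag);
                }
            }
            in.readUnsignedShort();                     // u2 access_flags
            int thisClass = in.readUnsignedShort();     // index of a Class entry
            return utf8[classNameIndex[thisClass]];     // Class entry -> Utf8 entry
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the JDK exposes Object.class as a resource (readAllBytes needs Java 9+).
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            System.out.println(readClassName(in.readAllBytes()));
        }
    }
}
```

Note how the class name is never stored in the header itself: this_class points to a Class entry, which in turn points to a Utf8 entry. This double indirection is the kind of bookkeeping that ASM hides behind its API.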

As you can see, bytecode tweaking isn't easy. However, the ASM framework reduces the complexity of the underlying structures and provides a simplified API that still allows for access to all bytecode information and enables complex transformations.

Event-Based Bytecode Processing

The Core package uses a push approach (similar to the "Visitor" design pattern, which is also used in the SAX API for XML processing) to walk through complex bytecode structures. ASM defines several interfaces, such as ClassVisitor (section [1] in the class file format diagram above), FieldVisitor (section [2]), MethodVisitor (section [3]), and AnnotationVisitor. AnnotationVisitor is a special interface that allows you to express hierarchical annotation structures. The next few paragraphs will show how these interfaces interact with each other and how they can be used together to implement bytecode transformations and/or capture information from the bytecode.

The Core package can be logically divided into two major parts:

  • Bytecode producers, such as a ClassReader or a custom class that can fire the proper sequence of calls to the methods of the above visitor classes.
  • Bytecode consumers, such as writers (ClassWriter, FieldWriter, MethodWriter, and AnnotationWriter), adapters (ClassAdapter and MethodAdapter), or any other classes implementing the above visitor interfaces.

Figure 2 shows the sequence diagram for the common producer-consumer interaction.

Figure 2. Sequence diagram for producer-consumer interaction

In this interaction, a client application creates a ClassReader and calls its accept() method, passing a concrete ClassVisitor instance as a parameter. The ClassReader then parses the class and fires "visit" events to the ClassVisitor for each bytecode fragment. For repeated contexts, such as fields, methods, or annotations, a ClassVisitor may create child visitors derived from the corresponding interface (FieldVisitor, MethodVisitor, or AnnotationVisitor) and return them to the producer. When a producer receives a null value for FieldVisitor or MethodVisitor, it skips that fragment of the class (e.g., a ClassReader wouldn't even parse the corresponding bytecode section in such a case, which leads to a sort of "lazy loading" feature driven by the visitors). Otherwise, the corresponding subcontext events are delegated to the child visitor instance. At the end of each subcontext, the producer calls the visitEnd() method and then moves on to the next section (e.g., the next field, method, etc.).
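The protocol described above can be illustrated with a toy model that runs without ASM on the classpath. Everything in it is hypothetical: MiniClassVisitor and MiniMethodVisitor are cut-down stand-ins invented for this sketch, not ASM's real interfaces, and the "class" being visited is a hard-coded array of strings. It only aims to show the shape of the interaction: a producer firing events, an adapter returning null to skip a method, and visitEnd() closing each subcontext.

```java
import java.util.ArrayList;
import java.util.List;

public class PushModelSketch {

    // Hypothetical, reduced stand-ins for ASM's visitor interfaces.
    interface MiniClassVisitor {
        void visit(String className);
        MiniMethodVisitor visitMethod(String methodName); // null means "skip this method"
        void visitEnd();
    }

    interface MiniMethodVisitor {
        void visitInsn(String instruction);
        void visitEnd();
    }

    /** Producer: plays the role of a ClassReader over a hard-coded "class". */
    static void accept(MiniClassVisitor cv) {
        String[][] methods = { {"run", "ALOAD_0", "RETURN"}, {"debug", "NOP", "RETURN"} };
        cv.visit("Demo");
        for (String[] method : methods) {
            MiniMethodVisitor mv = cv.visitMethod(method[0]);
            if (mv == null) {
                continue;                      // consumer declined, so the body is never walked
            }
            for (int i = 1; i < method.length; i++) {
                mv.visitInsn(method[i]);       // subcontext events go to the child visitor
            }
            mv.visitEnd();                     // end of this method's subcontext
        }
        cv.visitEnd();
    }

    /** Consumer: records every event it receives. */
    static final class Recorder implements MiniClassVisitor {
        final List<String> events = new ArrayList<>();

        public void visit(String className) { events.add("class " + className); }

        public MiniMethodVisitor visitMethod(String methodName) {
            events.add("method " + methodName);
            return new MiniMethodVisitor() {
                public void visitInsn(String instruction) { events.add("insn " + instruction); }
                public void visitEnd() { events.add("end method"); }
            };
        }

        public void visitEnd() { events.add("end class"); }
    }

    /** Adapter: forwards events to the next visitor but drops one method by name. */
    static final class SkipMethodAdapter implements MiniClassVisitor {
        private final MiniClassVisitor next;
        private final String skipped;

        SkipMethodAdapter(MiniClassVisitor next, String skipped) {
            this.next = next;
            this.skipped = skipped;
        }

        public void visit(String className) { next.visit(className); }

        public MiniMethodVisitor visitMethod(String methodName) {
            return methodName.equals(skipped) ? null : next.visitMethod(methodName);
        }

        public void visitEnd() { next.visitEnd(); }
    }

    /** Wires producer -> adapter -> recorder and returns the recorded events. */
    static List<String> run() {
        Recorder recorder = new Recorder();
        accept(new SkipMethodAdapter(recorder, "debug"));
        return recorder.events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Because the adapter returns null for the "debug" method, the recorder never sees that method or its instructions. The real ASM interfaces carry many more parameters and events, but chaining adapters in front of a final consumer follows the same idea.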
