ONJava.com -- The Independent Source for Enterprise Java

Code-Generation Techniques for Java

Code-Driven Approach: XDoclet

The most popular code generator for Java is XDoclet, and for good reason. It's easy and pragmatic, and it fits a need. XDoclet builds database-persistence beans to match the requirements specified in special JavaDoc comments within the Java entity bean code. We call this the "code driven approach" because it uses source code as the design input source.

Given a single entity bean with some markup, XDoclet will create the session beans, interfaces, and data access object required to complete the functional set. It's a pretty sweet deal for someone looking to get some work done quickly without having to go to the effort required by other code generation solutions. Figure 3 illustrates how XDoclet relates to the application stack:

Figure 3. XDoclet and the application stack

XDoclet has grown beyond just bean generation. It now acts as a generic code-generation platform for solutions that use JavaDoc markup as a source for design information. There are XDoclet modules for all types of outputs and you can easily create your own.
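To make the code-driven idea concrete, here is a minimal sketch of a generator that treats JavaDoc markup as its design input. It is not XDoclet and does not use XDoclet's API; the @gen.interface and @gen.expose tags, the class name, and the regular expressions are all invented for illustration, and a real tool would parse the source properly rather than pattern-match it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy code-driven generator: the design input is the JavaDoc markup in the source.
// The @gen.* tag names are invented for this sketch; they are not XDoclet tags.
final class TagDrivenGenerator {

    // Class-level tag that names the interface to generate.
    private static final Pattern INTERFACE_TAG =
            Pattern.compile("@gen\\.interface\\s+name=\"(\\w+)\"");

    // A @gen.expose tag, the end of its comment, then the signature up to ')'.
    private static final Pattern EXPOSED_METHOD =
            Pattern.compile("@gen\\.expose[^/]*\\*/\\s*public\\s+([^{;]+\\))");

    static String generateInterface(String source) {
        Matcher name = INTERFACE_TAG.matcher(source);
        if (!name.find()) {
            throw new IllegalArgumentException("no @gen.interface tag found");
        }
        // Collect the signature of every method marked for exposure.
        List<String> signatures = new ArrayList<>();
        Matcher method = EXPOSED_METHOD.matcher(source);
        while (method.find()) {
            signatures.add(method.group(1).trim());
        }
        StringBuilder out = new StringBuilder();
        out.append("public interface ").append(name.group(1)).append(" {\n");
        for (String signature : signatures) {
            out.append("    ").append(signature).append(";\n");
        }
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "/** @gen.interface name=\"BookService\" */",
                "public class BookServiceImpl {",
                "    /** @gen.expose */",
                "    public String findTitle(int id) { return \"\"; }",
                "    public void internalOnly() { }",
                "}");
        System.out.print(generateInterface(source));
    }
}
```

Run against the sample source, the sketch emits a BookService interface that declares findTitle and omits internalOnly, because only the tagged method is part of the design.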

XDoclet's only drawback is its level of abstraction. Because the design is described in JavaDoc tags embedded in the code, code and design are bound tightly together with implementation specifics. Given this binding, it would be difficult to use XDoclet markup to generate complete code in a different language (e.g., C#).

VDoclet is an XDoclet clone that uses Velocity as the template language.

Model-Driven Approach: Custom

The alternative to the code-driven approach is to build code from an abstract model of the design. This model-driven approach comes in two flavors: MDA and custom. We will start with the custom approach and then get into MDA.

Using tools like XSLT, Velocity, and Jostraca, we can build textual output from an input specification. We can use these tools to build code by specifying a model of the code as input, using the template to specify the code.

  • XSLT: The design is specified as XML, and XSLT templates are used to create any number of output files. Generally, there is one system entity (e.g., class or database table) per XML input file.
  • Velocity and Jostraca: The templates read the design specification directly and then output code to match that specification.
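The XSLT variant can be sketched with nothing but the XSLT support that ships with the JDK (javax.xml.transform). The entity and field element names, the Book example, and the template itself are assumptions made for this illustration, not a format defined by any of the tools above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltModelGenerator {

    // Hypothetical design model: one entity per XML document.
    static final String MODEL = String.join("\n",
            "<entity name=\"Book\">",
            "  <field name=\"title\" type=\"String\"/>",
            "  <field name=\"pages\" type=\"int\"/>",
            "</entity>");

    // Template that turns the model into a Java class with one field per <field>.
    static final String TEMPLATE = String.join("\n",
            "<xsl:stylesheet version=\"1.0\"",
            "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">",
            "  <xsl:output method=\"text\"/>",
            "  <xsl:template match=\"/entity\">",
            "    <xsl:text>public class </xsl:text>",
            "    <xsl:value-of select=\"@name\"/>",
            "    <xsl:text> {&#10;</xsl:text>",
            "    <xsl:for-each select=\"field\">",
            "      <xsl:text>    private </xsl:text>",
            "      <xsl:value-of select=\"@type\"/>",
            "      <xsl:text> </xsl:text>",
            "      <xsl:value-of select=\"@name\"/>",
            "      <xsl:text>;&#10;</xsl:text>",
            "    </xsl:for-each>",
            "    <xsl:text>}&#10;</xsl:text>",
            "  </xsl:template>",
            "</xsl:stylesheet>");

    // Applies the stylesheet to the model and returns the generated source text.
    static String generate(String modelXml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(modelXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("generation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(generate(MODEL, TEMPLATE));
    }
}
```

Because the model says nothing about Java, a second stylesheet could be pointed at the same XML to produce a table definition or a class in another language.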

The advantage over the code-driven approach is that while today these templates build EJBs, they could easily build JDO classes tomorrow, or C# the day after that. Keeping the model abstract makes portability a reality.

One downside to this approach is that each template is completely self-contained. There is no central code generator that is responsible for the interpretation of the design. This means that one template could interpret a date as just a date stamp, while another could interpret it as a date and time stamp. This is akin to the problems experienced with two-tier application servers where the business logic is not properly factored away from the display.
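One way a custom generator might soften this problem is to pull the interpretation of model types into a single shared class that every template consults. The sketch below assumes a hypothetical ModelTypes helper and an invented set of abstract types; the two small methods stand in for templates.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical shared layer: every template asks this class how to interpret
// an abstract model type instead of deciding on its own.
final class ModelTypes {

    // Abstract types the design model may use (invented for this sketch).
    enum ModelType { TEXT, INTEGER, DATE, DATE_TIME }

    private static final Map<ModelType, String> JAVA_TYPES = new EnumMap<>(ModelType.class);
    private static final Map<ModelType, String> SQL_TYPES = new EnumMap<>(ModelType.class);

    static {
        register(ModelType.TEXT, "String", "VARCHAR(255)");
        register(ModelType.INTEGER, "int", "INTEGER");
        register(ModelType.DATE, "java.sql.Date", "DATE");
        register(ModelType.DATE_TIME, "java.sql.Timestamp", "TIMESTAMP");
    }

    // Records the Java and SQL interpretation of one model type side by side.
    private static void register(ModelType type, String javaType, String sqlType) {
        JAVA_TYPES.put(type, javaType);
        SQL_TYPES.put(type, sqlType);
    }

    // Stand-in for a template that emits a Java field.
    static String javaField(String name, ModelType type) {
        return "private " + JAVA_TYPES.get(type) + " " + name + ";";
    }

    // Stand-in for a template that emits a SQL column.
    static String sqlColumn(String name, ModelType type) {
        return name + " " + SQL_TYPES.get(type);
    }

    public static void main(String[] args) {
        System.out.println(javaField("published", ModelType.DATE));
        System.out.println(sqlColumn("published", ModelType.DATE));
    }
}
```

With the mapping in one place, the Java template and the SQL template cannot disagree about whether a date carries a time component.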

Another downside is that you are building a custom solution that will require team education. However, given the developing and fluid state of code-generation solutions today, even if you go with an existing solution, you will not often find engineers with extensive generation experience.

Model-Driven Approach: MDA

Model-Driven Architecture (MDA) is the Object Management Group (OMG) three-letter-acronym (TLA) initiative for code generation. I'm only slightly kidding; there are several TLA standards within MDA. The central idea is simple: turn a model in UML (Unified Modeling Language) into code (no, that's not a four-letter acronym).

Figure 4 shows the flow of an MDA generator:

Figure 4. The flow of an MDA generator

We start with the Platform Independent Model (PIM), created in a UML editing tool, like Poseidon for UML from Gentleware. (The PIM can be in an exported XML format called XMI.) A Platform-Specific Model (PSM) is then created using a transformation. Templates are applied to the PSM to create the output code.

It's easier to understand the difference between a PIM and a PSM in context. The PIM specifies the application business logic, for example, a table named book with these five fields. The PSM is a model of the implementation on a particular platform. In the EJB world, this is the set of UML models of the entity and session beans required to implement the book table.

The separation between the PIM and the PSM creates a well-factored generator that properly separates the design from the implementation specifics.
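The PIM-to-PSM step can be sketched in plain Java. This is only an illustration of the idea, not how any MDA tool represents its models: the Table and EjbClass types, the five field names, and the naming rules (such as the ManagerBean suffix) are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PimToPsm {

    // PIM: what the business needs, with no platform detail.
    static final class Table {
        final String name;
        final List<String> fields;

        Table(String name, List<String> fields) {
            this.name = name;
            this.fields = fields;
        }
    }

    // PSM element: one class the EJB platform needs in order to implement the table.
    static final class EjbClass {
        final String role;
        final String className;
        final List<String> persistentFields;

        EjbClass(String role, String className, List<String> persistentFields) {
            this.role = role;
            this.className = className;
            this.persistentFields = persistentFields;
        }

        @Override
        public String toString() {
            return role + " " + className + " " + persistentFields;
        }
    }

    // The transformation: the naming rules here are assumptions of this sketch.
    static List<EjbClass> toEjbModel(Table table) {
        String base = Character.toUpperCase(table.name.charAt(0)) + table.name.substring(1);
        List<String> none = Collections.emptyList();
        List<EjbClass> psm = new ArrayList<>();
        psm.add(new EjbClass("entity bean", base + "Bean", table.fields));
        psm.add(new EjbClass("home interface", base + "Home", none));
        psm.add(new EjbClass("remote interface", base, none));
        psm.add(new EjbClass("session bean", base + "ManagerBean", none));
        return psm;
    }

    public static void main(String[] args) {
        // Hypothetical PIM: a book table with five fields.
        Table book = new Table("book",
                Arrays.asList("id", "title", "author", "isbn", "published"));
        for (EjbClass ejbClass : toEjbModel(book)) {
            System.out.println(ejbClass);
        }
    }
}
```

Templates would then be applied to each EjbClass to produce source files. Targeting another platform means writing a different transformation over the same Table, which is the factoring the PIM/PSM split is meant to buy.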

Some of the more popular MDA solutions are:

  • OptimalJ, an MDA generation tool from Compuware. They have just released a new version and started a major marketing push, including a study done by the Middleware Company where two teams developed the same application, one with MDA and one without. The MDA group finished 30% faster, even though they had to learn OptimalJ first. Impressive stuff.
  • AndroMDA, an open source MDA generator that reads XMI files and uses cartridges to build the various types of Java code. Changing persistence mechanisms, for example, is merely a matter of changing a cartridge.
  • MDE, a pragmatic MDA tool that goes directly from the PIM to code using a series of customizable generation components.
  • ArcStyler, an MDA generator from Germany that can build both Java and .NET code from the same design. The architect of ArcStyler is also the author of Convergent Architecture (Richard Hubert, Wiley, 2001), a book that integrates MDA into an entire design philosophy.

The value of MDA is that it is a set of standards that we can agree on and then work to improve. At the moment, though, MDA has some issues. First, the standards are more conceptual than prescriptive and are subject to interpretation. Second, UML is not complete enough to express business logic or to produce an efficient SQL schema, so the generator must be given hints at the PIM level. At the coding level, the current crop of MDA generators uses "safe zones" in the code: specially marked sections that are preserved between generation cycles. In this way, you can extend the generated code directly to implement business logic that UML cannot specify.
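The mechanics of preserving hand-written code between generation cycles can be sketched as a small merge step. The BEGIN SAFE and END SAFE marker syntax and the SafeZoneMerger class are invented for this example; each MDA tool defines its own markers and merge rules.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SafeZoneMerger {

    // Hypothetical marker syntax: "// BEGIN SAFE name" ... "// END SAFE name".
    // Assumes \n line endings and markers that sit on lines of their own.
    private static final Pattern ZONE = Pattern.compile(
            "(// BEGIN SAFE (\\w+)\\n)(.*?)(// END SAFE \\2\\n)", Pattern.DOTALL);

    // Copies the body of each named zone in the previous file into the
    // zone of the same name in the freshly generated file.
    static String merge(String previous, String regenerated) {
        Map<String, String> kept = new HashMap<>();
        Matcher old = ZONE.matcher(previous);
        while (old.find()) {
            kept.put(old.group(2), old.group(3));
        }

        Matcher fresh = ZONE.matcher(regenerated);
        StringBuffer out = new StringBuffer();
        while (fresh.find()) {
            // Zones that are new in this cycle keep their generated body.
            String body = kept.getOrDefault(fresh.group(2), fresh.group(3));
            String zone = fresh.group(1) + body + fresh.group(4);
            fresh.appendReplacement(out, Matcher.quoteReplacement(zone));
        }
        fresh.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String previous = "class Book {\n"
                + "    // BEGIN SAFE extra\n"
                + "    int custom() { return 1; }\n"
                + "    // END SAFE extra\n"
                + "}\n";
        String regenerated = "class Book {\n"
                + "    int pages;\n"
                + "    // BEGIN SAFE extra\n"
                + "    // END SAFE extra\n"
                + "}\n";
        System.out.print(merge(previous, regenerated));
    }
}
```

In the sample run, the regenerated class gains the new pages field while the hand-written custom() method survives inside its zone. Anything edited outside a zone would be overwritten, which is the trade-off this technique accepts.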


Code generation is another link in the evolutionary chain of increasing abstraction. With it, you can produce higher-quality code more quickly, and thus respond to changing requirements with ease. This is the true power of modern code generation.

Code Generation Resources

Outside of the direct links to the various generators, there are some more general online resources for code generation:

as well as some good books on generation:

  • My book, Code Generation in Action (Jack Herrington, Manning, 2003) covers a wide variety of code generation approaches at a practical level.
  • XDoclet in Action (Craig Walls and Norman Richards, Manning, 2003) covers every aspect of XDoclet in depth.
  • The Pragmatic Programmer (Dave Thomas and Andrew Hunt, Addison-Wesley, 1999) has several sections on active code generation.
  • MDA Explained (Anneke Kleppe, et al, Addison-Wesley, 2003) covers the MDA standards in a clear and succinct manner with the aid of a practical example.
  • Model Driven Architecture (David S. Frankel, Wiley, 2003) covers the MDA standards and provides an overview of MDA and code generation throughout the development lifecycle.

Jack Herrington is an engineer, author, and presenter who lives and works in the Bay Area. His mission is to expose his fellow engineers to new technologies. That covers a broad spectrum, from demonstrating programs that write other programs in the book Code Generation in Action, to providing techniques for building customer-centered web sites in PHP Hacks, all the way to writing a how-to on audio blogging called Podcasting Hacks.
