ONJava.com -- The Independent Source for Enterprise Java
oreilly.comSafari Books Online.Conferences.

advertisement

AddThis Social Bookmark Button

XSLT Processing with Java
Pages: 1, 2, 3, 4, 5, 6, 7, 8, 9

SAXON, Xalan, or TrAX?

As the previous examples show, SAXON and Xalan have many similarities. While similarities make learning the various APIs easy, they do not result in portable code. If you write code directly against either of these interfaces, you lock yourself into that particular implementation unless you want to rewrite your application.



The other option is to write a facade around both processors, presenting a consistent interface that works with either processor behind the scenes. The only problem with this approach is that as new processors are introduced, you must update the implementation of your facade. It would be very difficult for one individual or organization to keep up with the rapidly changing world of XSLT processors.

But if the facade was an open standard and supported by a large enough user base, the people and organizations that write the XSLT processors would feel pressure to adhere to the common API, rather than the other way around. TrAX was initiated in early 2000 as an effort to define a consistent API to any XSLT processor. Since some of the key people behind TrAX were also responsible for implementing some of the major XSLT processors, it was quickly accepted that TrAX would be a de facto standard, much in the way that SAX is.

Introduction to JAXP 1.1

TrAX was a great idea, and the original work and concepts behind it were absorbed into JAXP Version 1.1. If you search for TrAX on the Web and get the feeling that the effort is waning, this is only because focus has shifted from TrAX to JAXP. Although the name has changed, the concept has not: JAXP provides a standard Java interface to many XSLT processors, allowing you to choose your favorite underlying implementation while retaining portability.

First released in March 2000, Sun's JAXP 1.0 utilized XML 1.0, XML Namespaces 1.0, SAX 1.0, and DOM Level 1. JAXP is a standard extension to Java, meaning that Sun provides a specification through its Java Community Process (JCP) as well as a reference implementation. JAXP 1.1 follows the same basic design philosophies of JAXP 1.0, adding support for DOM Level 2, SAX 2, and XSLT 1.0. A tool like JAXP is necessary because the XSLT specification defines only a transformation language; it says nothing about how to write a Java XSLT processor. Although they all perform the same basic tasks, every processor uses a different API and has its own set of programming conventions.

JAXP is not an XML parser, nor is it an XSLT processor. Instead, it provides a common Java interface that masks differences between various implementations of the supported standards. When using JAXP, your code can avoid dependencies on specific vendor tools, allowing flexibility to upgrade to newer tools when they become available.

The key to JAXP's design is the concept of plugability layers. These layers provide consistent Java interfaces to the underlying SAX, DOM, and XSLT implementations. In order to utilize one of these APIs, you must obtain a factory class without hardcoding Xalan or SAXON code into your application. This is accomplished via a lookup mechanism that relies on Java system properties. Since three separate plugability layers are used, you can use a DOM parser from one vendor, a SAX parser from another vendor, and yet another XSLT processor from someone else. In reality, you will probably need to use a DOM parser compatible with your XSLT processor if you try to transform the DOM tree directly. Figure 5-1 illustrates the high-level architecture of JAXP 1.1.

Diagram.
Figure 5-1. JAXP 1.1 architecture

As shown, application code does not deal directly with specific parser or processor implementations, such as SAXON or Xalan. Instead, you write code against abstract classes that JAXP provides. This level of indirection allows you to pick and choose among different implementations without even recompiling your application.

The main drawback to an API such as JAXP is the "least common denominator" effect, which is all too familiar to AWT programmers. In order to maximize portability, JAXP mostly provides functionality that all XSLT processors support. This means, for instance, that Xalan's custom XPath APIs are not included in JAXP. In order to use value-added features of a particular processor, you must revert to nonportable code, negating the benefits of a plugability layer. Fortunately, most common tasks are supported by JAXP, so reverting to implementation-specific code is the exception, not the rule.

Although the JAXP specification does not define an XML parser or XSLT processor, reference implementations do include these tools. These reference implementations are open source Apache XML tools, (Crimson and Xalan) so complete source code is available.

JAXP 1.1 Implementation

You guessed it...we will now reimplement the simple example using Sun's JAXP 1.1. Behind the scenes, this could use any JAXP 1.1-compliant XSLT processor; this code was developed and tested using Apache's Xalan 2 processor. Example 5-3 contains the complete source code.


Example 5-3: SimpleJaxp.java

package chap5;
 
import java.io.*;
 
/**
* A simple demo of JAXP 1.1
*/
public class SimpleJaxp {
 
  /**
   * Accept two command line arguments: the name of
   * an XML file, and the name of an XSLT stylesheet.
   * The result of the transformation
   * is written to stdout.
   */
  public static void main(String[] args)
      throws javax.xml.transform.TransformerException {
    if (args.length != 2) {
      System.err.println("Usage:");
      System.err.println(" java " + SimpleJaxp.class.getName( )
          + " xmlFileName xsltFileName");
      System.exit(1);
    }
 
    File xmlFile = new File(args[0]);
    File xsltFile = new File(args[1]);
 
    javax.xml.transform.Source xmlSource =
        new javax.xml.transform.stream.StreamSource(xmlFile);
    javax.xml.transform.Source xsltSource =
        new javax.xml.transform.stream.StreamSource(xsltFile);
    javax.xml.transform.Result result =
        new javax.xml.transform.stream.StreamResult(System.out);
 
    // create an instance of TransformerFactory
    javax.xml.transform.TransformerFactory transFact =
        javax.xml.transform.TransformerFactory.newInstance( );
 
    javax.xml.transform.Transformer trans =
        transFact.newTransformer(xsltSource);
 
    trans.transform(xmlSource, result);
  }
}


As in the earlier examples, explicit package names are used in the code to point out which classes are parts of JAXP. In future examples, import statements will be favored because they result in less typing and more readable code. Our new program begins by declaring that it may throw TransformerException:

public static void main(String[] args)
    throws javax.xml.transform.TransformerException {

This is a general-purpose exception representing anything that might go wrong during the transformation process. In other processors, SAX-specific exceptions are typically propagated to the caller. In JAXP, TransformerException can be wrapped around any type of Exception object that various XSLT processors may throw.

Next, the command-line arguments are converted into File objects. In the SAXON and Xalan examples, we created a system ID for each of these files. Since JAXP can read directly from a File object, the extra conversion to a URI is not needed:

File xmlFile = new File(args[0]);
File xsltFile = new File(args[1]);
 
javax.xml.transform.Source xmlSource =
  new javax.xml.transform.stream.StreamSource(xmlFile);
javax.xml.transform.Source xsltSource =
  new javax.xml.transform.stream.StreamSource(xsltFile);

The Source interface is used to read both the XML file and the XSLT file. Unlike the SAX InputSource class or Xalan's XSLTInputSource class, Source is an interface that can have many implementations. In this simple example we are using StreamSource, which has the ability to read from a File object, an InputStream, a Reader, or a system ID. Later we will examine additional Source implementations that use SAX and DOM as input. Just like Source, Result is an interface that can have several implementations. In this example, a StreamResult sends the output of the transformations to System.out:

javax.xml.transform.Result result =
  new javax.xml.transform.stream.StreamResult(System.out);

Next, an instance of TransformerFactory is created:

javax.xml.transform.TransformerFactory transFact = javax.xml.transform.TransformerFactory.newInstance( );

The TransformerFactory is responsible for creating Transformer and Template objects. In our simple example, we create a Transformer object:

javax.xml.transform.Transformer trans = transFact.newTransformer(xsltSource);

Transformer objects are not thread-safe, although they can be used multiple times. For a simple example like this, we will not encounter any problems. In a threaded servlet environment, however, multiple users cannot concurrently access the same Transformer instance. JAXP also provides a Templates interface, which represents a stylesheet that can be accessed by many concurrent threads.

The transformer instance is then used to perform the actual transformation:

trans.transform(xmlSource, result);

This applies the XSLT stylesheet to the XML data, sending the result to System.out.

XSLT Plugability Layer

JAXP 1.1 defines a specific lookup procedure to locate an appropriate XSLT processor. This must be accomplished without hardcoding vendor-specific code into applications, so Java system properties and JAR file service providers are used. Within your code, first locate an instance of the TransformerFactory class as follows:

javax.xml.transform.TransformerFactory transFact = javax.xml.transform.TransformerFactory.newInstance( );

Since TransformerFactory is abstract, its newInstance( ) factory method is used to instantiate an instance of a specific subclass. The algorithm for locating this subclass begins by looking at the javax.xml.transform.TransformerFactory system property. Let us suppose that com.foobar.AcmeTransformer is a new XSLT processor compliant with JAXP 1.1. To utilize this processor instead of JAXP's default processor, you can specify the system property on the command line when you start your Java application:

java -Djavax.xml.transform.TransformerFactory=com.foobar.AcmeTransformer MyApp

Provided that JAXP is able to instantiate an instance of AcmeTransformer, this is the XSLT processor that will be used. Of course, AcmeTransformer must be a subclass of TransformerFactory for this to work, so it is up to vendors to offer support for JAXP.

If the system property is not specified, JAXP next looks for a property file named lib/jaxp.properties in the JRE directory. A property file consists of name=value pairs, and JAXP looks for a line like this:

javax.xml.transform.TransformerFactory=com.foobar.AcmeTransformer

You can obtain the location of the JRE with the following code:

String javaHomeDir = System.getProperty("java.home");

TIP: Some popular development tools change the value of the java.home when they are installed, which could prevent JAXP from locating jaxp.properties. JBuilder, for instance, installs its own version of Java 2 that it uses by default.

The advantage of creating jaxp.properties in this directory is that you can use your preferred processor for all of your applications that use JAXP without having to specify the system property on the command line. You can still override this file with the -D command-line syntax, however.

If jaxp.properties is not found, JAXP uses the JAR file service provider mechanism to locate an appropriate subclass of TransformerFactory. The service provider mechanism is outlined in the JAR file specification from Sun and simply means that you must create a file in the META-INF/services directory of a JAR file. In JAXP, this file is called javax.xml.transform.TransformerFactory. It contains a single line that specifies the implementation of TransformerFactory: com.foobar.AcmeTransformer in our fictitious example. If you look inside of xalan.jar in JAXP 1.1, you will find this file. In order to utilize a different parser that follows the JAXP 1.1 convention, simply make sure its JAR file is located first on your CLASSPATH.

Finally, if JAXP cannot find an implementation class from any of the three locations, it uses its default implementation of TransformerFactory. To summarize, here are the steps that JAXP performs when attempting to locate a factory:

  1. Use the value of the javax.xml.transform.TransformerFactory system property if it exists.

  2. If JRE/lib/jaxp.properties exists, then look for a javax.xml.transform.TransformerFactory=ImplementationClass entry in that file.

  3. Use a JAR file service provider to look for a file called META-INF/services/javax.xml.transform.TransformerFactory in any JAR file on the CLASSPATH.

  4. Use the default TransformerFactory instance.

The JAXP 1.1 plugability layers for SAX and DOM follow the exact same process as the XSLT layer, only they use the javax.xml.parsers.SAXParserFactory and javax.xml.parsers.DocumentBuilderFactory system properties respectively. It should be noted that JAXP 1.0 uses a much simpler algorithm where it checks only for the existence of the system property. If that property is not set, the default implementation is used.

Pages: 1, 2, 3, 4, 5, 6, 7, 8, 9

Next Pagearrow