ONJava.com -- The Independent Source for Enterprise Java

XML Publishing with Cocoon 2, Part 1

Generators, XSP, and Data Flow in Cocoon

Now that you understand the sitemap and how matchers direct flow within it, it is time to look at how data moves out of Cocoon and into a web browser. Every request that Cocoon handles results in three distinct but closely related steps:



  1. XML Generation
  2. XML Transformation (Optional)
  3. XML Serialization

Just as the request/response model is guaranteed in a web environment, this Cocoon model is a truth, an anchor to which you may tie yourself. Keep this in mind when learning Cocoon. Once the sitemap has resolved the request sufficiently, these three steps will be executed.

As mentioned previously, Cocoon is an XML-publishing framework. It uses SAX (Simple API for XML) events to drive this three-step process. Originally, Cocoon was structured around the DOM (Document Object Model), which builds the whole document as a tree in memory and was therefore slower and used more memory. SAX streams the document as a sequence of events, which is significantly faster and makes it very easy to apply subtle changes to the XML during the publishing process. We'll get more into how SAX affects this process when discussing transformers in depth.

Our previous example was a Cocoon app that served press releases:

...
<map:pipeline>
        <map:match pattern="get/pressrelease/*">
               <map:match pattern="goldCustomer" type="premium-customer">
                       <map:generate src="docs/pressreleases/{../1}.xml"/>
                       <map:transform src="xslt/premiumPressrelease2html.xsl"/>
                       <map:serialize type="html"/>
               </map:match>
               <!-- executed if not a gold customer -->
               <map:generate src="docs/pressreleases/{1}.xml"/>
               <map:transform src="xslt/pressrelease2html.xsl"/>
               <map:serialize type="html"/>
        </map:match>
</map:pipeline>
...

With our example URI, http://hannonhill.com:8080/app/get/pressrelease/384792, and assuming that the user requesting this resource is not a gold customer, the following is executed:

<map:generate src="docs/pressreleases/{1}.xml"/>
<map:transform src="xslt/pressrelease2html.xsl"/>
<map:serialize type="html"/>

In the first line, Cocoon substitutes the string matched by the wildcard for {1}, producing:

<map:generate src="docs/pressreleases/384792.xml"/>
<map:transform src="xslt/pressrelease2html.xsl"/>
<map:serialize type="html"/>
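The substitution itself is easy to picture outside of Cocoon. The following is only a sketch under our own assumptions, not Cocoon's matcher code: the class WildcardSketch and its methods match and substitute are hypothetical names, we assume that * captures a run of characters without a slash, and parent references such as {../1} are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: mimics the effect of a wildcard match, not Cocoon's own code.
class WildcardSketch {

    // Returns the text captured by each '*' in the pattern, or null if the URI does not match.
    static List<String> match(String pattern, String uri) {
        String[] literals = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append("([^/]*)"); // assumption: '*' does not cross a '/'
            }
            regex.append(Pattern.quote(literals[i]));
        }
        Matcher m = Pattern.compile(regex.toString()).matcher(uri);
        if (!m.matches()) {
            return null;
        }
        List<String> groups = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            groups.add(m.group(i));
        }
        return groups;
    }

    // Replaces {1}, {2}, ... in the template with the corresponding captured text.
    static String substitute(String template, List<String> groups) {
        String result = template;
        for (int i = 0; i < groups.size(); i++) {
            result = result.replace("{" + (i + 1) + "}", groups.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> groups = match("get/pressrelease/*", "get/pressrelease/384792");
        // prints docs/pressreleases/384792.xml
        System.out.println(substitute("docs/pressreleases/{1}.xml", groups));
    }
}
```

The part of the URI before get/ (host, port, and the app/ mount point) is assumed to have been stripped off before matching.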

Note that in our pipeline, we did not specify which generator to use. How then does Cocoon generate docs/pressreleases/384792.xml? It uses the default generator defined in the components section of our sitemap: the file generator. This generator reads a file from disk and generates SAX events, which are processed by any transformers in the pipeline and then serialized. This means that whoever writes this press release has to create a file named docs/pressreleases/384792.xml in the webapp/ directory.

<map:components>
...
<map:generators default="file">
   <map:generator name="file" label="content,data"
         src="org.apache.cocoon.generation.FileGenerator" pool-max="32" 
         pool-min="16" pool-grow="4"/>
   <map:generator name="serverpages" label="content,data"
         src="org.apache.cocoon.generation.ServerPagesGenerator"/>
   <map:generator name="status"
         src="org.apache.cocoon.generation.StatusGenerator"/>
 </map:generators>
...
</map:components>
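To make "generating SAX events" concrete, here is a small sketch that uses the standard JAXP parser rather than Cocoon's FileGenerator. The class SaxEventSketch and its events method are our own hypothetical names, and the XML is read from a string instead of a file so that the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: shows the kind of event stream a generator produces.
class SaxEventSketch {

    // Parses the XML text and records each SAX event as a readable string.
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser is free to deliver one run of text in several chunks.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text " + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<press-release><title>ContentXML Launched</title></press-release>";
        events(xml).forEach(System.out::println);
    }
}
```

Nothing is ever held as a complete document here. Each event is handed to the next stage as soon as it is read, which is what lets transformers and serializers be chained behind a generator.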

XML is great for separating content from presentation. The XML file created by the press release author might look like this:

<press-release name="ContentXML Launched">
   <title>ContentXML Launched at Internet World</title>
   <author>David Cummings</author>
   <body>...</body>
</press-release>

In our example, we run this XML through an XSL file, which converts it into HTML for viewing in a web browser. The serializer then transforms the SAX events into a byte stream and sends it back as the response to the initial request.
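The same generate, transform, serialize sequence can be reproduced with plain JAXP, which may help to see where each step begins and ends. This is a sketch under stated assumptions, not how Cocoon wires its components: PipelineSketch is a hypothetical class, and the inline stylesheet is our own stand-in for xslt/pressrelease2html.xsl, whose real contents are not shown in this article.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Illustrative only: a three-step SAX pipeline built from standard JAXP classes.
class PipelineSketch {

    static final String PRESS_RELEASE =
          "<press-release name='ContentXML Launched'>"
        + "<title>ContentXML Launched at Internet World</title>"
        + "<author>David Cummings</author>"
        + "<body>...</body>"
        + "</press-release>";

    // Hypothetical stand-in for xslt/pressrelease2html.xsl.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html'/>"
        + "<xsl:template match='/press-release'>"
        + "<html><body><h1><xsl:value-of select='title'/></h1>"
        + "<p>By <xsl:value-of select='author'/></p></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String run() throws Exception {
        // Step 2, transformation: an XSLT stage that consumes SAX events.
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler transform =
                factory.newTransformerHandler(new StreamSource(new StringReader(STYLESHEET)));

        // Step 3, serialization: the transformed result is written out as characters.
        StringWriter html = new StringWriter();
        transform.setResult(new StreamResult(html));

        // Step 1, generation: parsing pushes SAX events into the stage above.
        SAXParserFactory parsers = SAXParserFactory.newInstance();
        parsers.setNamespaceAware(true);
        XMLReader generator = parsers.newSAXParser().getXMLReader();
        generator.setContentHandler(transform);
        generator.parse(new InputSource(new StringReader(PRESS_RELEASE)));
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

The stages are wired back to front: the serializer is attached to the transformer first, and only then does the generator start pushing events through.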

Custom Generators (XSP)

While exciting, this is not yet really ContentXML exciting, but we're almost there :) The file generator is actually a Java class (more specifically, an Avalon component) that reads in the contents of a file and generates SAX events from the file content. Cocoon gives the developer the ability to create such generators dynamically through a markup language called XSP, or eXtensible Server Pages. XSP is an XML vocabulary: it provides custom tags that the XSP generator (specifically, the serverpages generator) recognizes and uses to generate a custom Java generator.

A custom XSP would do really nicely! Here we go:

<?xml version="1.0"?>
<xsp:page xmlns:xsp="http://apache.org/xsp">
   <html>
   <body>
      <xsp:logic>
         long time = System.currentTimeMillis();
      </xsp:logic>
      Hi There. The current time in milliseconds is <b><xsp:expr>time</xsp:expr></b>.
   </body>
   </html>
</xsp:page>

As you can see, this is just an XML file with some very specific XSP tags that will get parsed and substituted. First off, every XSP file must have xsp:page as its root element. That element will not appear in the resulting SAX stream. What will appear is the html element, followed by the body element, followed by the text and the b element nested inside it.

You can embed your own Java code in the XML by encapsulating it in <xsp:logic> tags. The XSP generator will generate a Java class with one very long generate() method, which Cocoon will eventually invoke. Thus, the variables you define inside of your <xsp:logic> tags are local to that method and are visible to all of the logic and expressions that follow them in the XSP file.

There is one exception, though. <xsp:logic> tags defined outside of the root content element (in this case, html) define class-level Java code. In other words, if you place an <xsp:logic> tag right after the <xsp:page> element, the contained code will be placed outside of the generate method. Typically, you would define any functions there. I recommend against using variables in class-level Java code. The very nature of XSP is to define dynamic XML generators. Having variables stick around could have unwanted side effects.

Also worthy of notice is the <xsp:expr> tag. This tag allows Java expression evaluation outside of the context of the <xsp:logic> tags. In our example above, <xsp:expr>time</xsp:expr> evaluates to the millisecond value of the current time.
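It can help to imagine the kind of Java that such a page turns into. The class below is a hand-written approximation, not the code that Cocoon actually generates: TimePageSketch, its text helper, and its render method are hypothetical, and a standard JAXP identity transformer stands in for the serializer. The helper shows where class-level code would sit, outside of generate().

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Illustrative only: a hand-written guess at the shape of a generated generator.
class TimePageSketch {

    // Class-level code: the place where a top-level <xsp:logic> block would end up.
    static void text(ContentHandler out, String s) throws SAXException {
        out.characters(s.toCharArray(), 0, s.length());
    }

    // One long method that emits the page as SAX events, top to bottom.
    static void generate(ContentHandler out) throws SAXException {
        AttributesImpl none = new AttributesImpl();
        out.startDocument();
        out.startElement("", "html", "html", none);
        out.startElement("", "body", "body", none);
        long time = System.currentTimeMillis();   // from the <xsp:logic> block
        text(out, "Hi There. The current time in milliseconds is ");
        out.startElement("", "b", "b", none);
        text(out, String.valueOf(time));          // from <xsp:expr>time</xsp:expr>
        out.endElement("", "b", "b");
        text(out, ".");
        out.endElement("", "body", "body");
        out.endElement("", "html", "html");
        out.endDocument();
    }

    // Feeds the events into an identity transformer so the result can be read as text.
    static String render() throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.METHOD, "xml");
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter buffer = new StringWriter();
        serializer.setResult(new StreamResult(buffer));
        generate(serializer);
        return buffer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render());
    }
}
```

Because time is a local variable of generate(), every request gets a fresh value. A field declared at class level would outlive the request, which is the kind of side effect warned about above.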

Another typical use of XSP is to loop over a result set, producing dynamic XML:

<members>
   <xsp:logic>
      String[] members = getClubMemberNames(); // defined elsewhere.
      for (int i = 0; i &lt; members.length; i++) {
   </xsp:logic>
   <member>
      <xsp:attribute name="name"><xsp:expr>members[i]</xsp:expr></xsp:attribute>
   </member>
   <xsp:logic>
      }
   </xsp:logic>
</members>

which produces something like this:

<members>
  <member name="hannonhill.com"/>
  <member name="contentxml.com"/>
  <member name="superupdate.com"/>
  <member name="zapedit.com"/>
</members>

Remember that the XSP page is XML, so a bare < or & in your Java code is a big no-no. In fact, the XSP generator will throw an exception telling you so when it tries to compile the XSP into a Java generator. If you need these characters in your Java code, you must use the XML-escaped equivalents. For instance, in our example:

for (int i = 0; i &lt; members.length; i++)

This is definitely one of the tradeoffs of using XSPs, albeit a relatively small one.
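The effect is easy to reproduce with any XML parser. In this sketch, EscapeSketch and logicText are hypothetical names, and a plain logic element stands in for xsp:logic so that no namespace declaration is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: shows why Java operators must be escaped inside XML.
class EscapeSketch {

    // Returns the text inside the root element, or null if the XML is not well-formed.
    static String logicText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement()
                          .getTextContent();
        } catch (SAXException notWellFormed) {
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        String raw = "<logic>for (int i = 0; i < members.length; i++) {}</logic>";
        String escaped = "<logic>for (int i = 0; i &lt; members.length; i++) {}</logic>";

        // prints null: the parser reads "< members" as the start of a tag
        System.out.println(logicText(raw));
        // prints for (int i = 0; i < members.length; i++) {}
        System.out.println(logicText(escaped));
    }
}
```

The parser hands the escaped text back with the literal < restored, so the Java code that ends up being compiled is exactly what you meant to write.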

Using XSP for Form Generation

From an application development perspective, form creation has always been approached with both brute force and elegance. We like to think that XSP provides a healthy lean towards the latter! For the purpose of this discussion, assume that all of our form generation goes through three steps:

  1. XSP generation of form elements.
  2. Post-processing through an XSL stylesheet.
  3. Serialization to HTML.

Why is step #2 necessary? Why not generate all of your HTML straight from the XSP? Isn't less better? In this case, you gain far more by moving display logic out of the generator and into the transform. Most importantly, all of your display logic is in one place. One change propagates across the whole application, and this has a huge impact on application maintainability.

Assume that a user of your content management solution is trying to log in, hitting the URI /login. In the sitemap, this matches the following:

<map:match pattern="login">
   <map:generate src="docs/login.xsp" type="serverpages"/>
   <map:transform src="common/stylesheets/display2html.xsl"/>
   <map:serialize type="html"/>
</map:match>

In fact, this is how Cocoon applications generate most of their form content! docs/login.xsp could look something like this:

<?xml version="1.0"?>
<xsp:page xmlns:xsp="http://apache.org/xsp">
   <form>
      <title>Log in to ContentXML</title>
      <action>do/submit/login</action>
      <form-item type="text">
         <name>loginName</name>
         <output>Login Name:</output>
      </form-item>
      <form-item type="password">
         <name>password</name>
         <output>Password:</output>
      </form-item>
      <form-item type="submit"/>
   </form>
</xsp:page>

As you can see, there is no HTML anywhere in the XML that is generated from this XSP. When the XML is generated, it is then transformed by common/stylesheets/display2html.xsl, which "understands" the <form> XML element, processing its sub-elements accordingly and applying all applicable formatting to produce a pretty form with a login text field, a password field, a submit button, and anything else you may desire.
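To give an idea of what such a stylesheet does, here is a sketch that runs the login form XML through a few templates using plain JAXP. The templates are our own guess at the approach and not the real display2html.xsl, and FormStylesheetSketch and its render method are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative only: a guess at how a shared stylesheet could render form XML.
class FormStylesheetSketch {

    // The XML that docs/login.xsp produces once xsp:page has been processed away.
    static final String FORM_XML =
          "<form>"
        + "<title>Log in to ContentXML</title>"
        + "<action>do/submit/login</action>"
        + "<form-item type='text'><name>loginName</name><output>Login Name:</output></form-item>"
        + "<form-item type='password'><name>password</name><output>Password:</output></form-item>"
        + "<form-item type='submit'/>"
        + "</form>";

    // Hypothetical templates: one for the form, one for the submit button, one for fields.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html'/>"
        + "<xsl:template match='form'>"
        + "<form method='post' action='{action}'>"
        + "<h1><xsl:value-of select='title'/></h1>"
        + "<xsl:apply-templates select='form-item'/>"
        + "</form>"
        + "</xsl:template>"
        + "<xsl:template match=\"form-item[@type='submit']\">"
        + "<input type='submit'/>"
        + "</xsl:template>"
        + "<xsl:template match='form-item'>"
        + "<label><xsl:value-of select='output'/><input type='{@type}' name='{name}'/></label>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String render() throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter html = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(FORM_XML)),
                              new StreamResult(html));
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render());
    }
}
```

All of the presentation decisions (the h1 heading, the label wrapper, the post method) live in the stylesheet, so every XSP that emits form and form-item elements picks them up without being touched.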

Imagine having 150 forms scattered around your application. Changing how display2html.xsl processes form XML will instantly change all of the forms in your application. In fact, we use this model to control the overall look and feel of our application. It takes us almost no time to completely change how the app looks, making it easy to rebrand and update! Aces!

David Cummings is the CEO of Hannon Hill Corporation, which focuses on content management software solutions.

Collin VanDyck is the lead developer of ContentXML and an integral part of the Hannon Hill team.

