XML Publishing with Cocoon 2, Part 1
Generators, XSP, and Data Flow in Cocoon
Now that you understand the sitemap and how matchers help us direct sitemap flow within Cocoon, it is appropriate to dive into a discussion of how data moves out of Cocoon and into a web browser. Each request made of Cocoon triggers three distinct but closely related steps:
- XML Generation
- XML Transformation (Optional)
- XML Serialization
Just as the request/response model is guaranteed in a web environment, this Cocoon model is a truth, an anchor to which you may tie yourself. Keep this in mind when learning Cocoon. Once the sitemap has resolved the request sufficiently, these three steps will be executed.
As mentioned previously, Cocoon is an XML-publishing framework. It uses SAX (Simple API for XML) transforms to enable this three-step process. Originally, Cocoon was structured around the DOM (Document Object Model) format, which was slower and used more memory. SAX is significantly faster and enables us to very easily make subtle changes to the XML during the XML publishing process. We'll get more into how SAX affects this process when discussing transformers in depth.
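To make the idea of "subtle changes to the XML" a little more concrete, here is a minimal sketch of a SAX filter. It is written against the standard JDK XML APIs rather than Cocoon's own transformer classes, and the class name SaxFilterSketch and the upper-casing behavior are purely our own illustrative assumptions. The point is only that a component sitting in the middle of a SAX stream can alter events as they pass by, without ever building the whole document in memory.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical illustration only; this is not a Cocoon transformer.
final class SaxFilterSketch {

    /** Upper-cases character data as it streams past; all other events pass through untouched. */
    static final class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            char[] upper = new String(ch, start, length).toUpperCase(Locale.ROOT).toCharArray();
            super.characters(upper, 0, upper.length);
        }
    }

    /** Parses the XML, pushes the events through the filter, and writes the result back out as text. */
    static String run(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader filtered = new UpperCaseFilter(spf.newSAXParser().getXMLReader());

        // An identity transform is used here only as a convenient way to turn SAX events into text.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new SAXSource(filtered, new InputSource(new StringReader(xml))),
                           new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<title>ContentXML Launched</title>"));
    }
}
```

The element events are never touched; only the text between them changes.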
Our previous example was a Cocoon app that served press releases:
...
<map:pipeline>
<map:match pattern="get/pressrelease/*">
<map:match pattern="goldCustomer" type="premium-customer">
<map:generate src="docs/pressreleases/{../1}.xml"/>
<map:transform src="xslt/premiumPressrelease2html.xsl"/>
<map:serialize type="html"/>
</map:match>
<!-- executed if not a gold customer -->
<map:generate src="docs/pressreleases/{1}.xml"/>
<map:transform src="xslt/pressrelease2html.xsl"/>
<map:serialize type="html"/>
</map:match>
</map:pipeline>
...
With our example URI,
http://hannonhill.com:8080/app/get/pressrelease/384792, and
assuming that the user requesting this resource is not a gold customer, the following is executed:
<map:generate src="docs/pressreleases/{1}.xml"/>
<map:transform src="xslt/pressrelease2html.xsl"/>
<map:serialize type="html"/>
The first line actually substitutes the matched string from the wildcard matcher, producing:
<map:generate src="docs/pressreleases/384792.xml"/>
<map:transform src="xslt/pressrelease2html.xsl"/>
<map:serialize type="html"/>
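If it helps to see the substitution spelled out, the following is a rough sketch of the idea in plain Java. It is not Cocoon's matcher implementation: the class WildcardSketch and its two methods are hypothetical, it assumes that a single * matches any run of characters other than a slash, and it ignores nested references such as {../1} as well as the ** wildcard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; Cocoon's real wildcard matcher is more capable.
final class WildcardSketch {

    /** Returns the text captured by each '*' in the pattern, or null if the URI does not match. */
    static List<String> match(String pattern, String uri) {
        String[] literals = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append("([^/]*)"); // assumption: '*' does not cross a '/'
            }
            regex.append(Pattern.quote(literals[i]));
        }
        Matcher m = Pattern.compile(regex.toString()).matcher(uri);
        if (!m.matches()) {
            return null;
        }
        List<String> groups = new ArrayList<>();
        for (int g = 1; g <= m.groupCount(); g++) {
            groups.add(m.group(g));
        }
        return groups;
    }

    /** Replaces {1}, {2}, ... in the template with the captured values. */
    static String substitute(String template, List<String> groups) {
        String result = template;
        for (int i = 0; i < groups.size(); i++) {
            result = result.replace("{" + (i + 1) + "}", groups.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> groups = match("get/pressrelease/*", "get/pressrelease/384792");
        System.out.println(substitute("docs/pressreleases/{1}.xml", groups));
    }
}
```

Here the URI is shown relative to the sitemap, which is how the pattern in our example is written.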
Note that in our pipeline, we did not specify which generator to use. How
then does Cocoon generate docs/pressreleases/384792.xml? It uses
the default generator defined in our sitemap components section
— the file generator. This generator reads a file off of
the disk, generating SAX events that are then processed by one or more
transformers, if any, and are then serialized. This means that whoever was
creating this press release would have to create a file named
docs/pressreleases/384792.xml in the webapp/
directory.
<map:components>
...
<map:generators default="file">
<map:generator name="file" label="content,data"
src="org.apache.cocoon.generation.FileGenerator" pool-max="32"
pool-min="16" pool-grow="4"/>
<map:generator name="serverpages" label="content,data"
src="org.apache.cocoon.generation.ServerPagesGenerator"/>
<map:generator name="status"
src="org.apache.cocoon.generation.StatusGenerator"/>
</map:generators>
...
</map:components>
XML is great for separating content from presentation. The XML file created by the press release author might look like this:
<press-release name="ContentXML Launched">
<title>ContentXML Launched at Internet World</title>
<author>David Cummings</author>
<body>...</body>
</press-release>
In our example, we run this XML through an XSL file, which converts it into HTML for viewing in a web browser. The serializer then transforms the SAX events into a byte stream and sends it back as the response to the initial request.
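The generate, transform, serialize sequence can be approximated outside of Cocoon with the JDK's own SAX and XSLT classes. The sketch below is only an analogy under stated assumptions: the class PipelineSketch is hypothetical, and the inline stylesheet is our guess at what a very small pressrelease2html.xsl might contain, since the real one is not shown in this article.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical illustration only; Cocoon assembles its pipeline from the sitemap instead.
final class PipelineSketch {

    static final String PRESS_RELEASE =
          "<press-release name=\"ContentXML Launched\">"
        + "<title>ContentXML Launched at Internet World</title>"
        + "<author>David Cummings</author>"
        + "<body>...</body>"
        + "</press-release>";

    // Assumed stand-in for xslt/pressrelease2html.xsl.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/press-release\">"
        + "<html><body>"
        + "<h1><xsl:value-of select=\"title\"/></h1>"
        + "<p>By <xsl:value-of select=\"author\"/></p>"
        + "<div><xsl:value-of select=\"body\"/></div>"
        + "</body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String run(String xml, String xslt) throws Exception {
        // The JDK's default factory supports SAX handlers, so this cast is expected to succeed.
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();

        // Step 3, serialization: an identity handler that writes incoming SAX events as HTML text.
        StringWriter out = new StringWriter();
        TransformerHandler serializer = stf.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.METHOD, "html");
        serializer.getTransformer().setOutputProperty(OutputKeys.INDENT, "no");
        serializer.setResult(new StreamResult(out));

        // Step 2, transformation: the stylesheet's output is fed to the serializer as SAX events.
        TransformerHandler transformer =
            stf.newTransformerHandler(new StreamSource(new StringReader(xslt)));
        transformer.setResult(new SAXResult(serializer));

        // Step 1, generation: a parser reads the XML and fires SAX events into the transformer.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader generator = spf.newSAXParser().getXMLReader();
        generator.setContentHandler(transformer);
        generator.parse(new InputSource(new StringReader(xml)));

        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(PRESS_RELEASE, STYLESHEET));
    }
}
```

Notice that the chain is wired back to front: each stage needs to know where to send its events before the first event is fired.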
Custom Generators (XSP)
While exciting, this is not yet really ContentXML exciting, but we're almost
there :) The file generator is actually a Java class (more specifically, an
Avalon component) that reads in the contents of a file and generates the SAX
events from the file content. Cocoon gives the developer the ability to
generate these generators dynamically through the use of a markup language
called XSP, or eXtensible Server Pages. XSP is an XML vocabulary that provides
custom tags that the XSP generator (specifically, the serverpages
generator) recognizes and uses to generate a custom Java generator.
A custom XSP would do really nicely! Here we go:
<?xml version="1.0"?>
<xsp:page xmlns:xsp="http://apache.org/xsp">
<html>
<body>
<xsp:logic>
long time = System.currentTimeMillis();
</xsp:logic>
Hi There. The current time in milliseconds is <b><xsp:expr>time</xsp:expr></b>.
</body>
</html>
</xsp:page>
As you can see, this is just an XML file with some very specific XSP tags
that will get parsed and substituted. First off, every XSP file must have
xsp:page as its root element. That element will not get returned
into the resulting XML SAX stream. What will be returned is the
html element, followed by the body element, followed
by more stuff.
You can embed your own Java code in the XML by encapsulating it in
<xsp:logic> tags. The XSP generator will generate a Java class
with one very long generate() method, which Cocoon will eventually
invoke. Thus, the variables you define inside of your
<xsp:logic> tags are visible to all of the code that follows them
in the XSP file.
There is one exception, though. <xsp:logic> tags defined
outside of the root content element (in this case, html) define
class-level Java code. In other words, if you place an
<xsp:logic> tag right after the <xsp:page>
element, the contained code will be placed outside of the generate
method. Typically, you would define any functions there. I recommend against
using variables in class-level Java code. The very nature of XSP is to define
dynamic XML generators. Having variables stick around could have unwanted side
effects.
Also worthy of notice is the <xsp:expr> tag. This tag allows
Java expression evaluation outside of the context of the
<xsp:logic> tags. In our example above,
<xsp:expr>time</xsp:expr> evaluates to the millisecond value
of the current time.
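For readers who like to see what is going on under the hood, here is a hand-written approximation of the kind of generate() method the page above might turn into. This is not the code Cocoon actually emits: the class TimePageSketch, the text() helper, and the render() method are our own assumptions, and the real generated class extends Cocoon's own base classes. It only shows the general shape, with literal markup becoming SAX calls and the Java from the page sitting in between them.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical illustration only; not the class Cocoon generates.
final class TimePageSketch {

    /** Hand-written stand-in for the generate() method compiled from the XSP page. */
    static void generate(ContentHandler out) throws SAXException {
        AttributesImpl noAttrs = new AttributesImpl();
        out.startDocument();
        out.startElement("", "html", "html", noAttrs);
        out.startElement("", "body", "body", noAttrs);

        // The contents of <xsp:logic> end up inline in the method body.
        long time = System.currentTimeMillis();

        text(out, "Hi There. The current time in milliseconds is ");
        out.startElement("", "b", "b", noAttrs);
        // <xsp:expr> becomes character data holding the value of the expression.
        text(out, String.valueOf(time));
        out.endElement("", "b", "b");
        text(out, ".");

        out.endElement("", "body", "body");
        out.endElement("", "html", "html");
        out.endDocument();
    }

    private static void text(ContentHandler out, String s) throws SAXException {
        out.characters(s.toCharArray(), 0, s.length());
    }

    /** Runs generate() against a plain XML serializer so the events can be inspected as text. */
    static String render() throws Exception {
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = stf.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.METHOD, "xml");
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter sw = new StringWriter();
        serializer.setResult(new StreamResult(sw));
        generate(serializer);
        return sw.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render());
    }
}
```

Because time is an ordinary local variable in this sketch, it makes sense that code later in the page can still refer to it.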
Another typical use of XSP is to loop over a result set, producing dynamic XML:
<members>
<xsp:logic>
String[] members = getClubMemberNames(); // defined elsewhere.
for (int i = 0; i &lt; members.length; i++) {
</xsp:logic>
<member>
<xsp:attribute name="name"><xsp:expr>members[i]</xsp:expr></xsp:attribute>
</member>
<xsp:logic>
}
</xsp:logic>
</members>
which produces something like this:
<members>
<member name="hannonhill.com"/>
<member name="contentxml.com"/>
<member name="superupdate.com"/>
<member name="zapedit.com"/>
</members>
Remember that the XSP page is XML, so inserting superfluous
< and such characters is a big no-no. In fact, the XSP
generator will throw an exception telling you so when it tries to compile the
XSP XML into a Java Generator. If you need these characters in your Java code,
you must use the XML-escaped equivalents. For instance, in our example:
for (int i = 0; i &lt; members.length; i++)
This is definitely one of the tradeoffs of using XSPs, albeit a relatively small one.
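You can convince yourself of the escaping rule with any XML parser, no Cocoon required. The sketch below uses the JDK's DOM parser on three one-line snippets; the class EscapeSketch and its constants are hypothetical and exist only for this demonstration. The escaped version parses and hands back the Java you meant to write, and the raw version is rejected as not well-formed. The third snippet uses a CDATA section, which is standard XML and gives the parser the same text without any escaping.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; it exercises a plain XML parser, not the XSP generator.
final class EscapeSketch {

    static final String OPEN = "<xsp:logic xmlns:xsp=\"http://apache.org/xsp\">";
    static final String CLOSE = "</xsp:logic>";

    static final String ESCAPED = OPEN + "for (int i = 0; i &lt; members.length; i++) {}" + CLOSE;
    static final String RAW     = OPEN + "for (int i = 0; i < members.length; i++) {}" + CLOSE;
    static final String CDATA   = OPEN + "<![CDATA[for (int i = 0; i < members.length; i++) {}]]>" + CLOSE;

    /** Returns the text inside the element, or null if the snippet is not well-formed XML. */
    static String logicText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement()
                          .getTextContent();
        } catch (SAXParseException notWellFormed) {
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("escaped: " + logicText(ESCAPED));
        System.out.println("raw:     " + logicText(RAW));
        System.out.println("cdata:   " + logicText(CDATA));
    }
}
```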
Using XSP for Form Generation
From an application development perspective, form creation has always been approached with both brute force and elegance. We like to think that XSP provides a healthy lean towards the latter! For the purpose of this discussion, assume that all of our form generation goes through three steps:
- XSP generation of form elements.
- Post-processing through an XSL stylesheet.
- Serialization to HTML.
Why is step #2 necessary? Why not generate all of your HTML straight from the XSP? Isn't less better? In this case, the advantages you get by abstracting display logic out of the generator and into the transform are well worth the extra step. Most importantly, all of your display logic is in one place. One change propagates across the whole application, and this has a huge impact on application maintainability.
Assume that a user of your content management solution is trying to log in,
hitting the URI /login. In the sitemap, this matches the
following:
<map:match pattern="login">
<map:generate src="docs/login.xsp" type="serverpages"/>
<map:transform src="common/stylesheets/display2html.xsl"/>
<map:serialize type="html"/>
</map:match>
In fact, this is how Cocoon applications generate most of their form content! docs/login.xsp could look something like this:
<?xml version="1.0"?>
<xsp:page xmlns:xsp="http://apache.org/xsp">
<form>
<title>Log in to ContentXML</title>
<action>do/submit/login</action>
<form-item type="text">
<name>loginName</name>
<output>Login Name:</output>
</form-item>
<form-item type="password">
<name>password</name>
<output>Password:</output>
</form-item>
<form-item type="submit"/>
</form>
</xsp:page>
As you can see, there is no HTML anywhere in the XML that is generated from
this XSP. When the XML is generated, it is then transformed by
common/stylesheets/display2html.xsl, which "understands" the
<form> XML element, processing its sub-elements accordingly and
applying all applicable formatting to produce a pretty form with a login text
field, a password field, a submit button, and anything else you may desire.
Imagine having 150 forms scattered around your application. Changing how display2html.xsl processes form XML will instantly change all of the forms in your application. In fact, we use this model to control the overall look and feel of our application. It takes us almost no time to completely change how the app looks, making it easy to rebrand and update! Aces!
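Since display2html.xsl itself is not shown here, the following sketch invents a tiny stand-in to illustrate the idea. Everything in it is an assumption on our part: the class FormStyleSketch, the template rules, and the HTML they emit are hypothetical and certainly much simpler than a production stylesheet. It runs the login form XML from above through the stand-in stylesheet using the JDK's XSLT processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration only; the real display2html.xsl is not reproduced in this article.
final class FormStyleSketch {

    // The XML that docs/login.xsp would produce (the xsp:page wrapper is not part of the output).
    static final String FORM_XML =
          "<form><title>Log in to ContentXML</title><action>do/submit/login</action>"
        + "<form-item type=\"text\"><name>loginName</name><output>Login Name:</output></form-item>"
        + "<form-item type=\"password\"><name>password</name><output>Password:</output></form-item>"
        + "<form-item type=\"submit\"/></form>";

    // Assumed stand-in for common/stylesheets/display2html.xsl.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='/form'>"
        +   "<html><body><h1><xsl:value-of select='title'/></h1>"
        +   "<form method='post' action='{action}'>"
        +     "<xsl:apply-templates select='form-item'/>"
        +   "</form></body></html>"
        + "</xsl:template>"
        // The predicate gives this rule a higher default priority than the generic one below.
        + "<xsl:template match=\"form-item[@type='submit']\">"
        +   "<input type='submit'/>"
        + "</xsl:template>"
        + "<xsl:template match='form-item'>"
        +   "<label><xsl:value-of select='output'/><xsl:text> </xsl:text>"
        +   "<input type='{@type}' name='{name}'/></label>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String render(String formXml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(formXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render(FORM_XML));
    }
}
```

Restyling every text field in the application would then be a matter of editing the single form-item template.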
David Cummings is the CEO of Hannon Hill Corporation, which focuses on content management software solutions.
Collin VanDyck is the lead developer of ContentXML and an integral part of the Hannon Hill team.