ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)


Using Hierarchical Data Sets with Aspire and Tomcat

by Satya Komatineni
03/05/2003

What are Hierarchical Data Sets and Why Do You Care?

Hierarchical Data Sets are not new. They already exist in the form of CICS transactional data, files in directories, and plain Java objects, as well as the obvious XML. In the XML Journal in early 2001, I floated the idea that programmers can benefit from hierarchical data abstractions even though many of their data sources are predominantly relational (such as databases including MySQL, Oracle, SQL Server, DB2, etc.).

The .NET world has a similar idea taking root in the notion of "datasets." Although there are important differences between my proposed Hierarchical Data Sets and the nature of Microsoft's datasets, it is evident that Hierarchical Data Sets enhance relational abstractions with richer detail.

This article examines the structure of, and a Java API for, Hierarchical Data Sets. Unlike the XML Journal reference two years ago, you will now actually have a piece of executable code to use to start taking advantage of Hierarchical Data Sets. Although programmers can code in Java to access various data sources and construct the final Hierarchical Data Set, this article has an implementation that you can readily use to construct these Hierarchical Data Sets declaratively by simply composing pre-built relational adapters. Relational adapters include file readers, SQL readers, Stored Procedure readers, et cetera.

The question you're probably asking is "What good are these Hierarchical Data Sets?" Although they can't rival the salutary effects of large expensive pieces of Carbon on your most certainly deserving companions, Hierarchical Data Sets are quite useful in the programming world. For starters, an entire HTML page worth of data can be satisfied by a single Hierarchical Data Set. In an MVC model, a controller servlet can deliver a Hierarchical Data Set to a JSP page, which will paint it without further ado. For a warmup, it can be converted to XML and directly returned to the caller by the controller servlet. For the appeal, the Hierarchical Data Set can be converted to Excel. For the stylish, the Hierarchical Data Set can be redirected to a reporting engine or a charting engine that supports XML data.

Although the primary focus of the article is the Java programming API for Java programmers, Hierarchical Data Sets can be used by non-Java programmers quite effectively to obtain XML, HTML, or Excel formats directly from relational databases and other data sources by using a J2EE server such as Tomcat. Without further ado, let us investigate the structure of Hierarchical Data Sets and see how these data sets can be obtained declaratively (while relaxing your programming muscles a bit).

Structure of Hierarchical Data

A Hierarchical Data Structure can be conceptually represented as a Java API, or XML, or some other format. It is easiest to visualize as XML.

<AspireDataSet>
    <!-- A set of key value pairs at the root level -->
    <key1>val1</key1>   
    <key2>val2</key2>

    <!-- A set of named loops -->
    <loop name="loop1">
    </loop>
    <loop name="loop2">
    </loop>
</AspireDataSet>

At the root, the data set is a set of key/value pairs. A given set of key/value pairs can yield n independent loops. Each loop is essentially a table of data; I use the term "loop" as a synonym for "table," and avoid "table" only because people might take it literally to mean data from a relational table. Having mentioned that a loop is a collection of rows (RowSet!), let us look closer at the structure of a loop:

<loop name="loopname">
    <row>
        <!-- a set of key value pairs -->
        <key1>val1</key1>
        <key2>val2</key2>

        <!-- a set of named loops -->
        <loop name="loopname1">
        </loop>

        <!-- a set of named loops -->
        <loop name="loopname2">
        </loop>
    </row>
    <row>
    </row>
</loop>

The only unusual thing here is the structure of a row. As expected, a row is a collection of key/value pairs, but here it also carries a recursive set of n independent loops. This extension can produce trees of any depth. (Or should I say, height!)
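
The recursive shape described above can be sketched as a small in-memory model. This is only an illustration of the structure, not Aspire's implementation: the class names RowSketch, LoopSketch, and HdsShapeDemo are hypothetical, and Aspire itself exposes the data through a Java interface rather than through concrete classes like these.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: these classes are not part of Aspire.
// A row holds key/value pairs plus any number of named child loops.
class RowSketch {
    final Map<String, String> values = new LinkedHashMap<String, String>();
    final Map<String, LoopSketch> childLoops = new LinkedHashMap<String, LoopSketch>();
}

// A loop is a named, ordered collection of rows (a "table").
class LoopSketch {
    final String name;
    final List<RowSketch> rows = new ArrayList<RowSketch>();

    LoopSketch(String name) { this.name = name; }

    // Depth of the tree rooted at this loop: 1 when no row has child loops.
    int depth() {
        int deepestChild = 0;
        for (RowSketch row : rows) {
            for (LoopSketch child : row.childLoops.values()) {
                deepestChild = Math.max(deepestChild, child.depth());
            }
        }
        return 1 + deepestChild;
    }
}

class HdsShapeDemo {
    // Builds a loop with one row that carries two key/value pairs and one child loop.
    static LoopSketch build() {
        LoopSketch child = new LoopSketch("loopname1");
        RowSketch childRow = new RowSketch();
        childRow.values.put("key1", "childval1");
        child.rows.add(childRow);

        LoopSketch parent = new LoopSketch("loopname");
        RowSketch row = new RowSketch();
        row.values.put("key1", "val1");
        row.values.put("key2", "val2");
        row.childLoops.put(child.name, child);
        parent.rows.add(row);
        return parent;
    }

    public static void main(String[] args) {
        LoopSketch loop = build();
        System.out.println(loop.name + " has depth " + loop.depth());
    }
}
```

Because each row owns its own map of child loops, two rows of the same loop can carry different child data, which is what lets the tree grow to any depth.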

Structure of Hierarchical Data in Java

Having shown the hierarchical data as XML, I run the risk that people will take a Hierarchical Data Set to be literally XML and, hence, literally DOM and, hence, a lot of memory inside of the JVM. No need to panic. The Hierarchical Data Set can have its own Java API and need not be represented as a DOM. The majority of the time it is a forward-only-traversing-cursor-like-lazy-loading tree. Here is a working Java API for a Hierarchical Data Set:

package com.ai.htmlgen;
import com.ai.data.*;

/**
 * Represents a Hierarchical Data Set.
 * An hds is a collection of rows.
 * You can step through the rows using ILoopForwardIterator
 * You can find out about the columns via IMetaData.
 * An hds is also a collection of loops originating from the current row.
 */
public interface ihds extends ILoopForwardIterator
{
    /**
     * Returns the parent if available
     * Returns null if there is no parent
     */
    public ihds getParent() throws DataException;

    /**
     * For the current row return a set of 
     * child loop names. ILoopForwardIterator determines
     * what the current row is.
     * 
     * @see ILoopForwardIterator
     */
    public IIterator getChildNames() throws DataException;

    /**
     * Given a child name return the child Java object 
     * represented by ihds again
     */
    public ihds getChild(String childName) throws DataException;

    /**
     * returns a column that is similar to SUM, AVG etc of a 
     * set of rows that are children to this row.
     */
    public String getAggregateValue(String keyname) throws DataException;

    /**
     * Returns the column names of this loop or table.
     * @see IMetaData
     */
    public IMetaData getMetaData() throws DataException;

    /**
     * Releases any resources that may be held by this loop of data 
     * or table.
     */
    public void close() throws DataException;
}

The interface name ihds is short for "Interface to Hierarchical Data Set." This API allows you to step through your loops recursively. An implementation has the option to load the loops only when they are requested. It can also assume either forward-only or random traversal. Before going further, let me present the two additional interfaces that this API uses: ILoopForwardIterator and IMetaData.

How to Move Through a Series of Rows in HDS: ILoopForwardIterator

package com.ai.htmlgen;
import com.ai.data.*;

public interface ILoopForwardIterator
{
   /**
    * getValue from the current row matching the key
    */
   public String getValue(final String key);

   public void moveToFirst() throws DataException;

   public void moveToNext() throws DataException;

   public boolean isAtTheEnd() throws DataException;
}
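
The traversal idiom implied by this interface can be sketched with a list-backed cursor. The sketch below is an assumption-laden illustration: ForwardIteratorSketch is a local stand-in for ILoopForwardIterator (without DataException) so that the code compiles without the Aspire JAR, and ListBackedLoop and ForwardIterationDemo are hypothetical names, not Aspire classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for com.ai.htmlgen.ILoopForwardIterator so the sketch compiles
// without the Aspire JAR; the real interface also throws DataException.
interface ForwardIteratorSketch {
    String getValue(String key);
    void moveToFirst();
    void moveToNext();
    boolean isAtTheEnd();
}

// Hypothetical list-backed implementation: a cursor over in-memory rows.
class ListBackedLoop implements ForwardIteratorSketch {
    private final List<Map<String, String>> rows;
    private int cursor = 0;

    ListBackedLoop(List<Map<String, String>> rows) { this.rows = rows; }

    public String getValue(String key) { return rows.get(cursor).get(key); }
    public void moveToFirst() { cursor = 0; }
    public void moveToNext() { cursor++; }
    public boolean isAtTheEnd() { return cursor >= rows.size(); }
}

class ForwardIterationDemo {
    // Collects one column using the moveToFirst / isAtTheEnd / moveToNext idiom.
    static List<String> collect(ForwardIteratorSketch loop, String key) {
        List<String> result = new ArrayList<String>();
        for (loop.moveToFirst(); !loop.isAtTheEnd(); loop.moveToNext()) {
            result.add(loop.getValue(key));
        }
        return result;
    }

    // Convenience helper that builds a single-column row.
    static Map<String, String> row(String name) {
        Map<String, String> r = new HashMap<String, String>();
        r.put("name", name);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
        rows.add(row("first"));
        rows.add(row("second"));
        System.out.println(collect(new ListBackedLoop(rows), "name"));
    }
}
```

Checking isAtTheEnd() before reading a value means the same loop header also handles an empty loop, since the body is never entered.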

IMetaData: For Reading Column Names

package com.ai.data;
public interface IMetaData 
{
	public IIterator getIterator();

	public int       getColumnCount();

	public int       getIndex(final String attributeName)
		throws FieldNameNotFoundException;
}
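
As a rough illustration of what a column-metadata object does, here is a minimal list-backed sketch. It is not Aspire code: ColumnMetaDataSketch and FieldNameNotFoundSketch are hypothetical stand-ins, the zero-based index is an assumption of the sketch rather than documented Aspire behavior, and the stand-in exception is unchecked only to keep the example short.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for com.ai.data.FieldNameNotFoundException.
// Unchecked here for brevity; the real exception is declared with throws.
class FieldNameNotFoundSketch extends RuntimeException {
    FieldNameNotFoundSketch(String name) { super("No such column: " + name); }
}

// Hypothetical list-backed column metadata. Zero-based indexes are an
// assumption of this sketch, not documented Aspire behavior.
class ColumnMetaDataSketch {
    private final List<String> columnNames;

    ColumnMetaDataSketch(String... columnNames) {
        this.columnNames = Arrays.asList(columnNames);
    }

    int getColumnCount() { return columnNames.size(); }

    // Returns the position of the named column, or fails loudly if it is absent.
    int getIndex(String attributeName) {
        int index = columnNames.indexOf(attributeName);
        if (index < 0) {
            throw new FieldNameNotFoundSketch(attributeName);
        }
        return index;
    }

    public static void main(String[] args) {
        ColumnMetaDataSketch meta = new ColumnMetaDataSketch("id", "name");
        System.out.println(meta.getColumnCount() + " columns; name is at " + meta.getIndex("name"));
    }
}
```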

How Can You Obtain a Hierarchical Data Set, So You Can Use It?

Now that we know the structure of a Hierarchical Data Set, how do you get hold of one? As I stated earlier, this is easy under Aspire. The steps are as follows:

  1. Learn the basics of Aspire.
  2. Create a definition file for your Hierarchical Data Set.
  3. Call your definition and receive ihds in your Java code.

Each of these steps is explained in some detail below.

Read the Basics on the Usage of the Aspire JAR

Aspire is a small JAR file that can complement your Java programming, particularly when used with an app server such as Tomcat. At the heart of Aspire is a set of configuration files, where you declare your data access mechanisms in terms of Java classes and arguments to those Java classes. Aspire will execute those Java classes and return the resulting objects. Hierarchical Data Sets are no exception.

An earlier O'Reilly article introduced Aspire: "For Tomcat Developers, Aspire Comes in a JAR." This will familiarize you with defining databases and calling SQL and Stored Procedures, as well as configuring and initializing Aspire.

Create a Definition File For your Hierarchical Data Set

A sample definition for a Hierarchical Data Set is as follows:

###################################
# ihdsTest data definition: section1
###################################
request.ihdsTest.className=com.ai.htmlgen.DBHashTableFormHandler1
request.ihdsTest.loopNames=works

#section2
request.ihdsTest.works.class_request.className=com.ai.htmlgen.GenericTableHandler6
request.ihdsTest.works.loopNames=childloop1
request.ihdsTest.works.query_request.className=com.ai.data.RowFileReader
request.ihdsTest.works.query_request.filename=aspire:\\samples\\pop-table-tags\\properties\\pop-table.data

#section3
request.childloop1.class_request.classname=com.ai.htmlgen.GenericTableHandler6
request.childloop1.query_request.classname=com.ai.data.RowFileReader
request.childloop1.query_request.filename=aspire:\\samples\\pop-table-tags\\properties\\pop-table.data

This definition has three sections. The data set is named ihdsTest. The first section tells Aspire that the Java class com.ai.htmlgen.DBHashTableFormHandler1 is responsible for returning an object implementing ihds. Unless you code your own implementation of ihds, you will use this class in every data set definition. It's the pre-fabricated class that knows how to compose relational assets into hierarchical assets. Line 2 of section 1 tells DBHashTableFormHandler1 that this main data set has one loop called works.

Section2 defines the loop works. A loop structure in Aspire uses two Java classes: a class request (GenericTableHandler6) and a query request (RowFileReader). RowFileReader reads a set of records from a flat file and makes them look like a collection of rows and columns. GenericTableHandler6 takes this collection, applies such features as aggregate values and row numbers, and implements the ihds interface at the loop level. As with DBHashTableFormHandler1, GenericTableHandler6 is present in most definitions. RowFileReader might change, depending on your data sources. For example, the following parts exist in this category:

  1. RowFileReader.
  2. DBRequestExecutor2 (for reading SQL).
  3. StoredProcedureExecutor2 (for reading from Stored Procedures).
  4. XMLReader (for reading XML files).
  5. Or, you can write your own reader that implements IDataCollection.

Section2 also indicates that it has a child called childloop1. GenericTableHandler6 will take this cue and look for section3, identified by childloop1.

Section3 defines childloop1. The definition is identical to section2, except that childloop1 has no children. Both section2 and section3 use RowFileReaders. In practice, they can use any combination of data reader parts.
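
To see how such a definition reads as plain key/value data, the sketch below loads part of it with java.util.Properties. Two things here are assumptions: that Aspire's definition files follow the standard Java properties syntax, and the class name DefinitionFileSketch, which is hypothetical. The sketch only inspects keys; it does not reproduce how Aspire resolves loops.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Aspire: loads a fragment of the
// definition and reads back the loop names and the file name.
class DefinitionFileSketch {
    // Each value is written on a single line. In Java source a backslash is
    // escaped once more, so "\\\\" here is "\\" in the properties text.
    static final String DEFINITION =
          "request.ihdsTest.className=com.ai.htmlgen.DBHashTableFormHandler1\n"
        + "request.ihdsTest.loopNames=works\n"
        + "request.ihdsTest.works.loopNames=childloop1\n"
        + "request.ihdsTest.works.query_request.className=com.ai.data.RowFileReader\n"
        + "request.ihdsTest.works.query_request.filename="
        + "aspire:\\\\samples\\\\pop-table-tags\\\\properties\\\\pop-table.data\n";

    static Properties load() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(DEFINITION));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load();
        System.out.println("top-level loops: "
            + props.getProperty("request.ihdsTest.loopNames"));
        System.out.println("children of works: "
            + props.getProperty("request.ihdsTest.works.loopNames"));
        // "\\" in properties text is an escape, so the loaded value
        // contains single backslashes.
        System.out.println("file: "
            + props.getProperty("request.ihdsTest.works.query_request.filename"));
    }
}
```

Under the standard properties syntax, a value may continue onto the next line only when the line ends with a backslash, so a long file name is safest kept on one line.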

Let me call this file ihds-test.properties. Include this file in Aspire's master aspire.properties as follows:

application.includeFiles=aspire:\\samples\\hello-world\\properties\\hello-world.properties,\
aspire:\\samples\\ihds-test\\ihds-test.properties,\
aspire:\\samples\\xml-reader\\xml-reader.properties

For completeness, the listing also shows the entries that come before and after the ihds-test line.

Call your Definition and Receive an ihds

Now that we have the definition, how do we call it from Java? Reading that first article will help considerably, but here is the Java code:

Hashtable args = new Hashtable();
args.put("key1".toLowerCase(), "value1");

IFactory factory = AppObjects.getFactory();
ihds hds         = (ihds)factory.getObject("ihdsTest",args);

// use ihds

Aspire has a factory service, represented by the IFactory interface. This interface lets you invoke a Java class, identified by a symbolic name (here ihdsTest), with any arguments passed in as a Hashtable. The downstream relational adapters expect the argument keys to be lowercase strings, which is why the key is lowercased before it is put in the table.
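
The idea of resolving a symbolic name to a configured Java class can be sketched with plain reflection. This is not Aspire's IFactory: SymbolicFactorySketch is a hypothetical class, it ignores the argument table that the real factory passes along, and it uses a JDK class in place of an Aspire handler so that it runs without the Aspire JAR.

```java
import java.util.Properties;

// Hypothetical illustration of a name-to-class factory; Aspire's real
// IFactory also passes the argument table to the created object.
class SymbolicFactorySketch {
    private final Properties config;

    SymbolicFactorySketch(Properties config) { this.config = config; }

    // Looks up request.<name>.className and instantiates that class
    // through its no-argument constructor.
    Object getObject(String symbolicName) {
        String key = "request." + symbolicName + ".className";
        String className = config.getProperty(key);
        if (className == null) {
            throw new IllegalArgumentException("No definition for " + key);
        }
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        // A JDK class stands in for an Aspire handler class.
        config.setProperty("request.demo.className", "java.util.ArrayList");
        Object created = new SymbolicFactorySketch(config).getObject("demo");
        System.out.println(created.getClass().getName());
    }
}
```

Keeping the class name in configuration means calling code depends only on the symbolic name, so the implementing class can be swapped without recompiling the caller.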

Example Code for Exploring the ihds API

The following code will walk through the ihds tree, printing it out:

import com.ai.htmlgen.*;
import com.ai.common.TransformException;
import java.io.*;
import com.ai.data.*;

	// above code removed for clarity 

    public static void staticTransform(ihds data, PrintWriter out) 
           throws TransformException
    {
        try
        {
            writeALoop("MainData",data,out,"");
        }
        catch(DataException x)
        {
            throw new TransformException(
				"Error: DebugTextTransform: Data Exception",x);
        }
    }
    
    /**********************************************************
     * A recursive function to write out a loop worth of ihds
     **********************************************************
     */
     
    private static void writeALoop(
		String loopname, ihds data, PrintWriter out, String is)
            throws DataException
    {
        println(out,is, ">> Writing data for loop:" + loopname);

        // write metadata
        IMetaData m = data.getMetaData();
        IIterator columns = m.getIterator();

        StringBuffer colBuffer = new StringBuffer();
        for(columns.moveToFirst();!columns.isAtTheEnd();columns.moveToNext())
        {
            String columnName = (String)columns.getCurrentElement();
            colBuffer.append(columnName).append("|");
        }
        println(out,is,colBuffer.toString());

        //write individual rows
        for(data.moveToFirst();!data.isAtTheEnd();data.moveToNext())
        {
            StringBuffer rowBuffer = new StringBuffer();
            for(columns.moveToFirst();!columns.isAtTheEnd();columns.moveToNext())
            {
                String columnName = (String)columns.getCurrentElement();
                rowBuffer.append(data.getValue(columnName));
                rowBuffer.append("|");
            }
            println(out,is,rowBuffer.toString());

            // recursive call to print children
            IIterator children = data.getChildNames();
            for(children.moveToFirst();!children.isAtTheEnd();children.moveToNext())
            {
                // for each child
                String childName = (String)children.getCurrentElement();
                ihds child = data.getChild(childName);
                writeALoop(childName,child,out,is + "\t");
            }
        }

        println(out,is,">> Writing data for loop:" + loopname + " is complete");
    }
    
    private static void println(PrintWriter out, String indentationString, 
	                            String line)
    {
        out.print(indentationString);
        out.print(line);
        out.print("\n");
    }

    // code removed for clarity

How to Use ihds Under Tomcat

The facilities presented so far demonstrate accessing Hierarchical Data Sets anywhere in Java code, including command-line applications. When Aspire is initialized under Tomcat, it goes a step further and allows you to include data sets directly in your web pages. Currently supported formats include classic XML, object XML, text, and Excel data. Formats planned for the near future include Java class definitions to match the object XML, XSD, and generic HTML pages.

Before being able to obtain your web pages in one of these formats, you need to know how to initialize Aspire under Tomcat. Besides the article referenced above, see "Improve Your Career with Tomcat and Aspire." Once this is accomplished, your remaining work is to:

  1. Link your Hierarchical Data Set definition to a URL in the configuration file.
  2. Invoke the URL with the desired type of data format.

Link Your Hierarchical Data Set Definition to a URL

Add this section to the existing data definition configuration file:

###################################
# ihdsTestURL: linking to a URL
###################################
ihdsTestURL=aspire:\\samples\\ihds-test\\ihds-default-html-template.html
ihdsTestURL.formHandlerName=ihdsTest
request.ihdsTest.form_handler.class_request.className=com.ai.htmlgen.DBHashTableFormHandler1

There are two parts to a URL defined in Aspire: the data source and the data transformation. Aspire can transform data using JSP, XSLT, or tags. The default transformation, tags, requires a template filename that includes the tags. The first line indicates a transformation file for the data. The second line points to a data definition called ihdsTest, which is defined elsewhere in the same file. The third line repeats what line 1 of section 1 of the data definition already says; the duplication exists for backward compatibility within Aspire.

Ensure that the Java Classes Responsible for Transformations Are Present in the Properties Files

Aspire allows a Hierarchical Data Set to be transformed in one of two ways, generic or page-specific. This example definition is page-specific, because the presented HTML template is specific to that page. A generic transformation will take any Hierarchical Data Set belonging to any page and transform it in a generic manner. Generic transformations are included in the configuration file as follows:

# Generic transform support
# XML output
GenericTransform.Classic-xml.classname=com.ai.xml.FormHandlerToXMLTransform
GenericTransform.Object-xml.classname=com.ai.generictransforms.ObjectXMLGenericTransform

# Excel output
GenericTransform.Excel.classname=com.ai.generictransforms.ExcelGenericTransform

# Text
GenericTransform.Text.classname=com.ai.generictransforms.DebugTextTransform

These definitions are usually included in the master aspire.properties file.

Invoke the URL with a Proper Output Format Parameter

Once the URL is defined, you can see the resulting HTML page by calling the defined URL as follows:

http://yourhost:yourport/your-webapp/servlet/DisplayServlet?url=ihdsTestURL

This will produce an HTML page. Say that we want to call the URL and obtain the data as classic XML; simply add the following additional argument to the above URL:

&aspire_output_format=classic-xml

For Excel data, do something similar:

&aspire_output_format=Excel
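
The URL variants above can also be assembled in code. The following is a minimal sketch, assuming the servlet path and parameter names shown above; the class name OutputFormatUrlSketch and the localhost base address are hypothetical placeholders.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Aspire: builds DisplayServlet URLs.
class OutputFormatUrlSketch {
    // webAppBase is a placeholder such as "http://yourhost:yourport/your-webapp".
    // Pass null as outputFormat to request the default HTML page.
    static String buildUrl(String webAppBase, String urlName, String outputFormat) {
        StringBuilder url = new StringBuilder(webAppBase);
        url.append("/servlet/DisplayServlet?url=").append(encode(urlName));
        if (outputFormat != null) {
            url.append("&aspire_output_format=").append(encode(outputFormat));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://localhost:8080/your-webapp"; // hypothetical host and port
        System.out.println(buildUrl(base, "ihdsTestURL", null));
        System.out.println(buildUrl(base, "ihdsTestURL", "classic-xml"));
        System.out.println(buildUrl(base, "ihdsTestURL", "Excel"));
    }
}
```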

The key is to map the value of the aspire_output_format argument to a Java class through a GenericTransform entry in the properties file. Writing such a generic transformation to suit your own output needs is straightforward. The following example shows the implementation of the Excel generic transform.

Establishing Your Own Output Formats Or Implementing Your Own Generic Data Transform

package com.ai.generictransforms;
import com.ai.htmlgen.*;
import com.ai.common.TransformException;
import java.io.*;
import com.ai.data.*;
import javax.servlet.http.*;

public class ExcelGenericTransform 
      extends AHttpGenericTransform 
      implements IFormHandlerTransform
{
    private static String s_separator = "\t";
    protected String getDerivedHeaders(HttpServletRequest request)
    {
        // Concatenated because a Java string literal cannot span lines.
        return "Content-Type=application/vnd.ms-excel"
             + "|Content-Disposition=filename=aspire-hierarchical-dataset.xls";
    }
    public void transform(ihds data, PrintWriter out) 
                    throws TransformException
    {
        staticTransform(data,out);
    }
    public void transform(IFormHandler data, PrintWriter out)
           throws TransformException
    {
        staticTransform((ihds)data,out);
    }
    public static void staticTransform(ihds data, PrintWriter out) 
           throws TransformException
    {
        try
        {
            writeALoop("MainData",data,out,"");
        }
        catch(DataException x)
        {
            throw new TransformException(
                "Error: ExcelGenericTransform: Data Exception",x);
        }
    }
    private static void writeALoop(String loopname, 
                                   ihds data, 
                                   PrintWriter out, 
                                   String is)
            throws DataException
    {
        println(out,is, ">> Writing data for loop:" + loopname);

        // write metadata
        IMetaData m = data.getMetaData();
        IIterator columns = m.getIterator();

        StringBuffer colBuffer = new StringBuffer();
        for(columns.moveToFirst();!columns.isAtTheEnd();columns.moveToNext())
        {
            String columnName = (String)columns.getCurrentElement();
            colBuffer.append(columnName).append(s_separator);
        }
        println(out,is,colBuffer.toString());

        //write individual rows
        for(data.moveToFirst();!data.isAtTheEnd();data.moveToNext())
        {
            StringBuffer rowBuffer = new StringBuffer();
            for(columns.moveToFirst();!columns.isAtTheEnd();columns.moveToNext())
            {
                String columnName = (String)columns.getCurrentElement();
                rowBuffer.append(data.getValue(columnName));
                rowBuffer.append(s_separator);
            }
            println(out,is,rowBuffer.toString());

            // recursive call to print children
            IIterator children = data.getChildNames();
            for(children.moveToFirst();!children.isAtTheEnd();children.moveToNext())
            {
                //for each child
                String childName = (String)children.getCurrentElement();
                ihds child = data.getChild(childName);
                writeALoop(childName,child,out,is + "\t");
            }
        }

        println(out,is,">> Writing data for loop:" + loopname + " is complete");
    }
    private static void println(PrintWriter out, String 
          indentationString, String line)
    {
        out.print(indentationString);
        out.print(line);
        out.print("\n");
    }
}

The Implications of Hierarchical Data Sets and Aspire to the Tomcat Development Community

The implications of these facilities are quite exciting to the Tomcat developer community. If page developers use this mechanism to retrieve data, they can put a series of data icons at the top of each page that allow end users to retrieve data in their preferred format. End users will benefit from Excel output as they can now work with data in their spreadsheets. B2B users can retrieve data as XML. Java and other programmers can retrieve the data binding as Java classes and can choose to work with objects, as opposed to XML.

All of the documented facilities are available for Tomcat developers free of cost and in a very small package. For a large number of students and entry-level programmers, this means that they can download a couple of megs of Tomcat and Aspire and sit with a tool like Dreamweaver and be immediately productive with any database of their choice.

As they progress in their learning experience, they can start writing plug-ins and other sophisticated Java programs that can do some specialized work while the basics are supplied by the framework. This ladder-like approach to learning Java, J2EE, XML, and the Enterprise is good.

A Demo URL that Demonstrates the Workings of Hierarchical Data Sets

Aspire's sample pages at Indent, Inc. should include a set of pages demonstrating Hierarchical Data Sets by the time this article is published. You may have to scroll down to find the section on Hierarchical Data Sets, as the same URL demonstrates a few other features as well.

Give Me Your Valuable Feedback

I would be delighted to hear from you if you see any architectural anomalies in Hierarchical Data Sets, or if you have thoughts on their potential in programming. You can email me any time at satya@activeintellect.com.

Satya Komatineni is the CTO at Indent, Inc. and the author of Aspire, an open source web development RAD tool for J2EE/XML.



Copyright © 2009 O'Reilly Media, Inc.