ONJava.com -- The Independent Source for Enterprise Java

Using Hierarchical Data Sets with Aspire and Tomcat

by Satya Komatineni
03/05/2003

What are Hierarchical Data Sets and Why Do You Care?

Hierarchical Data Sets are not new. They already exist in the form of CICS transactional data, files in directories, and plain Java objects, as well as the obvious XML. In the XML Journal in early 2001, I floated the idea that programmers can benefit from hierarchical data abstractions even though many of their data sources are predominantly relational (such as databases including MySQL, Oracle, SQL Server, DB2, etc.).

The .NET world has a similar idea taking root in the notion of "datasets." Although there are important differences between my proposed Hierarchical Data Sets and the nature of Microsoft's datasets, it is evident that Hierarchical Data Sets enhance relational abstractions with richer detail.

This article examines the structure of, and a Java API for, Hierarchical Data Sets. Unlike the XML Journal article of two years ago, this one gives you a piece of executable code with which to start taking advantage of Hierarchical Data Sets. Although programmers can code in Java to access various data sources and construct the final Hierarchical Data Set, this article includes an implementation that you can readily use to construct these Hierarchical Data Sets declaratively by simply composing pre-built relational adapters. Relational adapters include file readers, SQL readers, stored procedure readers, et cetera.

The question you're probably asking is "What good are these Hierarchical Data Sets?" Although they can't rival the salutary effects of large, expensive pieces of carbon on your most certainly deserving companions, Hierarchical Data Sets are quite useful in the programming world. For starters, a single Hierarchical Data Set can supply all of the data for an entire HTML page. In an MVC model, a controller servlet can deliver a Hierarchical Data Set to a JSP page, which will paint it without further ado. For a warmup, the controller servlet can convert the data set to XML and return it directly to the caller. For the appeal, the Hierarchical Data Set can be converted to Excel. For the stylish, it can be redirected to a reporting engine or a charting engine that supports XML data.

Although the primary focus of the article is the Java programming API for Java programmers, Hierarchical Data Sets can be used by non-Java programmers quite effectively to obtain XML, HTML, or Excel formats directly from relational databases and other data sources by using a J2EE server such as Tomcat. Without further ado, let us investigate the structure of Hierarchical Data Sets and see how these data sets can be obtained declaratively (while relaxing your programming muscles a bit).

Structure of Hierarchical Data

A Hierarchical Data Structure can be conceptually represented as a Java API, or XML, or some other format. It is easiest to visualize as XML.

<AspireDataSet>
    <!-- A set of key value pairs at the root level -->
    <key1>val1</key1>   
    <key2>val2</key2>

    <!-- A set of named loops -->
    <loop name="loop1">
    </loop>
    <loop name="loop2">
    </loop>
</AspireDataSet>

At the root, a Hierarchical Data Set is a set of key/value pairs, and a given set of key/value pairs can yield n independent loops. Each loop is essentially a table of data, so the term "loop" is synonymous with "table." I avoid "table" because readers might take it to mean only data from a relational table. Since a loop is a collection of rows (a RowSet!), let us look closer at the structure of a loop:

<loop name="loopname">
    <row>
        <!-- a set of key value pairs -->
        <key1>val1</key1>
        <key2>val2</key2>

        <!-- a set of named loops -->
        <loop name="loopname1">
        </loop>

        <!-- a set of named loops -->
        <loop name="loopname2">
        </loop>
    </row>
    <row>
    </row>
</loop>

The only odd thing here is the structure of a row. As expected, a row is a collection of key/value pairs. Here, however, a row also carries its own set of n independent loops, each made of rows with the same shape. This recursive extension can produce trees of any depth. (Or should I say, height!)
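The recursive shape described above can be sketched in a few lines of Java. This is only an illustration of the structure, not Aspire code: the names SketchRow, SketchLoop, and HdsShapeSketch, as well as the sample keys and loop names, are hypothetical, and the renderer does no XML escaping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types for illustration; these are not Aspire classes.
class SketchRow {
    final Map<String, String> values = new LinkedHashMap<>();
    final Map<String, SketchLoop> loops = new LinkedHashMap<>();

    SketchRow put(String key, String value) { values.put(key, value); return this; }
    SketchRow add(SketchLoop loop) { loops.put(loop.name, loop); return this; }
}

class SketchLoop {
    final String name;
    final List<SketchRow> rows = new ArrayList<>();

    SketchLoop(String name) { this.name = name; }
    SketchLoop add(SketchRow row) { rows.add(row); return this; }
}

class HdsShapeSketch {
    // Writes a row's key/value pairs, then recurses into its named loops.
    static void renderRow(SketchRow row, String indent, StringBuilder out) {
        for (Map.Entry<String, String> e : row.values.entrySet()) {
            out.append(indent).append('<').append(e.getKey()).append('>')
               .append(e.getValue()) // no XML escaping in this sketch
               .append("</").append(e.getKey()).append(">\n");
        }
        for (SketchLoop loop : row.loops.values()) {
            renderLoop(loop, indent, out);
        }
    }

    // Writes one loop element with a <row> element per row.
    static void renderLoop(SketchLoop loop, String indent, StringBuilder out) {
        out.append(indent).append("<loop name=\"").append(loop.name).append("\">\n");
        for (SketchRow row : loop.rows) {
            out.append(indent).append("  <row>\n");
            renderRow(row, indent + "    ", out);
            out.append(indent).append("  </row>\n");
        }
        out.append(indent).append("</loop>\n");
    }

    // The root behaves like a single row: key/value pairs plus named loops.
    static String render(SketchRow root) {
        StringBuilder out = new StringBuilder("<AspireDataSet>\n");
        renderRow(root, "  ", out);
        return out.append("</AspireDataSet>\n").toString();
    }

    // Sample data with made-up names: one "orders" row holding two "items" rows.
    static SketchRow sample() {
        SketchLoop items = new SketchLoop("items")
            .add(new SketchRow().put("sku", "a"))
            .add(new SketchRow().put("sku", "b"));
        SketchLoop orders = new SketchLoop("orders")
            .add(new SketchRow().put("id", "1").add(items));
        return new SketchRow().put("key1", "val1").add(orders);
    }

    public static void main(String[] args) {
        System.out.print(render(sample()));
    }
}
```

Because a row owns loops and a loop owns rows, the two render methods call each other, which is all it takes to reach any depth.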

Structure of Hierarchical Data in Java

Having shown the hierarchical data as XML, I run the risk that people will take a Hierarchical Data Set to be literally XML and, hence, literally a DOM and, hence, a lot of memory inside the JVM. No need to panic. The Hierarchical Data Set can have its own Java API and need not be represented as a DOM. Most of the time it is a forward-only, cursor-like, lazily loaded tree. Here is a working Java API for a Hierarchical Data Set:

package com.ai.htmlgen;
import com.ai.data.*;

/**
 * Represents a Hierarchical Data Set.
 * An hds is a collection of rows.
 * You can step through the rows using ILoopForwardIterator
 * You can find out about the columns via IMetaData.
 * An hds is also a collection of loops originating from the current row.
 */
public interface ihds extends ILoopForwardIterator
{
    /**
     * Returns the parent if available
     * Returns null if there is no parent
     */
    public ihds getParent() throws DataException;

    /**
     * For the current row return a set of 
     * child loop names. ILoopForwardIterator determines
     * what the current row is.
     * 
     * @see ILoopForwardIterator
     */
    public IIterator getChildNames() throws DataException;

    /**
     * Given a child name return the child Java object 
     * represented by ihds again
     */
    public ihds getChild(String childName) throws DataException;

    /**
     * returns a column that is similar to SUM, AVG etc of a 
     * set of rows that are children to this row.
     */
    public String getAggregateValue(String keyname) throws DataException;

    /**
     * Returns the column names of this loop or table.
     * @see IMetaData
     */
    public IMetaData getMetaData() throws DataException;

    /**
     * Releases any resources that may be held by this loop of data 
     * or table.
     */
    public void close() throws DataException;
}

The interface name ihds is short for "Interface to Hierarchical Data Set." This API allows you to step through your loops recursively. An implementation has the option to load the loops only when they are requested. It can also support either forward-only or random traversal. Before going further, let me present the two additional interfaces that this API uses: ILoopForwardIterator and IMetaData.

How to Move Through a Series of Rows in HDS: ILoopForwardIterator

package com.ai.htmlgen;
import com.ai.data.*;

public interface ILoopForwardIterator
{
   /**
    * getValue from the current row matching the key
    */
   public String getValue(final String key);

   public void moveToFirst() throws DataException;

   public void moveToNext() throws DataException;

   public boolean isAtTheEnd() throws DataException;
}
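The three navigation methods suggest a simple for-loop idiom for visiting every row. The sketch below shows that idiom against an in-memory list. ForwardRows and ListForwardRows are hypothetical stand-ins, not Aspire classes: the stand-in interface drops DataException so the example compiles on its own, and it assumes that isAtTheEnd() becomes true once the cursor has moved past the last row, which the interface above does not spell out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in that mirrors ILoopForwardIterator without DataException.
interface ForwardRows {
    String getValue(String key);
    void moveToFirst();
    void moveToNext();
    boolean isAtTheEnd();
}

// Assumed in-memory implementation, for illustration only.
class ListForwardRows implements ForwardRows {
    private final List<Map<String, String>> rows;
    private int index = 0;

    ListForwardRows(List<Map<String, String>> rows) { this.rows = rows; }

    public String getValue(String key) { return rows.get(index).get(key); }
    public void moveToFirst() { index = 0; }
    public void moveToNext() { index++; }
    // ASSUMPTION: "at the end" means the cursor is past the last row.
    public boolean isAtTheEnd() { return index >= rows.size(); }
}

class ForwardIterationSketch {
    // Visits every row once and reads one column from each.
    static List<String> column(ForwardRows rows, String key) {
        List<String> out = new ArrayList<>();
        for (rows.moveToFirst(); !rows.isAtTheEnd(); rows.moveToNext()) {
            out.add(rows.getValue(key));
        }
        return out;
    }

    public static void main(String[] args) {
        ForwardRows rows = new ListForwardRows(Arrays.asList(
            Collections.singletonMap("name", "alpha"),
            Collections.singletonMap("name", "beta")));
        System.out.println(column(rows, "name"));
    }
}
```

With the real interface, the same loop would sit inside a try block that handles DataException.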

IMetaData: For Reading Column Names

package com.ai.data;
public interface IMetaData 
{
    public IIterator getIterator();

    public int       getColumnCount();

    public int       getIndex(final String attributeName)
        throws FieldNameNotFoundException;
}
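As noted earlier, an implementation may load child loops only when they are requested. One way such laziness could be arranged is sketched below. LazyNode and LazyLoadingSketch are hypothetical names and are not how Aspire implements ihds. The sketch also ignores rows entirely and only shows deferred creation of named children through getChild.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a node in the tree; not an Aspire class.
class LazyNode {
    private final String name;
    private final Map<String, Supplier<LazyNode>> loaders = new LinkedHashMap<>();
    private final Map<String, LazyNode> loaded = new LinkedHashMap<>();

    LazyNode(String name) { this.name = name; }

    String getName() { return name; }

    // Registers how to build a child without building it yet.
    LazyNode register(String childName, Supplier<LazyNode> loader) {
        loaders.put(childName, loader);
        return this;
    }

    // Child names are known up front even though the children are not built.
    Iterable<String> getChildNames() { return loaders.keySet(); }

    // Builds the child on first request and reuses it afterwards.
    LazyNode getChild(String childName) {
        Supplier<LazyNode> loader = loaders.get(childName);
        if (loader == null) {
            throw new IllegalArgumentException("No child loop named " + childName);
        }
        return loaded.computeIfAbsent(childName, key -> loader.get());
    }
}

class LazyLoadingSketch {
    static int loads = 0; // counts how many child loops were actually built

    // Two made-up child loops; each loader bumps the counter when it runs.
    static LazyNode sample() {
        return new LazyNode("root")
            .register("works", () -> { loads++; return new LazyNode("works"); })
            .register("audit", () -> { loads++; return new LazyNode("audit"); });
    }

    public static void main(String[] args) {
        LazyNode root = sample();
        System.out.println("loads before access: " + loads);
        root.getChild("works");
        root.getChild("works");
        System.out.println("loads after access: " + loads);
    }
}
```

A caller that never asks for the "audit" loop never pays for reading it, which is the point of deferring the work to getChild.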

How Can You Obtain a Hierarchical Data Set, So You Can Use It?

Now that we know the structure of a Hierarchical Data Set, how do you get hold of one? As I stated earlier, this is easy under Aspire. The steps are as follows:

  1. Learn the basics of Aspire.
  2. Create a definition file for your Hierarchical Data Set.
  3. Call your definition and receive ihds in your Java code.

Each of these steps is explained in some detail below.

Read the Basics on the Usage of the Aspire JAR

Aspire is a small JAR file that can complement your Java programming, particularly when used with an app server such as Tomcat. At the heart of Aspire is a set of configuration files, where you declare your data access mechanisms in terms of Java classes and arguments to those Java classes. Aspire will execute those Java classes and return the resulting objects. Hierarchical Data Sets are no exception.

An earlier O'Reilly article introduced Aspire: "For Tomcat Developers, Aspire Comes in a JAR." This will familiarize you with defining databases and calling SQL and Stored Procedures, as well as configuring and initializing Aspire.

Create a Definition File for Your Hierarchical Data Set

A sample definition for a Hierarchical Data Set is as follows:

###################################
# ihdsTest data definition: section1
###################################
request.ihdsTest.className=com.ai.htmlgen.DBHashTableFormHandler1
request.ihdsTest.loopNames=works

#section2
request.ihdsTest.works.class_request.className=com.ai.htmlgen.GenericTableHandler6
request.ihdsTest.works.loopNames=childloop1
request.ihdsTest.works.query_request.className=com.ai.data.RowFileReader
request.ihdsTest.works.query_request.filename=aspire:\\samples\\pop-table-tags\\properties\\pop-table.data

#section3
request.childloop1.class_request.className=com.ai.htmlgen.GenericTableHandler6
request.childloop1.query_request.className=com.ai.data.RowFileReader
request.childloop1.query_request.filename=aspire:\\samples\\pop-table-tags\\properties\\pop-table.data

This definition has three sections. The data set is named ihdsTest. The first section tells Aspire that the Java class com.ai.htmlgen.DBHashTableFormHandler1 is responsible for returning an object implementing ihds. Unless you code your own implementation of ihds, you will use this class in every data set definition. It's the pre-fabricated class that knows how to compose relational assets into hierarchical assets. Line 2 of section 1 tells DBHashTableFormHandler1 that this main data set has one loop called works.

Section2 defines the loop works. A loop structure in Aspire uses two Java classes: a class request (GenericTableHandler6) and a query request (RowFileReader). RowFileReader reads a set of records from a flat file and makes them look like a collection of rows and columns. GenericTableHandler6 takes this collection, adds features such as aggregate values and row numbers, and implements the ihds interface at the loop level. As with DBHashTableFormHandler1, GenericTableHandler6 is present in most definitions. RowFileReader might change, depending on your data sources. For example, the following parts exist in this category:

  1. RowFileReader.
  2. DBRequestExecutor2 (for reading SQL).
  3. StoredProcedureExecutor2 (for reading from Stored Procedures).
  4. XMLReader (for reading XML files).
  5. Or, you can write your own reader that implements IDataCollection.

Section2 also indicates that it has a child called childloop1. GenericTableHandler6 will take this cue and look for section3, identified by childloop1.

Section3 defines childloop1. The definition is identical to section2, except that childloop1 has no children. Both section2 and section3 use RowFileReaders. In practice, they can use any combination of data reader parts.
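To make the chaining of sections concrete, the sketch below follows the loopNames entries of a trimmed copy of the definition and prints the resulting loop tree. It uses only java.util.Properties and is not Aspire's own lookup code. Two things are assumptions on my part: that loopNames may hold a comma-separated list, and that a child loop is looked up first under its parent's prefix and then at the top level as request.<name>, which is merely consistent with where works and childloop1 are defined above. LoopNameSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class LoopNameSketch {
    // Trimmed copy of the definition above: only the keys this sketch needs.
    static final String DEFINITION =
          "request.ihdsTest.loopNames=works\n"
        + "request.ihdsTest.works.loopNames=childloop1\n"
        + "request.childloop1.query_request.classname=com.ai.data.RowFileReader\n";

    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // True if at least one key is defined under the given prefix.
    static boolean hasKeys(Properties props, String prefix) {
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(prefix + ".")) {
                return true;
            }
        }
        return false;
    }

    // Appends the loops declared under 'prefix', indented by depth.
    // There is no cycle detection; a loop that names itself would recurse forever.
    static void collect(Properties props, String prefix, int depth, List<String> out) {
        String names = props.getProperty(prefix + ".loopNames");
        if (names == null) {
            return; // no children declared under this prefix
        }
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        for (String raw : names.split(",")) { // ASSUMPTION: comma-separated list
            String name = raw.trim();
            if (name.isEmpty()) {
                continue;
            }
            out.add(indent + name);
            // ASSUMPTION: prefer a section nested under the parent, else top level.
            String nested = prefix + "." + name;
            String next = hasKeys(props, nested) ? nested : "request." + name;
            collect(props, next, depth + 1, out);
        }
    }

    static List<String> tree(String dataSetName) {
        List<String> out = new ArrayList<>();
        collect(load(DEFINITION), "request." + dataSetName, 0, out);
        return out;
    }

    public static void main(String[] args) {
        for (String line : tree("ihdsTest")) {
            System.out.println(line);
        }
    }
}
```

For the trimmed definition, the tree has works at the top and childloop1 nested one level beneath it, matching the description of section2 and section3.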

Let me call this file ihds-test.properties. Include this file in Aspire's master aspire.properties as follows:

application.includeFiles=aspire:\\samples\\hello-world\\properties\\hello-world.properties,\
aspire:\\samples\\ihds-test\\ihds-test.properties,\
aspire:\\samples\\xml-reader\\xml-reader.properties

For the sake of completeness, I have also shown the entries that come immediately before and after the ihds-test.properties line.

Call Your Definition and Receive an ihds

Now that we have the definition, how do we call it from Java? Reading the earlier Aspire article will help considerably, but here is the Java code:

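// Arguments for the downstream relational adapters (java.util.Hashtable).
// The key is lowercased before it is stored.
Hashtable args = new Hashtable();
args.put("key1".toLowerCase(), "value1");

// Ask the factory for the object defined under the symbolic name "ihdsTest".
IFactory factory = AppObjects.getFactory();
ihds hds         = (ihds)factory.getObject("ihdsTest",args);

// use ihds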

Aspire has a factory service, represented by the IFactory interface. This interface lets you invoke a Java class identified by a symbolic name, here ihdsTest, with any arguments passed in as a hashtable. The downstream relational adapters expect the argument keys to be lowercase strings.
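If your arguments arrive with mixed-case keys, for example from request parameters, a small helper can normalize them before the factory call. The sketch below is a hypothetical convenience, not part of Aspire. It follows the code above in lowercasing only the keys, and it uses Locale.ROOT so the result does not depend on the default locale.

```java
import java.util.Hashtable;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class LowerCaseArgsSketch {
    // Copies the arguments into a Hashtable whose keys are all lowercase.
    // Hashtable rejects null keys and values, so the input must not contain any.
    static Hashtable<String, Object> withLowerCaseKeys(Map<String, ?> args) {
        Hashtable<String, Object> out = new Hashtable<>();
        for (Map.Entry<String, ?> e : args.entrySet()) {
            out.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("OrderId", "42"); // made-up argument name and value
        System.out.println(withLowerCaseKeys(raw));
    }
}
```

The returned Hashtable can then be handed to factory.getObject in place of args. If two keys differ only in case, the later one wins.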
