ONJava.com -- The Independent Source for Enterprise Java

The Hidden Gems of Jakarta Commons, Part 1

by Timothy M. O'Brien

If you are not familiar with the Jakarta Commons, you have likely reinvented a few wheels. Before you write any more generic frameworks or utilities, grok the Commons. It will save you serious time. Too many people write a StringUtils class that duplicates methods available in Commons Lang's StringUtils, and developers unknowingly recreate the utilities in Commons Collections even though commons-collections.jar is already on the classpath. Seriously, take a break. Check out the Commons Collections API and then go back to your task; I promise you'll find something simple that will save you a week over the next year. If people just took some time to look at Jakarta Commons, we would have much less code duplication--we'd start making good on the real promise of reuse. I've seen it happen; somebody digs into Commons BeanUtils or Commons Collections and invariably has an "Oh, if I had only known about this, I wouldn't have written 10,000 lines of code" moment. Parts of Jakarta Commons still remain a mystery to most; for instance, many have yet to hear of Commons CLI or Commons Configuration, and most have yet to notice the valuable functors package in Commons Collections. In this series, I emphasize some of the less-appreciated tools and utilities in the Jakarta Commons.

In this first part of the series, I explore XML rule set definitions in the Commons Digester, functors available in Commons Collections, and an interesting application of Commons JXPath: querying a List of objects. Jakarta Commons contains utilities that aim to help you solve problems at the lowest level of programming: iterating over collections, parsing XML, and selecting objects from a List. I would encourage you to spend some time focusing on these small utilities, as learning about the Jakarta Commons will save you a substantial amount of time. It isn't simply about using Commons Digester to parse XML or using CollectionUtils to filter a collection with a Predicate. You will start to see benefits once you realize how to combine the power of these utilities and how to relate Commons projects to your own applications; once this happens, you will come to see commons-lang.jar, commons-beanutils.jar, and commons-digester.jar as just as indispensable to any system as the JVM itself.

Related Reading

Jakarta Commons Cookbook
By Timothy M. O'Brien

If you are interested in learning more about the Jakarta Commons, check out the Jakarta Commons Cookbook. This book is full of recipes that will get you hooked on the Commons, and tells you how to use Jakarta Commons in concert with other small open source components such as Velocity, FreeMarker, Lucene, and Jakarta Slide. In this book, I introduce a wide array of tools from Jakarta Commons, from simple utilities in Commons Lang to a combination of Commons Digester, Commons Collections, and Jakarta Lucene that searches the works of William Shakespeare. I hope this series and the Jakarta Commons Cookbook provide you with some interesting solutions for low-level programming problems.

1. XML-Based Rule Sets for Commons Digester

Commons Digester 1.6 provides one of the easiest ways to turn XML into objects. Digester has already been introduced on the O'Reilly network in two articles: "Learning and Using Jakarta Digester," by Philipp K. Janert, and "Using the Jakarta Commons, Part 2," by Vikram Goyal. Both articles demonstrate the use of XML rule sets, but this idea of defining rule sets in XML has not caught on. Most sightings of the Digester appear to define rule sets programmatically, in compiled code. You should avoid hard-coding Digester rule sets in compiled Java code when you have the opportunity to store such mapping information in an external file or a classpath resource. Externalizing a Digester rule set makes it easier to adapt to an evolving XML document structure or an evolving object model.

To demonstrate the difference between defining rule sets in XML and defining rule sets in compiled code, consider a system to parse XML to a Person bean with three properties--id, name, and age, as defined in the following class:

package org.test;

// Simple bean; Digester populates it through the setters below
public class Person {
  private String id;
  private String name;
  private int age;

  public Person() {}

  public String getId() { return id; }
  public void setId(String id) {
    this.id = id;
  }

  public String getName() { return name; }
  public void setName(String name) {
    this.name = name;
  }

  public int getAge() { return age; }
  public void setAge(int age) {
    this.age = age;
  }
}

Assume that your application needs to parse an XML file containing multiple person elements. The following XML file, data.xml, contains three person elements that you would like to parse into Person objects:

<people>
  <person id="1">
    <name>Tom Higgins</name>
  </person>
  <person id="2">
    <name>Barney Smith</name>
  </person>
  <person id="3">
    <name>Susan Shields</name>
  </person>
</people>

You expect the structure and content of this XML file to change over the next few months, and you would prefer not to hard-code the structure of the XML document in compiled Java code. To do this, you need to define Digester rules in an XML file that is loaded as a resource from the classpath. The following XML document, person-rules.xml, maps the person element to the Person bean:

<digester-rules>
  <pattern value="people/person">
    <object-create-rule classname="org.test.Person"/>
    <set-next-rule methodname="add"/>
    <set-properties-rule/>
    <bean-property-setter-rule pattern="name"/>
    <bean-property-setter-rule pattern="age"/>
  </pattern>
</digester-rules>

All this does is instruct the Digester to create a new instance of Person every time it encounters a person element, call add() to add this Person to an ArrayList, set any bean properties that match attributes on the person element, and set the name and age properties from the sub-elements name and age. You've seen the Person class, the XML document to be parsed, and the Digester rule definitions in XML form. Now you need to create an instance of Digester with the rules defined in person-rules.xml. The following code creates a Digester by passing the URL of the person-rules.xml resource to the DigesterLoader. Since the person-rules.xml file is a classpath resource in the same package as the class parsing the XML, the URL is obtained with a call to getClass().getResource(). The DigesterLoader then parses the rule definitions and adds these rules to the newly created Digester:

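import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.digester.Digester;
import org.apache.commons.digester.xmlrules.DigesterLoader;

// Configure Digester from XML ruleset; the name is resolved
// relative to the package of the calling class
URL rules = getClass().getResource("person-rules.xml");
Digester digester = 
    DigesterLoader.createDigester( rules );

// Push empty List onto Digester's Stack
List people = new ArrayList();
digester.push( people );

// Parse the XML document
InputStream input = new FileInputStream( "data.xml" );
digester.parse( input );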

Once the Digester has parsed the XML in data.xml, three Person objects should be in the people ArrayList.

The alternative to defining Digester rules in XML is to add them using the convenience methods on a Digester instance. Most articles and examples start with this method, adding rules using the addObjectCreate() and addBeanPropertySetter() methods on Digester. The following code adds the same rules that were defined in person-rules.xml:
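
A minimal sketch of that programmatic configuration, assuming the same people/person patterns used in person-rules.xml and the convenience methods of the Digester 1.x API. It is a fragment, not a complete program: it needs commons-digester on the classpath and the Person class shown earlier.

```java
import org.apache.commons.digester.Digester;
import org.test.Person;

// Sketch only: mirrors the XML rule set, one call per rule
Digester digester = new Digester();

// Create a Person for every person element
digester.addObjectCreate( "people/person", Person.class );

// Call add() on the object below Person on the stack (the List)
digester.addSetNext( "people/person", "add" );

// Map attributes such as id onto matching bean properties
digester.addSetProperties( "people/person" );

// Set name and age from the child elements
digester.addBeanPropertySetter( "people/person/name" );
digester.addBeanPropertySetter( "people/person/age" );
```

Note that every change to the document structure now means editing and recompiling this class, which is the cost the XML rule set avoids.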


If you have ever worked at an organization with 2500-line classes that parse a huge XML document with SAX, or a whole collection of classes to work with DOM or JDOM, you understand that, in the majority of cases, XML parsing is more complex than it needs to be. If you are building a highly efficient system with strict speed and memory requirements, you need the speed of a SAX parser. If you need the complexity of DOM Level 3, use a parser like Apache Xerces. But if you are simply trying to parse a few XML documents into objects, take a look at Commons Digester, and define your rule set in an XML file.
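
For contrast, here is a rough sketch of the hand-written SAX route for the same people/person structure, using only the JDK's javax.xml.parsers and org.xml.sax packages. The class name PersonSaxSketch and its nested Person holder are hypothetical names chosen for illustration; they are not part of Commons Digester or of the example above, and the sketch handles only the id attribute and the name element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PersonSaxSketch {

  // Hypothetical stand-in for org.test.Person so the sketch is self-contained
  public static class Person {
    public String id;
    public String name;
  }

  // Walks the document by hand, tracking the current person and element text
  public static List<Person> parse(String xml) throws Exception {
    final List<Person> people = new ArrayList<Person>();

    DefaultHandler handler = new DefaultHandler() {
      private Person current;
      private final StringBuilder text = new StringBuilder();

      @Override
      public void startElement(String uri, String local, String qName, Attributes atts) {
        text.setLength(0);
        if ("person".equals(qName)) {
          current = new Person();
          current.id = atts.getValue("id");
        }
      }

      @Override
      public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
      }

      @Override
      public void endElement(String uri, String local, String qName) {
        if ("name".equals(qName) && current != null) {
          current.name = text.toString().trim();
        } else if ("person".equals(qName)) {
          people.add(current);
          current = null;
        }
      }
    };

    SAXParserFactory.newInstance().newSAXParser()
        .parse(new InputSource(new StringReader(xml)), handler);
    return people;
  }
}
```

Every new element or property means another branch in the handler, whereas the Digester version only needs another line in the rule file.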

Any time you can move this type of configuration outside of compiled code, you should. I would encourage you to define your Digester rules in an XML file loaded either from the file system or the classpath. Doing so will make it easier to adapt your program to changes in the XML document and changes in your object model. For more information on defining Digester rules in an XML file, see Section 6.2 of the Jakarta Commons Cookbook, "Turning XML Documents into Objects."
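
One way to support both locations is a small helper that checks the file system first and falls back to the classpath. This is a sketch using only JDK classes; the RulesLocator class and its locate method are hypothetical names, and the file-first lookup order is an assumption rather than anything Digester prescribes. The URL it returns could then be handed to DigesterLoader in place of the getResource() call shown earlier.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class RulesLocator {

  // Returns a URL for the rule file: a file on disk wins,
  // otherwise the name is looked up as a classpath resource.
  // Returns null when neither location has the file.
  public static URL locate(String name) throws MalformedURLException {
    File file = new File(name);
    if (file.isFile()) {
      return file.toURI().toURL();
    }
    return RulesLocator.class.getClassLoader().getResource(name);
  }
}
```

With this in place, an operator can drop an updated person-rules.xml next to the application to override the copy packaged on the classpath, without recompiling anything.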
