ONJava.com -- The Independent Source for Enterprise Java

The Hidden Gems of Jakarta Commons, Part 1

2. Functors in Commons Collections

Functors are an interesting part of Commons Collections 3.1 for two reasons: they haven't received the attention they warrant, and they have the potential to change the way you approach programming. Functor is just a fancy name for an object that encapsulates a function--a "functional object." And while they are certainly not the same thing, if you have ever used function pointers in C or C++, you'll understand the power of functors. In Commons Collections, a functor is one of three kinds of object: a Predicate, a Closure, or a Transformer. Predicates evaluate objects and return a boolean, Transformers evaluate objects and return new objects, and Closures accept objects and execute code. Functors can be combined into composite functors that model loops, logical expressions, and control structures, and functors can also be used to filter and operate upon items in a collection.
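To make the three roles concrete, here is a minimal sketch that uses only the JDK. The Predicate, Transformer, and Closure interfaces are local stand-ins that copy the single-method shape of the Commons Collections types; the class name FunctorKinds, the constants, and the process() and demo() helpers are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FunctorKinds {
    // Local stand-ins that mirror the method shapes of the
    // Commons Collections interfaces; these are NOT the library types.
    interface Predicate   { boolean evaluate(Object input); }
    interface Transformer { Object transform(Object input); }
    interface Closure     { void execute(Object input); }

    // Predicate: answers a yes/no question about an object.
    static final Predicate IS_LONG = new Predicate() {
        public boolean evaluate(Object input) {
            return ((String) input).length() > 3;
        }
    };

    // Transformer: produces a new object from the input.
    static final Transformer TO_UPPER = new Transformer() {
        public Object transform(Object input) {
            return ((String) input).toUpperCase();
        }
    };

    // Keeps the words the Predicate accepts, transforms each one,
    // and hands the result to the Closure.
    static void process(List<String> words, Predicate keep,
                        Transformer change, Closure sink) {
        for (String word : words) {
            if (keep.evaluate(word)) {
                sink.execute(change.transform(word));
            }
        }
    }

    // Runs the pipeline with a Closure that collects what it receives.
    static List<Object> demo() {
        final List<Object> seen = new ArrayList<Object>();
        Closure collect = new Closure() {
            public void execute(Object input) {
                seen.add(input);
            }
        };
        process(Arrays.asList("map", "filter", "fold"),
                IS_LONG, TO_UPPER, collect);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [FILTER, FOLD]
    }
}
```

Note that process() knows nothing about string lengths or upper-casing; swapping in a different Predicate or Transformer changes the behavior without touching the loop.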



Explaining functors in an article as short as this may be impossible, so to "jump start" your introduction to functors, I will solve the same problem both with and without functors. In this example, Student objects from an ArrayList are sorted into two List instances if they meet certain criteria; students with an A, or with a B and perfect attendance, are added to an honorRollStudents list, and students with a D or an F, or who are suspended, are added to a problemStudents list. After the students are separated, the system will iterate through each list, giving the honor-roll students an award and scheduling a meeting with parents of problem students. The following code implements this process without the use of functors:

List allStudents = getAllStudents();

// Create 2 ArrayLists to hold honorRoll students
// and problem students
List honorRollStudents = new ArrayList();
List problemStudents = new ArrayList();

// Iterate through all students.  Put the
// honorRoll students in one List and the
// problem students in another.
Iterator allStudentsIter = allStudents.iterator();
while( allStudentsIter.hasNext() ) {
  Student s = (Student) allStudentsIter.next();

  if( s.getGrade().equals( "A" ) ) {
    honorRollStudents.add( s );
  } else if( s.getGrade().equals( "B" ) && 
             s.getAttendance() == PERFECT) {
    honorRollStudents.add( s );
  } else if( s.getGrade().equals( "D" ) || 
             s.getGrade().equals( "F" ) ) {
    problemStudents.add( s );
  } else if( s.getStatus() == SUSPENDED ) {
    problemStudents.add( s );
  }
}

// For all honorRoll students, add an award and
// save to the Database.
Iterator honorRollIter = 
    honorRollStudents.iterator();
while( honorRollIter.hasNext() ) {
  Student s = (Student) honorRollIter.next();
   
  // Add an award to student record
  s.addAward( "honor roll", 2005 );
  Database.saveStudent( s );
}

// For all problem students, add a note and 
// save to the database.
Iterator problemIter = problemStudents.iterator();
while( problemIter.hasNext() ) {
  Student s = (Student) problemIter.next();

  // Flag student for special attention
  s.addNote( "talk to student", 2005 );
  s.addNote( "meeting with parents", 2005 );
  Database.saveStudent( s );
}

The previous example is very procedural; the only way to figure out what happens to a Student object is to step through each line of code. The first half of this example is decision logic that applies tests to each Student object and classifies students based on performance and attendance. The second half of this example operates on the Student objects and saves the result to the database. A 50-line method body like the previous example is how most systems begin--manageable procedural complexity. But problems start to appear when the requirements start to shift. As soon as that decision logic changes, you will need to start adding more clauses to the logical expressions in the first half of the previous example. For example, what happens to your logical expression if a student is classified as a problem if he has a B and perfect attendance, but attended detention more than five times? Or what happens to the second half, when a student can be on the honor roll only if they were not a problem last year? When exceptions and requirement changes start to affect procedural code, manageable complexity turns into unmaintainable spaghetti code.

Step back from the previous example and consider what that code was doing. It was looking at every object in a List, applying a criterion, and, if that criterion was satisfied, acting upon an object. A critical improvement that could be made to the previous example is the decoupling of the criteria from the code that acts upon an object. The following two code excerpts solve the previous problem in a very different way. First, the criteria for the honor roll and problem students are modeled by two Predicate objects, and the code that acts upon honor roll and problem students is modeled by two Closure objects. These four objects are defined below:

import org.apache.commons.collections.Closure;
import org.apache.commons.collections.Predicate;

// Anonymous Predicate that decides if a student 
// has made the honor roll.
Predicate isHonorRoll = new Predicate() {
  public boolean evaluate(Object object) {
    Student s = (Student) object;

    return( ( s.getGrade().equals( "A" ) ) ||
            ( s.getGrade().equals( "B" ) && 
              s.getAttendance() == PERFECT ) );
  }
};

// Anonymous Predicate that decides if a student
// has a problem.
Predicate isProblem = new Predicate() {
  public boolean evaluate(Object object) {
    Student s = (Student) object;

    return ( ( s.getGrade().equals( "D" ) || 
               s.getGrade().equals( "F" ) ) ||
             s.getStatus() == SUSPENDED );
  }
};

// Anonymous Closure that adds a student to the 
// honor roll
Closure addToHonorRoll = new Closure() {
  public void execute(Object object) {
    Student s = (Student) object;
      
    // Add an award to student record
    s.addAward( "honor roll", 2005 );
    Database.saveStudent( s );
  }
};

// Anonymous Closure that flags a student for attention
Closure flagForAttention = new Closure() {
  public void execute(Object object) {
    Student s = (Student) object;
      
    // Flag student for special attention
    s.addNote( "talk to student", 2005 );
    s.addNote( "meeting with parents", 2005 );
    Database.saveStudent( s );
  }
};

The four anonymous implementations of Predicate and Closure are separated from the system as a whole. flagForAttention has no knowledge of what the criteria are for a problem student, and the isProblem Predicate only knows how to identify a problem student. What is needed is a way to marry the right Predicate with the right Closure, and this is shown in the following example.

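import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.commons.collections.ClosureUtils;
import org.apache.commons.collections.CollectionUtils;

// A LinkedHashMap keeps the entries in insertion order, so
// isHonorRoll is tested before isProblem, as in the
// procedural version.
Map predicateMap = new LinkedHashMap();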

predicateMap.put( isHonorRoll, addToHonorRoll );
predicateMap.put( isProblem, flagForAttention );
predicateMap.put( null, ClosureUtils.nopClosure() );

Closure processStudents = 
    ClosureUtils.switchClosure( predicateMap );

CollectionUtils.forAllDo( allStudents, processStudents );

In the previous code, the predicateMap matches Predicates to Closures; if a Student satisfies the Predicate in the key, it will be passed to the Closure in the value. By supplying a NOPClosure value and a null key, we pass Student objects that satisfy neither Predicate to a "do nothing" or "no operation" NOPClosure created by a call to ClosureUtils.nopClosure(). A SwitchClosure, processStudents, is created from the predicateMap, and the processStudents Closure is applied to every Student object in allStudents using CollectionUtils.forAllDo(). This is a very different approach; notice that you are not iterating through any lists. Instead, you set rules and consequences, and CollectionUtils and SwitchClosure take care of the execution.
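If the dispatch feels like magic, the following JDK-only sketch shows roughly what a switch-style Closure does: it tries each Predicate in turn, runs the Closure paired with the first one that matches, and falls back to a default otherwise. This is an illustration of the idea under stated assumptions, not the library's source. The interfaces are local stand-ins, plain grade strings stand in for Student objects, and switchOn(), forAll(), gradeIs(), and record() are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SwitchSketch {
    // Local stand-ins with the same method shapes as the
    // Commons Collections interfaces; NOT the library types.
    interface Predicate { boolean evaluate(Object input); }
    interface Closure   { void execute(Object input); }

    // Returns a Closure that runs the Closure paired with the first
    // Predicate accepting the input, or the fallback if none does.
    static Closure switchOn(final Map<Predicate, Closure> cases,
                            final Closure fallback) {
        return new Closure() {
            public void execute(Object input) {
                for (Map.Entry<Predicate, Closure> entry : cases.entrySet()) {
                    if (entry.getKey().evaluate(input)) {
                        entry.getValue().execute(input);
                        return;
                    }
                }
                fallback.execute(input);
            }
        };
    }

    // Applies one Closure to every element of a collection.
    static void forAll(Iterable<?> items, Closure closure) {
        for (Object item : items) {
            closure.execute(item);
        }
    }

    // Builds a Predicate that accepts one specific grade string.
    static Predicate gradeIs(final String grade) {
        return new Predicate() {
            public boolean evaluate(Object input) {
                return grade.equals(input);
            }
        };
    }

    // Builds a Closure that appends "<label>:<input>" to the log.
    static Closure record(final String label, final List<String> log) {
        return new Closure() {
            public void execute(Object input) {
                log.add(label + ":" + input);
            }
        };
    }

    // Dispatches three grades; "C" matches nothing and is ignored.
    static List<String> demo() {
        List<String> log = new ArrayList<String>();
        // Insertion-ordered map, so the cases are tried in a known order.
        Map<Predicate, Closure> cases = new LinkedHashMap<Predicate, Closure>();
        cases.put(gradeIs("A"), record("award", log));
        cases.put(gradeIs("F"), record("flag", log));
        Closure doNothing = new Closure() {
            public void execute(Object input) { }
        };
        forAll(Arrays.asList("A", "C", "F"), switchOn(cases, doNothing));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [award:A, flag:F]
    }
}
```

The sketch passes the fallback as a separate argument rather than under a null key; that is a simplification chosen for readability, not a description of the library API.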

When you separate criteria using Predicates and actions using Closures, your code is less procedural and much easier to test. The isHonorRoll Predicate can be unit tested in isolation from the addToHonorRoll Closure, and both can be tested by supplying a mock instance of the Student class. The second example also demonstrates CollectionUtils.forAllDo(), which applies a Closure to every element in a Collection. You may have noticed that using functors did not reduce the line count; in fact, the use of functors increased the line count. But the real benefit from functors is the modularity and encapsulation of criteria and actions. If your method length tends towards hundreds of lines, consider a less procedural, more object-oriented approach--use a functor.
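As a rough illustration of that testability, the sketch below exercises the honor-roll rule against a hand-written stub, with no database and no Closure involved. Everything here is an assumption made for the example: the Predicate interface is a local stand-in, the Student stub exposes only the two getters the rule reads, and the attendance codes (PERFECT and the invented IRREGULAR) are hypothetical int constants.

```java
class HonorRollCheck {
    // Local stand-in for the Commons Collections Predicate interface.
    interface Predicate { boolean evaluate(Object input); }

    // Hypothetical attendance codes; the article only names PERFECT.
    static final int PERFECT = 0;
    static final int IRREGULAR = 1;

    // Hand-written stub exposing only what the Predicate reads.
    static class Student {
        private final String grade;
        private final int attendance;

        Student(String grade, int attendance) {
            this.grade = grade;
            this.attendance = attendance;
        }

        String getGrade() { return grade; }
        int getAttendance() { return attendance; }
    }

    // Same rule as the article's isHonorRoll Predicate:
    // an A, or a B with perfect attendance.
    static final Predicate IS_HONOR_ROLL = new Predicate() {
        public boolean evaluate(Object object) {
            Student s = (Student) object;
            return s.getGrade().equals("A")
                || (s.getGrade().equals("B")
                    && s.getAttendance() == PERFECT);
        }
    };

    // Convenience wrapper so each check reads as one line.
    static boolean honorRoll(String grade, int attendance) {
        return IS_HONOR_ROLL.evaluate(new Student(grade, attendance));
    }

    public static void main(String[] args) {
        System.out.println("A, irregular: " + honorRoll("A", IRREGULAR)); // true
        System.out.println("B, perfect:   " + honorRoll("B", PERFECT));   // true
        System.out.println("B, irregular: " + honorRoll("B", IRREGULAR)); // false
        System.out.println("F, perfect:   " + honorRoll("F", PERFECT));   // false
    }
}
```

In a real project these checks would live in a unit test; the point is only that the rule can be verified without running any of the code that acts on students.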

Chapter 4, "Functors," in the Jakarta Commons Cookbook introduces functors available in Commons Collections, and Chapter 5, "Collections," shows you how to use functors with the Java Collections API. All of the functors--Closure, Predicate, and Transformer--can be combined into composite functors that can be used to model any kind of logic. switch, while, and for structures can be modeled with SwitchClosure, WhileClosure, and ForClosure. Compound logical expressions can be constructed from multiple Predicates using OrPredicate, AndPredicate, AllPredicate, and NonePredicate, among others. Commons BeanUtils also contains functor implementations that are used to apply functors to bean properties--BeanPredicate, BeanComparator, and BeanPropertyValueChangeClosure. Functors are a different way of thinking about low-level application architecture, and they could very well change your approach to coding.
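To give a feel for composite predicates without pulling in the library, here is a small JDK-only sketch that builds compound rules from tiny ones. The or(), and(), and not() helpers are hand-rolled for this illustration and only play the role that classes such as OrPredicate and AndPredicate play in Commons Collections; the Predicate interface is a local stand-in, and the Student stub with its suspended flag is hypothetical.

```java
class CompositePredicates {
    // Local stand-in for the Commons Collections Predicate interface.
    interface Predicate { boolean evaluate(Object input); }

    // Hypothetical stub with just the fields the rules need.
    static class Student {
        final String grade;
        final boolean suspended;

        Student(String grade, boolean suspended) {
            this.grade = grade;
            this.suspended = suspended;
        }
    }

    // True when either operand accepts the input.
    static Predicate or(final Predicate left, final Predicate right) {
        return new Predicate() {
            public boolean evaluate(Object input) {
                return left.evaluate(input) || right.evaluate(input);
            }
        };
    }

    // True when both operands accept the input.
    static Predicate and(final Predicate left, final Predicate right) {
        return new Predicate() {
            public boolean evaluate(Object input) {
                return left.evaluate(input) && right.evaluate(input);
            }
        };
    }

    // Inverts the operand.
    static Predicate not(final Predicate operand) {
        return new Predicate() {
            public boolean evaluate(Object input) {
                return !operand.evaluate(input);
            }
        };
    }

    // Leaf rule: the student has exactly this grade.
    static Predicate gradeIs(final String grade) {
        return new Predicate() {
            public boolean evaluate(Object input) {
                return grade.equals(((Student) input).grade);
            }
        };
    }

    // Leaf rule: the student is suspended.
    static final Predicate IS_SUSPENDED = new Predicate() {
        public boolean evaluate(Object input) {
            return ((Student) input).suspended;
        }
    };

    // Problem student: a D or an F, or suspended.
    static final Predicate IS_PROBLEM =
        or(or(gradeIs("D"), gradeIs("F")), IS_SUSPENDED);

    // A changed requirement: an A counts only if the student
    // is not also a problem student.
    static final Predicate IS_CLEAN_A =
        and(gradeIs("A"), not(IS_PROBLEM));

    public static void main(String[] args) {
        System.out.println(IS_PROBLEM.evaluate(new Student("D", false))); // true
        System.out.println(IS_PROBLEM.evaluate(new Student("C", true)));  // true
        System.out.println(IS_CLEAN_A.evaluate(new Student("A", true)));  // false
    }
}
```

When a requirement shifts, the change is a new composition of existing rules rather than another clause in a growing if/else chain.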

3. Using XPath Syntax to Query Objects and Collections

Commons JXPath is a surprising (non-standard) use of an XML standard. XPath has been around for some time as a way to select a node or node set in an XSL style sheet. If you've worked with XML, you are probably familiar with the syntax /foo/bar that selects the bar sub-elements of the foo document element. Jakarta Commons JXPath adds an interesting twist: you can use JXPath to select objects from beans and collections, among other object types such as servlet contexts and DOM Document objects. Consider a List of Person objects. Each Person object has a bean property of the type Job, and each Job object has a salary property of the type int. Person objects also have a country property, which is a two-letter country code. Using JXPath, it is easy to select all Person objects with a US country and a Job that pays more than one million dollars. Here is some code to set up a List of beans to filter with JXPath:

// Person's constructor sets firstName and country
Person person1 = new Person( "Tim", "US" );
Person person2 = new Person( "John", "US" );
Person person3 = new Person( "Al",  "US" );
Person person4 = new Person( "Tony", "GB" );

// Job's constructor sets name and salary
person1.setJob( new Job( "Developer", 40000 ) );
person2.setJob( new Job( "Senator", 150000 ) );
person3.setJob( new Job( "Comedian", 3400302 ) );
person4.setJob( new Job( "Minister", 2000000 ) );

Person[] personArr = 
  new Person[] { person1, person2, 
                 person3, person4 };

List people = Arrays.asList( personArr );

The people List contains four Person beans: Tim, John, Al, and Tony. Tim is a developer who makes $40,000, John is a Senator who makes $150,000, Al is a comedian who walks home with $3.4 million, and Tony is a prime minister who makes 2 million euros. Our task is simple: iterate over this List and print the name of every Person who is a U.S. citizen making over one million dollars. Given that people is a List of Person objects, take a look at the solution without the benefit of JXPath:

Iterator peopleIter = people.iterator();
while( peopleIter.hasNext() ) {
  Person person = (Person) peopleIter.next();

  // Check each property for null before comparing it
  if( person.getCountry() != null &&
      person.getCountry().equals( "US" ) &&
      person.getJob() != null &&
      person.getJob().getSalary() > 1000000 ) {
    // Only firstName was set by the constructor above
    System.out.println( person.getFirstName() );
  }
}

The previous example is heavy, and somewhat error-prone. To find the matching Person objects, you first need to iterate over each Person and test the country property of each. If the country property is not null and it has the correct value, then you must test the job property to find out if it is non-null and has a salary property greater than 1000000. The line count of the previous example can be dramatically reduced with Java 1.5's for syntax, but, even with Java 1.5, you still need to perform two comparisons at two different levels.
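For comparison, here is a sketch of the same filter written with the Java 1.5 for syntax and generics. The loop is shorter, but the two-level test on country and job salary is still spelled out by hand. The Person and Job classes below are minimal stand-ins inferred from the setup code (the article does not show the real ones), and richUsFirstNames() and samplePeople() are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RichUsPeople {
    // Minimal stand-in for the article's Job bean.
    static class Job {
        private final String name;
        private final int salary;

        Job(String name, int salary) {
            this.name = name;
            this.salary = salary;
        }

        String getName() { return name; }
        int getSalary() { return salary; }
    }

    // Minimal stand-in for the article's Person bean.
    static class Person {
        private final String firstName;
        private final String country;
        private Job job;

        Person(String firstName, String country) {
            this.firstName = firstName;
            this.country = country;
        }

        String getFirstName() { return firstName; }
        String getCountry() { return country; }
        Job getJob() { return job; }
        void setJob(Job job) { this.job = job; }
    }

    // Collects the first names of US people earning over one million.
    static List<String> richUsFirstNames(List<Person> people) {
        List<String> names = new ArrayList<String>();
        for (Person person : people) {
            // "US".equals(...) is also false when the country is null
            if ("US".equals(person.getCountry())
                    && person.getJob() != null
                    && person.getJob().getSalary() > 1000000) {
                names.add(person.getFirstName());
            }
        }
        return names;
    }

    // Rebuilds the four people from the setup code above.
    static List<Person> samplePeople() {
        Person tim = new Person("Tim", "US");
        Person john = new Person("John", "US");
        Person al = new Person("Al", "US");
        Person tony = new Person("Tony", "GB");
        tim.setJob(new Job("Developer", 40000));
        john.setJob(new Job("Senator", 150000));
        al.setJob(new Job("Comedian", 3400302));
        tony.setJob(new Job("Minister", 2000000));
        return Arrays.asList(tim, john, al, tony);
    }

    public static void main(String[] args) {
        System.out.println(richUsFirstNames(samplePeople())); // prints [Al]
    }
}
```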

What if you had to write a number of these queries against a set of Person objects stored in memory? What if your application had to display all of the Person objects in England named Tony? Or, what if you had to print the name of every Job with a salary less than 20,000? If you were storing these objects in a relational database, you could solve this by writing a SQL query, but if you are dealing with objects in memory, you don't have this luxury. While XPath was primarily meant for XML, you could use it to write "queries" against a collection of objects, treating objects as elements and bean properties as sub-elements. Yes, this is a strange application of XPath, but take a look at how the following example performs three different queries against people, a List of Person objects.

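import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

import org.apache.commons.jxpath.JXPathContext;

// Applies an XPath expression to a Collection and returns
// every matching object in a new List.
public List queryCollection(String xpath,
                            Collection col) {
    List results = new ArrayList();

    JXPathContext context = 
        JXPathContext.newContext( col );
 
    Iterator matching = 
        context.iterate( xpath );

    while( matching.hasNext() ) {
        results.add( matching.next() );
    }
    return results;
}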

String query1 =
   ".[@country = 'US']/job[@salary > 1000000]/..";
String query2 =
   ".[@country = 'GB' and @firstName = 'Tony']";
String query3 = 
   "./job/name";

List richUsPeople = 
    queryCollection( query1, people );
List britishTony = 
    queryCollection( query2, people );
List jobNames = 
    queryCollection( query3, people );

The method queryCollection() takes an XPath expression and applies it to a Collection. XPath expressions are evaluated against a JXPathContext, which is created by calling JXPathContext.newContext() and passing in the Collection to be queried. Calling context.iterate() then applies the XPath expression to each item in the Collection, returning an Iterator with every matching "node" (or in this case, "object"). The first query performed by the previous example, query1, is the same query as the original example implemented without JXPath. query2 selects all Person objects with a country property of GB and a firstName property of Tony, and query3 selects a List of String objects, the name property of all of the Job objects.

When I first saw Commons JXPath, it struck me as a bad idea. Why apply XPath expressions to objects? Something about it didn't feel right. But this unexpected use of XPath as a query language for a collection of beans has come in handy for me more than a few times in the past few years. If you find yourself looping through lists to find matching elements, consider using JXPath. For more information, see Chapter 12, "Searching and Filtering," of Jakarta Commons Cookbook, which discusses Commons JXPath and Jakarta Lucene paired with Commons Digester.

And There's More

Stay tuned to this exploration of the far reaches of the Jakarta Commons. In the next part of this series, I'll introduce some related tools and utilities: set operations in Commons Collections, using Predicate objects with collections, configuring an application with Commons Configuration, and using Commons Betwixt to read and write XML. There is much to be gained from the Jakarta Commons that cannot be conveyed in a few thousand words, and I would encourage you to take a look at the Jakarta Commons Cookbook. Many of these utilities may, at first glance, seem somewhat trivial, but the power of Jakarta Commons lies in how these tools can be combined with each other and integrated into your own systems.

Timothy M. O'Brien is a developer and entrepreneur living in Chicago, IL. He spends his days programming in Java, Python, and Ruby.

