Modern applications typically require domain searching functionality: the ability to search for data within the context of the application domain. For instance, an application for tracking financial transactions might need to be able to search for all transactions over a certain cash value; a supply-chain management application might need to be able to search for all requisitions for a particular supplier. This capability could be for ad hoc searching or for the generation of reports. Since the underlying information is typically stored in a relational database, at some point the search needs to be converted from the language of the domain to the query language supported by the database (typically SQL). Though this is a variation on the well-known problem of object-relational (O-R) mapping, this particular aspect of the problem has generated less interest than the more fundamental problem of defining the relationship between domain objects and database tables.
As a consequence, I have seen many applications where the benefits of a carefully designed O-R mapping layer have been negated by ill-conceived search functionality that couples the domain objects tightly with the database. The objective of this article is to show how careful design can provide a flexible solution that is easy to maintain and adapt to different O-R mappings or native SQL. The solution I present has two main ingredients: a collection of classes that capture the information to be used for searching, and an implementation of the Visitor pattern that translates that information into a search expressed in the language of the underlying persistence service.
In this article, I use a running example to illustrate the ideas I am presenting. The example is a simple database that is used as part of a backup application to record which files are located in a specific backup volume.
This article is organized as follows. The next section describes some of the forces that influence the design. An overview of the design follows, and then a description of the framework used in the design. The two sections after that show two different implementations of the design. Finally, the strengths and weaknesses of the design are considered.
The source code accompanying this article is available in the Resources section below. Note that for brevity I have inlined some variables and methods in this article, compared to the actual source code.
When considering the problem of domain searching, a number of different forces apply and should be considered as parameters constraining potential solutions.
The first constraint is that the solution should support loose coupling between domain objects and the database. A couple of years ago this would have required no further justification, but many of the exponents of lightweight approaches to application design have argued eloquently that loose coupling is a symptom of over-engineering, so some explanation is necessary.
The objective of loose coupling in this context is to ensure clean separation between the problem domain layer and the data management layer. This separation is critical if each layer is to have well-defined responsibilities. In fact, the emerging trend towards transparent persistence makes this separation all the more important.
Any solution should also be maintainable; modifying or extending domain search functionality should not lead to wholesale changes in the application design. Loose coupling is normally necessary for maintainability, but not sufficient.
Other forces could also apply depending on the specific application, such as performance, substitutability (should a number of different databases or persistence services be supported?), and so on. However, I consider loose coupling and maintainability to be common to all solutions except perhaps the worst kind of hack thrown together on short notice (the kind we all have worked on but never admit to!).
The domain searching design I am going to present assumes a standard layered approach with a presentation layer, a problem domain layer, and a data management layer. The example also uses a client layer consisting of JavaServer Pages (JSPs) served to a user in a browser. To keep the example simple, I have not used a web application framework such as Struts, though there would be no problem using this design within a Struts application.
The example therefore structures the design as shown in Figure 1. The important part of the design is the third step. This is also the only step that is dependent on the underlying persistence service.

Figure 1: Solution structure
In order to help understand the example, the structure of the database used is shown in Figure 2. A volume contains a number of files, and a number of keywords may also be associated with a volume to allow for keyword-based searching. The information persisted has deliberately been kept as simple as possible to ensure that the clarity of the design is preserved.

Figure 2: Database tables
In order to demonstrate the flexibility of the design, I will present two implementations, one using SQL and one using Hibernate. However, before these implementations can be described, the overall solution framework needs to be outlined.
When designing domain searching, it is important to have a clear specification of what is to be searched, and what is to be presented as the result of the search. This might seem self-evident, but in my experience it can lead to significant discussion, especially if the users of the application fall into different constituencies.
Assuming that such a specification is in place, a number of search criterion classes should be created, which will contain the user-provided search data. In the backup database example, users can search using volume names, file names, a file's last modification date, and keywords. The result will show the volume name, file name, file size, and last modification date for each file matching the submitted search criteria. A search may also be submitted with none of these criteria, in which case the entire contents of the database are returned, subject to whatever page-size constraint the presentation tier specifies.
This leads to the class diagram shown in Figure 3.

Figure 3: Search criterion classes
ISearchCriterion is an interface implemented by all
search criterion classes. The methods defined by this interface are
described in the table below. For this article, the most important
method is the accept method, which provides the entry
point for visitors. This is a standard implementation of the
Visitor pattern.
| Method | Description |
| --- | --- |
| String getName() | The name of the criterion. This is used to identify the criterion uniquely within the program. |
| String getDisplay() | The display name of the criterion. This is the name used by clients to present the criterion to users. |
| void setValue(String value) throws ParseException | This method is used to populate the criterion. The string value is that submitted by the user; it is parsed by this method and the object stores the result of the parse. |
| void accept(ISearchVisitor visitor) throws SearchException | The entry point for visitor objects, for this criterion. |
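To make the table concrete, here is a minimal sketch of the interface and one criterion class. It is an illustration rather than the code from the accompanying source: the class name SearchLastModified, the visitLastModified method, the yyyy-MM-dd input format, and the reduced ISearchVisitor and SearchException declarations are all assumptions made so that the example is self-contained.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Assumed minimal exception type; the real one is in the source code.
class SearchException extends Exception {
    SearchException(String message) { super(message); }
}

// Reduced visitor interface: only the one method this sketch needs.
interface ISearchVisitor {
    void visitLastModified(SearchLastModified criterion) throws SearchException;
}

// The four methods listed in the table above.
interface ISearchCriterion {
    String getName();
    String getDisplay();
    void setValue(String value) throws ParseException;
    void accept(ISearchVisitor visitor) throws SearchException;
}

// Hypothetical criterion holding a file's last modification date.
class SearchLastModified implements ISearchCriterion {
    private Date date;

    public String getName() { return "lastModified"; }
    public String getDisplay() { return "Last modified"; }

    // Parses the raw user input and stores the result of the parse.
    public void setValue(String value) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setLenient(false); // reject impossible dates such as 2004-02-31
        this.date = format.parse(value);
    }

    public Date getDate() { return date; }

    // Double dispatch: the criterion picks the visitor method for its own type.
    public void accept(ISearchVisitor visitor) throws SearchException {
        visitor.visitLastModified(this);
    }
}
```

Because parsing happens in setValue, a visitor never sees raw user input; it only reads values that have already been validated.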
For domain searching, the Visitor pattern provides the perfect
means to traverse the search criterion classes. The interface
ISearchVisitor defines methods for visiting each kind
of search criterion. This allows the submitted search criteria to
be traversed and a query built; the search can then be executed and
a result delivered. The visitor hierarchy for this example is shown
in Figure 4.

Figure 4: Visitor classes
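The traversal described above can be sketched as follows. This is a simplified illustration, not the article's source: only two kinds of criterion are shown, SearchException is omitted for brevity, and the names SearchVolumeName, getVolumeName, and DescribingVisitor are hypothetical. A real visitor would accumulate query fragments where this one records plain-text descriptions.

```java
import java.util.ArrayList;
import java.util.List;

// One visit method per kind of criterion (reduced to two kinds here).
interface ISearchVisitor {
    void visitVolumeName(SearchVolumeName criterion);
    void visitKeyword(SearchKeyword criterion);
}

// Reduced criterion interface: only the visitor entry point.
interface ISearchCriterion {
    void accept(ISearchVisitor visitor);
}

class SearchVolumeName implements ISearchCriterion {
    private final String name;
    SearchVolumeName(String name) { this.name = name; }
    String getVolumeName() { return name; }
    public void accept(ISearchVisitor visitor) { visitor.visitVolumeName(this); }
}

class SearchKeyword implements ISearchCriterion {
    private final String keyword;
    SearchKeyword(String keyword) { this.keyword = keyword; }
    String getKeyword() { return keyword; }
    public void accept(ISearchVisitor visitor) { visitor.visitKeyword(this); }
}

// Hypothetical visitor that records a description of each criterion visited.
class DescribingVisitor implements ISearchVisitor {
    private final List<String> descriptions = new ArrayList<String>();

    public void visitVolumeName(SearchVolumeName criterion) {
        descriptions.add("volume name contains " + criterion.getVolumeName());
    }

    public void visitKeyword(SearchKeyword criterion) {
        descriptions.add("keyword matches " + criterion.getKeyword());
    }

    // Traverses the criteria; each criterion dispatches to its own visit method.
    List<String> describe(List<ISearchCriterion> criteria) {
        for (ISearchCriterion criterion : criteria) {
            criterion.accept(this);
        }
        return descriptions;
    }
}
```

The traversal loop never inspects the type of a criterion, which is what allows a different visitor implementation to be substituted without touching the criterion classes.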
The following sections describe the SQL and Hibernate visitors in more detail.
The SQL visitor traverses the collection of search criterion
objects and constructs a prepared statement. The various visit
methods accumulate the tables to include in the search and the
constraints on the search. The doSearch method visits all of the
search criteria and combines the information yielded by the visitor
methods into a prepared statement, which is then executed. The
result set is traversed and a search result created. The following
section shows how the visitor methods accumulate information from
the search criteria. After that, query execution and result handling
are explained.
The visitor methods accumulate information in four instance variables; the prepared statement to be built will have the form:
SELECT <selects> FROM <tables> WHERE <constraints>
The SQL visitor uses collections to accumulate the values used
to populate this statement. The collection used for
<selects> is a list, since the order is
significant; the order provides the basis for interpretation of the
result set. Order is not significant for
<tables>, so here a set is used. In the case of
constraints, in order to be able to match actual values to the
placeholders created in the prepared statement, two lists are used.
The first contains the actual string values to be conjoined in the
prepared statement; the second contains the values to be used to
populate the placeholders.
This leads to the following instance variable declarations:
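public class SQLSearchVisitor implements ISearchVisitor {
    private List selects = new ArrayList();    // columns, in result order
    private Set tables = new HashSet();        // tables; order is irrelevant
    private List criteria = new ArrayList();   // constraint text with placeholders
    private List parameters = new ArrayList(); // values for the placeholders
    ...
}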
Having previously defined what the query should yield, it is
possible to add to the collections those values, such as the names of
the tables and columns, that need to be included for all searches.
This is done by the method addFixedInformation:
private void addFixedInformation() {
    selects.add("VOLUME_TB.NAME");
    selects.add("FILE_TB.NAME");
    selects.add("FILE_TB.SIZE");
    selects.add("FILE_TB.LASTMODIFIED");
    tables.add("VOLUME_TB");
    tables.add("FILE_TB");
    criteria.add("FILE_TB.VOLUMEFK = VOLUME_TB.ID");
}
Each visitor method then adds information to the instance
variables. For instance, consider visitVolumeName from
the class SQLSearchVisitor. The visitor method adds
the criterion that any matched volumes must include the submitted
volume name as a substring. visitKeyword provides a
more interesting example as in this case, extra tables have to be
added.
public void visitKeyword(SearchKeyword keyword) {
    tables.add("KEYWORD_TB");
    tables.add("KEYWORD_VOLUME_REL");
    addCriterion("KEYWORD_TB.KEYWORD LIKE ?",
            keyword.getKeyword());
}
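The addCriterion helper called by visitKeyword is not shown in this article. A plausible sketch, assuming it simply appends the SQL fragment and its value to the two constraint lists in step, is given below. The class name ConstraintAccumulator, the VOLUME_TB.NAME LIKE ? fragment, and the wrapping of the value in % wildcards are assumptions for illustration; the actual implementation is in the accompanying source code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the constraint-handling part of SQLSearchVisitor.
class ConstraintAccumulator {
    // SQL fragments that will be joined with AND in the WHERE clause.
    private final List<String> criteria = new ArrayList<String>();
    // Values for the "?" placeholders, in the same order as the fragments.
    private final List<Object> parameters = new ArrayList<Object>();

    // Adds a fragment containing exactly one placeholder, plus its value,
    // so that the two lists always stay aligned.
    void addCriterion(String sqlFragment, Object value) {
        criteria.add(sqlFragment);
        parameters.add(value);
    }

    // Assumed shape of visitVolumeName: a substring match on the volume name.
    void visitVolumeName(String volumeName) {
        addCriterion("VOLUME_TB.NAME LIKE ?", "%" + volumeName + "%");
    }

    List<String> getCriteria() { return criteria; }
    List<Object> getParameters() { return parameters; }
}
```

Routing every parameterized constraint through one helper keeps the n-th placeholder paired with the n-th parameter. Constraints without placeholders, such as join conditions, can be added to the criteria list directly.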
In this case, it is also necessary to constrain the tables so
that the relationship between the keyword and volume represented by
KEYWORD_VOLUME_REL is used. This is consigned to a
separate method to avoid repetition in the criterion list:
private void addJoins() {
    if (tables.contains("KEYWORD_VOLUME_REL")) {
        // Must also contain KEYWORD_TB and VOLUME_TB
        criteria.add("KEYWORD_VOLUME_REL.KEYWORDFK = KEYWORD_TB.ID");
        criteria.add("KEYWORD_VOLUME_REL.VOLUMEFK = VOLUME_TB.ID");
    }
}
The remaining visitor methods can be seen in the source code.
These methods are glued together by the method
buildQuery, from SQLSearchVisitor, which
ensures that the instance variables are populated. Having populated
these instance variables, it is possible to create and execute a
prepared statement.
Query construction falls into two phases: creating the SQL text,
and then inserting the parameters into the prepared statement,
corresponding to the placeholders in the text. This is captured by
getPreparedStatement, from
SQLSearchVisitor, which uses the instance variable
conn, which is an instance of
java.sql.Connection.
Creating the SQL statement is just a question of iterating over
the collections and concatenating the values to form a string. A
helper method, addItems from
SQLSearchVisitor, is used to exploit the similarity in
the iterations. For example, if a search is submitted for files
matching the pattern *.jpg, this would result in the
following SQL statement:
SELECT VOLUME_TB.NAME, FILE_TB.NAME, FILE_TB.SIZE,
FILE_TB.LASTMODIFIED FROM FILE_TB, VOLUME_TB WHERE
FILE_TB.VOLUMEFK = VOLUME_TB.ID AND FILE_TB.NAME LIKE ?
And the parameters list would contain the single value
"%.jpg%".
Once the prepared statement has been created, it can be
executed. This yields a ResultSet object that can be
traversed and a SearchResult constructed. This
SearchResult object is returned as the result of the
search to the presentation tier for rendering in a suitable form.
Details can be found in the source code provided.
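The two phases of query construction described above, assembling the SQL text and then binding the parameters, can be sketched as follows. This is an illustration under assumptions, not the code from the accompanying source: the class name SqlTextBuilder, the signature given to addItems, and the buildSql and bind methods are hypothetical.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

class SqlTextBuilder {
    // Appends the items to the buffer, separated by the given separator.
    static void addItems(StringBuilder sql, Collection<String> items, String separator) {
        for (Iterator<String> it = items.iterator(); it.hasNext();) {
            sql.append(it.next());
            if (it.hasNext()) {
                sql.append(separator);
            }
        }
    }

    // Phase one: build SELECT <selects> FROM <tables> WHERE <constraints>.
    static String buildSql(List<String> selects, Collection<String> tables,
                           List<String> criteria) {
        StringBuilder sql = new StringBuilder("SELECT ");
        addItems(sql, selects, ", ");
        sql.append(" FROM ");
        addItems(sql, tables, ", ");
        if (!criteria.isEmpty()) {
            sql.append(" WHERE ");
            addItems(sql, criteria, " AND ");
        }
        return sql.toString();
    }

    // Phase two: bind each value to its placeholder; JDBC indices start at 1.
    static void bind(PreparedStatement statement, List<Object> parameters)
            throws SQLException {
        for (int i = 0; i < parameters.size(); i++) {
            statement.setObject(i + 1, parameters.get(i));
        }
    }
}
```

With a java.sql.Connection in hand, the text returned by buildSql would be passed to conn.prepareStatement and the resulting statement to bind before execution. Because the tables are held in a set, their order in the FROM clause depends on the set implementation; an insertion-ordered set makes the generated text predictable.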
Hibernate is an object-relational persistence and query service for Java. It supports transparent persistence by allowing persistent data to be defined as plain old Java objects (POJOs). Runtime configuration is then used to map these objects to database tables, giving a clean separation between the domain object model and persistence management. Extensive literature about Hibernate is available elsewhere (see the Resources section), so I don't propose to provide a thorough introduction here.
The domain object model for the example is straightforward and is shown in Figure 5. I will briefly describe the mapping of one domain object to the database in order to give a flavor of how this is done.

Figure 5: Application domain object model
The domain class File represents the information
stored about a backed-up file by the application. It contains a
number of instance variables (listed in the table below) with
associated getter and setter methods.
| Name | Description |
| --- | --- |
| id : Integer | A unique identifier for the file. |
| lastModified : Date | The date of the last modification of the file prior to backup. |
| name : String | The name of the file. |
| size : Integer | The size of the file in bytes. |
| volume : Volume | The volume used to back up the file. If this File object is returned as the result of a search, this will be null. |
| volumeName : String | The name of the volume. This is only used if the volume is null. |
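A plain Java class matching the table above might look like the following sketch. The getter and setter names follow JavaBeans conventions and are assumed, the Volume class is reduced to a name, and the fallback logic in getVolumeName is one reading of the description rather than confirmed behavior of the accompanying source.

```java
import java.util.Date;

// Reduced stand-in for the Volume domain class.
class Volume {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Sketch of the File domain class: a POJO with no persistence code in it.
class File {
    private Integer id;
    private Date lastModified;
    private String name;
    private Integer size;
    private Volume volume;     // null when the object is a search result
    private String volumeName; // used only when volume is null

    public Integer getId() { return id; }
    public void setId(Integer id) { this.id = id; }
    public Date getLastModified() { return lastModified; }
    public void setLastModified(Date lastModified) { this.lastModified = lastModified; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Integer getSize() { return size; }
    public void setSize(Integer size) { this.size = size; }
    public Volume getVolume() { return volume; }
    public void setVolume(Volume volume) { this.volume = volume; }
    public void setVolumeName(String volumeName) { this.volumeName = volumeName; }

    // Assumed behavior: prefer the attached volume's name when there is one.
    public String getVolumeName() {
        return volume != null ? volume.getName() : volumeName;
    }
}
```

Nothing in the class refers to tables or columns; that knowledge lives entirely in the mapping.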
Instances of this class correspond to rows in the
FILE_TB table shown in Figure 2; the
VOLUMEFK column in this table is used to provide the
volume and volumeName instance variables
as needed.
The mapping between the File class and the
FILE_TB table is provided by File.hbm.xml.
Similar mapping files for the other domain classes can be defined.
These can be seen in the accompanying source code.
Having established the object-relational mapping, Hibernate offers a number of different methods of querying. It is possible to perform a query using the underlying JDBC connection, in a similar manner to the SQL search visitor described previously. However, this approach bypasses the persistence service and is provided only if the persistence service does not support the desired behavior, which is not the case here. A second method provided by Hibernate is Hibernate Query Language, which allows SQL-style queries to be phrased in the language of domain objects. This is a very powerful approach, and for complex queries is the recommended strategy. The third approach is to use Hibernate criteria, which are objects that constrain properties of the domain object model. For simple searches, such as those needed for the example, this approach is ideal and is therefore the one I have used.
The query to be executed by Hibernate is specified using a
Criteria object. This object is built up by the
visitor as it traverses the supplied search criteria. This object
is therefore implemented as an instance variable by the visitor and
is initialized in the constructor from a supplied Hibernate
Session object:
public class HibernateSearchVisitor implements ISearchVisitor {
    private Criteria criteria;

    public HibernateSearchVisitor(Session session)
            throws SearchException {
        this.criteria = session.createCriteria(File.class);
    }
    ...
}
The use of File.class here tells Hibernate that
instances of File are to be yielded by the search.
Each visitor method then adds to the criteria object. Dependent domain
objects are tied in by creating child criteria. For example, the
method visitVolumeName from
HibernateSearchVisitor constrains the name of the
dependent Volume object by creating a child criteria
object for the volume property of File and constraining
it.
The other visitor methods constrain the properties of
File in a similar manner. The overall search is
performed by the doSearch method from the class
HibernateSearchVisitor, which builds the query using the
buildQuery method. The actual query is performed by
the list method of the criteria object, and the result
is then used to populate a SearchResult object.
The contrast with the SQL visitor illustrates the simplicity and
elegance of Hibernate.
In the preceding sections, I have described a design for domain searching that is based around defining domain objects to represent the search, and then using the Visitor pattern to build the actual search to be performed. I have shown two specific implementations of the visitor to illustrate the approach.
The main strengths of the design are:
Note that this last point is not just a question of supporting multiple databases (in my experience, switching databases during a project is rare). It might be that a product is being developed and the flexibility to support other persistence approaches in the future needs to be built in now. Using the visitor design provides this flexibility at very little cost. Moreover, the application might in the future need to search an external information system, say, using web services. This design makes support for such functionality straightforward to incorporate.
The main weakness of the design is essentially the one inherited from the use of the Visitor pattern: if the structure of the domain objects changes frequently, then all of the visitor implementations need to be updated to reflect these changes. However, in my experience domain objects change infrequently compared to the pace of change of other parts of an application.
Whatever your approach to persistence, next time you have to implement domain searching, I hope this article has added some weapons to your armory.
Paul Mukherjee works as a technical architect for Systematic Software Engineering Limited in Britain, and is a Sun Certified Enterprise Architect.
Copyright © 2007 O'Reilly Media, Inc.