| Sign In/My Account | View Cart |
Querying a collection in Java is a frequently executed task in most applications. If you find that a particular query is a bottleneck in your application, how can you optimize that query to make it faster? In this article, I'll provide an example of how you might go about speeding up this common type of congestion. We'll run through a tuning exercise to improve the performance of a query on a list, using several standard tuning techniques discussed in my book, Java Performance Tuning.
And the query might be "how many strings in the list contain the substrings 'od' or 'lo.'" (The answer to this particular query would be '3' for this example list.)
For my actual collection, I'll generate multicharacter strings using the lowercase characters of the Latin alphabet ('a' to 'z'). For example, a collection of all four-character strings generated in this way would produce a collection of 26 x 26 x 26 x 26 = 456976 four-character strings. I'll simply query this collection for the count of strings that contain any of the substrings "ie" or "xy" or "pq." I've elected to use a Vector object to hold the collection for the start of the tests.
I've used an easily generated collection for the data, and also a straightforward query, to avoid any application-specific distractions. I want to focus on the tuning here. The query is representative of many types of queries I've seen in applications.
The simple, straightforward version of the query is:
int count = 0;
for(int i = 0; i < collection.size(); i++)
{
if( ( ((String) collection.get(i)).indexOf("ie") != -1 )
| ( ((String) collection.get(i)).indexOf("xy") != -1 )
| ( ((String) collection.get(i)).indexOf("pq") != -1 ) )
count++;
}
return count;
Several items immediately leap out at me. The first is the inappropriate use of the normal boolean-OR operator, (|), rather than the short-circuit boolean-OR operator, (||); the second is the unnecessarily repeated method call in the loop test, (collection.size()); and the third is the repeated String cast in the query. All of these are standard targets for optimization in loops (see Chapter 7, "Loops," in Java Performance Tuning).
However, just because these are standard optimizations, doesn't mean we should apply them all immediately without testing their effects. So, let's test them. For my tests, I'll use the Sun VMs from the Java SDK 1.2 and 1.3. In addition, I'll also test with the HotSpot VMs delivered with these two SDKs--HotSpot versions 1.0 and 2.0. I also like to test one non-JIT VM, which in this case will be the 1.2 VM with the JIT turned off. This is easily done by setting the "java.compiler" property to "NONE," (using java "-Djava.compiler=NONE" ..., for example).
|
Related Reading
|
First, we'll replace the boolean-OR operator. This simple optimization replaces | with ||. For example:
if( ( ((String) collection.get(i)).indexOf("ie") != -1 )
| ( ((String) collection.get(i)).indexOf("xy") != -1 )
| ( ((String) collection.get(i)).indexOf("pq") != -1 ) )
becomes
if( ( ((String) collection.get(i)).indexOf("ie") != -1 )
|| ( ((String) collection.get(i)).indexOf("xy") != -1 )
|| ( ((String) collection.get(i)).indexOf("pq") != -1 ) )
The short-circuit booleans speed up the test slightly in almost every case. (See test2 under Table and Listings in this article.)
for(int i = 0; i < collection.size(); i++)
with
int max = collection.size();
for(int i = 0; i < max; i++)
Amazingly, while most of the VMs gain one or two percent in speed from this change, eliminating the repeated method call actually makes the 1.2 JIT VM run 40 percent slower! See test3 under Table and Listings. (While I can't imagine what makes the change so disadvantaged, I suspect it's some inefficient artifact of the native-code generation on my processor, such as an incorrect integer alignment. This likelihood is confirmed by a further test, which combines the two optimizations used so far (see test4 under Table and Listings).
This test gives the best speed so far for most of the VMs. It also registers only a slight overall decrease in performance for 1.2 VM, even though the short-circuit operator optimization could not have directly eliminated the 40-percent decrease in performance from the size() call replacement.
for(int i = 0; i < max; i++)
{
if( ( ((String) collection.get(i)).indexOf("ie") != -1 )
|| ( ((String) collection.get(i)).indexOf("xy") != -1 )
|| ( ((String) collection.get(i)).indexOf("pq") != -1 ) )
becomes
String s;
for(int i = 0; i < max; i++)
{
if( ( (s = (String) collection.get(i)).indexOf("ie") != -1 )
|| ( s.indexOf("xy") != -1 )
|| ( s.indexOf("pq") != -1 ) )
All the VMs have a significant increase in speed with this improvement. I've included the results of testing with all the optimizations together in test5 and also, without the size() call elimination, in test6 in the Table and Listings.
Test5 shows all the VMs are now significantly faster than their starting test measurments. Only the 1.2 VM records a faster time for test6.
The results of testing the optimizations so far, together with the change to an ArrayList collection object are listed in test7 under Table and Listings. I've also tested without the size() call elimination in test8. Test7 is now the fastest for all VMs, including the 1.2 JIT VM. For the 1.2 VM, the accumulated changes could have resulted in avoiding the inefficient native-code generation problem.
For collection queries, avoiding method accessors is often simple to achieve by implementing the query in the collection class. For the example here we could implement our own java.util.List class, and implement the query in that class, thus gaining access to the internal collection. There is, however, a quicker method. Because Vector is implemented with its internal element collection defined as protected, we can subclass Vector to gain access to the internal element collection, and implement the query in our subclass, like this:
class TestList
extends Vector
{
public int customQuery()
{
int count = 0;
String s;
for(int i = 0; i < elementCount; i++)
{
if( ( (s = (String) elementData[i]).indexOf("ie") != -1 )
|| ( s.indexOf("xy") != -1 )
|| ( s.indexOf("pq") != -1 ) )
count++;
}
return count;
}
}
The equivalent of the original test is now just:
return collection.customQuery();
The results of this test are shown in test9 of Table and Listings. All the VMs show that this is now the fastest test (except the first run of HotSpot 1.0).
Interestingly, the 1.2 VM is the VM that could be made to go the fastest after all the manual optimizations were applied. (test9 shows the 1.2 VM running at under 40 percent, while no other VM achieved even the 45-percent mark. All timings are on the same absolute scale.) I have found that, in general, when no optimizations have been made to an application, optimizing VMs (like the 1.3 VM and the HotSpot VMs) frequently provide better performance.
It is much easier and more cost effective to simply use an optimizing VM, rather than to spend time optimizing. But where extensive manual optimizations can be applied, a plain JIT VM can eventually be made to run faster than that of optimizing VMs. This is actually not surprising, since no optimizing VM is likely to be capable of optimizing as intelligently as a person.
Jack Shirazi is an independent consultant. He was an early adopter of Java, and for the last few years has consulted mainly for the financial sector, focusing on Java performance. Before using Java, Jack spent many years tuning Smalltalk applications. Jack's early career involved research in theoretical physics and bioinformatics. Jack has publications in the field of protein structure and is proud to have contributed to some of the core Perl5 modules.
O'Reilly & Associates will soon release (September 2000) Java Performance Tuning.