The Performance of Java's Lists
Pages: 1, 2, 3
The LinkedList implementation
LinkedList is implemented using a list of
doubly-linked nodes. To access elements by index you need to traverse
all the nodes until the indexed node is reached:
public Object get(int index) {
//check the index is valid first .. code not shown here
Entry e = header; //starting node
//go forwards or backwards depending on which is closer
if (index < size/2) {
for (int i = 0; i <= index; i++)
e = e.next;
} else {
for (int i = size; i > index; i--)
e = e.previous;
}
return e;
}
Inserting elements into the list is straightforward: traverse to the node at the index and insert a node immediately before that node:
public void add(int index, Object element) {
//check the index is valid first .. code not shown here
Entry e = header; //starting node
//go forwards or backwards depending on which is closer
if (index < size/2) {
for (int i = 0; i <= index; i++)
e = e.next;
} else {
for (int i = size; i > index; i--)
e = e.previous;
}
Entry newEntry = new Entry(element, e, e.previous);
newEntry.previous.next = newEntry;
newEntry.next.previous = newEntry;
size++;
}
|
Thread-safe LinkedLists and other collections If you need a thread-safe LinkedList from the Java
SDK, you can obtain one using a synchronized wrapper from the
This means that a synchronized wrapped LinkedList is
at a significant performance disadvantage to Vector, as Vector does
not require any extra indirection for thread-safety. If you require a
thread-safe LinkedList, you will obtain a faster implementation if you
copy the LinkedList class and make the required methods
synchronized. This is also true of all the other collection classes:
only |
The LinkedList implementation has a performance
overhead for indexed access and update of elements, since access to
any index requires you to traverse multiple nodes. Adding elements to
the collection suffers from the index traversal access performance
drawback and, in addition, has a further overhead in requiring the
creation of a node object. On the plus side, there is no further
overhead to insertions and deletions, so insertions-deletion overhead
is really mostly dependent on how far away the insertion-deletion
index is from the ends of the collection.
Performance tests
There are many different functions of the classes that could be
tested. LinkedLists are frequently used because of their
supposed better performance for random index insertion and deletion,
so I decided to focus on insertion performance, i.e. building
collections. I've tested LinkedList against
ArrayList since both are unsynchronized (see the
"Thread-safe LinkedLists and other collections"
sidebar).
The insertion speed is critically dependent on the size of the collection and the position where the element is to be inserted. All the best and worst performance cases arise when inserting either at one of the ends or at the exact middle point of the collection. Consequently, I've chosen three insertion locations (start, middle, end of the collection), and three representative collection sizes of medium (100 elements), large (10,000 elements), and very large (1,000,000 elements).
For my tests I used the Sun JVMs from the Java SDK 1.2.0 and 1.3.0 release. In addition I also tested with HotSpot JVM 2.0, the version available with the 1.3.0 SDK. All measured times in each table have been normalized to one of the SDK 1.2 VM tests (the cell showing 100% in each table). The default JVM configuraton with JIT enabled was used, though heap space needed to be increased for all JVMs to avoid out-of-memory errors. Times recorded were the averages over several test runs. In order to avoid garbage collection interference, I forced a full scavenge of all memory between each test (see the source code for details). The disk was monitored to ensure that disk paging did not occur (any test runs which showed significant paging were discarded). Each test that was too slow to give a multiple second response time was looped repeatedly until a significantly long enough time was recorded.
Table 1: Building a medium sized collection (100 elements). Figures in brackets are for pre-sized collections |
|||
|
1.2 JVM |
1.3 JVM |
HotSpot 2.0 JVM |
Always inserting at the start of the ArrayList |
100% (48.0%) |
184.9% (152.0%) |
108.0% (66.7%) |
Always inserting at the start of the LinkedList |
135.5% |
109.1% |
85.3% |
Always inserting at the mid-point of the ArrayList |
130.0% (40.6%) |
187.4% (158.0%) |
84.7% (46.0%) |
Always inserting at the mid-point of the LinkedList |
174.0% |
135.0% |
102.3% |
Always inserting at the end of the ArrayList |
63.3% (20.7%) |
65.9% (25.0%) |
60.3% (29.3%) |
Always inserting at the end of the LinkedList |
106.7% |
86.3% |
80.3% |
For short collections, ArrayList and
LinkedList are pretty close in
performance. ArrayLists have the edge when inserting to
the end of the collection, i.e. appending elements. But then appending
elements is the single operation that ArrayList is most
optimized for: if you just want a statically sized collection, a Java
array (e.g. Object[]) gives better performance than any
collection object. Beyond the append operation, measured timings are
very mixed and reflect various JVM optimizations more than anything
else.
For example, for insertions to the start of the collection (first
two rows of table 1), the HotSpot 2.0 JVM with
LinkedLists giving the fastest time (85.3%), and the
second fastest time is the 1.2 JVM with ArrayList
(100%). These two results illustrate that the simple JIT compiler in
1.2 is very efficient at doing simple things like iterating over
and copying arrays. The more sophisticated JVM with optimizing
compiler in HotSpot is able to improve more complex activities such as
object creation (of LinkedList nodes) and take advantage
of code-inlining. The 1.3 JVM results seem to indicate a severe
underperformance in simple operations which is likely to be improved
in later JVM versions.
What I have only partially measured here are other advantages that
ArrayList has over LinkedList, namely the
ability to pre-size collections, and the reduced garbage collection
overhead of ArrayLists. Specifically,
ArrayLists can be created with a particular size (e.g. in
this test the ArrayList could be created with a capacity
of 100 elements), thus avoiding all the growth overheads. The figures
in brackets in table 1 show the improvements possible with pre-sizing
collections. LinkedLists (up to SDK 1.3) cannot be
pre-sized.