ONJava.com -- The Independent Source for Enterprise Java
oreilly.comSafari Books Online.Conferences.


AddThis Social Bookmark Button

The WeakHashMap Class
Pages: 1, 2

Sparse Arrays

When you have a large array where most of the elements are not used, this is often referred to as a sparse array. For example, perhaps only elements at index 232 and 9986 of a 10,000 element array are non-zero, and all other elements are zero:

int[] arr = new int[10000];
arr[232] = 18;
arr[9986] = 17;

Using a full 10,000-element array makes the lookup very fast, but wastes a huge amount of space. Efficient memory implementations for sparse arrays use structures to hold only the non-zero elements. A hash table with integer keys provides one such implementation.

However, using WeakHashMap to implement a cache has a real problem with lack of control. Most builders of caches usually find that they want to control how and when objects are removed from the cache. For example, a least-recently-used algorithm tries to retain elements in the cache that are being used the most often and removes those elements that have not been used recently. But the elements of a WeakHashMap cache could be cleared at any time by the JVM, and there is no selectivity. Normally all the elements which can be cleared, are cleared in one go. This takes control away from the cache-builder. A more useful cache would be signalled by the JVM that memory was needed and would then be allowed to select which elements to clear. It is possible to take more control over cache clearing with your own implementation directly using WeakReferences, but that wouldn't use a WeakHashMap, so I'll leave that discussion to another article.

How WeakHashMap works

You don't need to know the internals of WeakHashMaps to use them. But the implementation is interesting, and studying the it can make you aware of some of the curious performance consequences of working with WeakHashMaps.

The keys in a WeakHashMap are WeakReference objects. The object passed as the key to a WeakHashMap is stored as the referent of the WeakReference object, and the value is the standard Map value. (The object returned by calling Reference.get() is termed the referent of the Reference object.) A comparison with HashMap can help:


Map h = new HashMap();

Object key = new Object;
h.put(key, "xyz");

key = null;


Map h = new WeakHashMap();

Object key = new Object;
h.put(key, "xyz");

key = null;

Conceptually, this is similar to inserting a line before the put() call like this:

key = new WeakReferenkey(key);

The key is referenced directly by the HashMap.

The key is not referenced directly by the WeakHashMap. Instead, a WeakReference object is referenced directly by the WeakHashMap, and the key is referenced weakly from the WeakReference object.

The value is referenced directly by the HashMap.

The value is referenced directly by the HashMap.

The key is not garbage collectable, since the map contains a strong reference to the key. The key could be obtained by iterating over the keys of the HashMap.

The key is garbage collectable, as nothing else in the application refers to it, and the WeakReference only holds the key weakly. Iterating over the keys of the WeakHashMap might obtain the key, but might not, if the key has been garbage collected.

The value is similarly not garbage collectable.

The value is not directly garbage collectable. However, when the key is collected by the garbage collector, the WeakReference object is subsequently removed from the WeakHashMap as a key, thus making the value garbage collectable too.

The (1.2- and 1.3-version) WeakHashMap implementation wraps a HashMap for its underlying Map implementation, and wraps keys with WeakReferences (actually a WeakReference subclass) before putting the keys into the underlying HashMap. The WeakHashMap uses its own ReferenceQueue object so that it is notified of keys that have been garbage collected, thus allowing the timely removal of the WeakReference objects and the correponding values. The queue is checked whenever the Map is altered. If you have not worked with Reference objects and ReferenceQueues before, this can be a little confusing, so I'll work through an example. The following example adds a key-value pair to the WeakHashMap, assumes that the key is garbage collected, and records the subsequent procedure followed by the WeakHashMap.

  1. A key value pair is added to the Map:

    aWeakHashMap.put(key, value);

    This results in the addition of a WeakReference key added to the WeakHashMap, with the original key held as the referent of the WeakReference object. You could do the equivalent using a HashMap like this:

    ReferenceQueue Queue = new ReferenceQueue();
    MyWeakReference RefKey = new MyWeakReference(key, Queue);
    aHashMap.put(RefKey, value);

    (For the equivalence code, I've used a subclass of WeakReference, as I'll need to override the WeakReference.equals() for equal key access in the subsequent points to work correctly.) Note that at this stage the referent of the WeakReference just created is the original key; i.e., the following statement would output "true":

    System.out.println(RefKey.get() == key);
  2. At this point, you could access the value from the WeakHashMap using the original key, or another key which is equal to the original key; i.e., the following statements would now output "true":

    System.out.println(aWeakHashMap.get(equalKey) == value);
    System.out.println(aWeakHashMap.get(key) == value);

    In our equivalent code using the HashMap, the following statements would now output "true":

    MyWeakReference RefKey2 = new MyWeakReference(equalKey, Queue);
    System.out.println(aHashMap.get(RefKey2) == value);
    System.out.println(aHashMap.get(RefKey) == value);

    Note that in order to get this equivalence, we need to implement equals() and hashcode() in the MyWeakReference class, so that equal referents make equal MyWeakReference objects. This is necessary so that the MyWeakReference wrapped keys evaluate as equal keys in Maps. The code is available from the source (see "Further Resources" at the end of this article). The equals() method returns true if the MyWeakReference objects are identical or if their referents are equal.

  3. We now null out the reference to the original key:

    key = null;

    After some time, the garbage collector detects that the key is no longer referenced anywhere else in the application, and clears the WeakReference key. In the equivalent code using the HashMap from this point on, the WeakReference we created has a null referent; i.e., the following statement would now output "true":

    System.out.println(RefKey.get() == null);

    Maintaining a reference to the WeakReference object (in the RefKey variable) doesn't affect clearing the referent. In the WeakHashMap, the WeakReference object key is also strongly referenced from the map, but its referent, the original key, is cleared.

  4. The garbage collector adds the WeakReference that it recently cleared into its ReferenceQueue; that queue is the ReferenceQueue object, which was passed in to the constructor of the WeakReference.

  5. Trying to retrieve the value using a key equal to the original key would now return null. (To try this it is necessary to use a key equal to the original key; we no longer have access to the original key, otherwise it could not have been garbage collected). The following statement would now output "true":

    System.out.println(aWeakHashMap.get(equalKey) == null);

    In our equivalent code using the HashMap, the following statements would now output "true":

    MyWeakReference RefKey3 = new MyWeakReference(equalKey, Queue);
    System.out.println(aHashMap.get(RefKey3) == null);

  6. However, at the moment the WeakReference and the value objects are still strongly referenced by the Map. That is where the ReferenceQueue comes in. Recall that when the garbage collector clears the WeakReference, it adds the WeakReference into the ReferenceQueue. Now that it is in the ReferenceQueue, we need to have it processed. In the case of the WeakHashMap, the WeakReference stays in the ReferenceQueue until the WeakHashMap is altered in some way, e.g. by calling put(), remove(), or clear(), etc. Once one of the mutator methods have been called, the WeakHashMap runs through its ReferenceQueue, removing all WeakReference objects from the queue and also removing each WeakReference object as a key in its internal map, thus simultaneously dereferencing the value. In the following example, I use a dummy object to force queue processing without making any real changes to the WeakHashMap:

    aWeakHashMap.put(DUMMY_OBJ, DUMMY_OBJ);

    The equivalent code using the HashMap does not need a dummy object, but we need to carry out the equivalent queue processing:

    MyWeakReference aRef;
    while ((aRef = (MyWeakReference) Queue.poll()) != null)

    As you see, we take each WeakReeference out of the queue, and remove it from the Map. This also releases the corresponding value object, and both the WeakReference object and the value object can now be garbage collected if there are no other strong references to them.

Reference objects with String literal referents

Note that if you use a string literal as a key to a WeakHashMap, or the referent to a Reference object, it will not necessarily be garbage collected when the application no longer references it. For example, consider the code:

String s = "hello";
WeakHashMap h = new WeakHashMap();
s = null;

You might expect that the string "hello" can now be garbage collected, since we have nulled the reference to it. However, a string created as a literal is created at compile time, and held in a string pool in the class file. The JVM normally holds these strings internally in its own string pool after the class has been loaded. Consequently, the JVM retains a strong reference to the String object, and it will not be garbage collected until the JVM releases it from the string pool; that may be never, or it may be when the class is garbage collected, or some other time. If you want to use a String object as a key to a WeakHashMap, ensure it is created at runtime, e.g.:

String s1 = new String("hello");
String s2 = (new StringBuffer()).append("hello").toString();

This string does not get put into the JVM string pool, and so can be garbage collected when the application no longer holds strong references to it. Note that calling String.intern() on a string will also put it into the internal JVM string pool, giving rise to the same issues as literal strings. Similarly, other objects which the JVM could retain a strong reference to, such as Class objects, may also not be garbage collected when there are no longer any strong references to them from the application.

Some Consequences of the WeakHashMap Implementation

  • Reference clearing is automatic; consequently, there is no need to worry about achieving some sort of corrupt state if you try to access an object and the garbage collector is clearing keys at the same time. You will either get the object or you won't.

  • The values are not released until the WeakHashMap is altered. This is specific to the implementation, and was done to avoid getting ConcurrentModificationExceptions. If the ReferenceQueue were processed on an accessor, this could be from an Iterator accessing elements. Then even if there was only one thread running, it would be possible for the map to be altered as you iterated through it, giving a surprising ConcurrentModificationException. Specifically one of the mutator methods, put(), remove(), or clear(), need to be called directly or indirectly (e.g. from putAll()) for the values to be released by the WeakHashMap. If you do not call any mutator methods after populating the WeakHashMap, the values and WeakReference objects will never be dereferenced.

  • WeakHashMap (at least up to version 1.3) wraps an internal HashMap. This means that practically every call to the WeakHashMap has one extra level of indirection it must go through (e.g. WeakHashMap.get() calls HashMap.get()), which can be a significant performance overhead. This is a specific choice of the implementation.

  • Every call to get() creates a new WeakReference object to enable equality testing of keys in the internal HashMap. Although these are small short-lived objects, if get() was used intensively this could generate a heavy performance overhead. Again this is a specific choice of the implementation. The WeakHashMap could implement a hash table directly in the class and avoid the need for extra WeakReference objects.

  • Unlike many other collections, WeakHashMap cannot maintain a count of elements, since keys can be cleared at any time by the garbage collector without immediately notifying the WeakHashMap. This means that seemingly simple methods such as isEmpty() and size() have more complicated implementations than for most collections. Specifically, size() actually iterates thorugh the keys, counting those that have not been cleared. Consequently size() is an operation that takes time proportional to the size of the WeakHashMap. Similarly, isEmpty() iterates through the collection looking for a non-null key. This produces the perverse result that a WeakHashMap which is empty, due to having all its keys cleared, requires more time for isEmpty() to return than a similar WeakHashMap that is not empty.

Further Resources

The source code for testing code equivalent to WeakHashMap is in Test.java.

Strong and weak references:

Java performance tuning

Jack Shirazi is the author of Java Performance Tuning. He was an early adopter of Java, and for the last few years has consulted mainly for the financial sector, focusing on Java performance.

Read more Java Design and Performance Optimization columns.

Return to ONJava.com.