ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)
 See this if you're having trouble printing code examples


Java Design and Performance Optimization

The WeakHashMap Class

07/09/2001

WeakHashMap is a type of Map which differs from other Maps in more than just having a different implementation.

WeakHashMap uses weak references to hold its keys, making it one of the few classes able to respond to the fluctuating memory requirements of the JVM. This can make WeakHashMap unpredictable at times, unless you know exactly what you are doing with it. In the following sections I examine how best to use WeakHashMap, and why WeakHashMap behaves the way it does.

Using WeakHashMap to Avoid Memory Leaks

How many times have you used a Map to store extra information about objects? Perhaps subclassing of those objects were disallowed, but you needed to record extra state information for each object; or there were many different types of objects, and you needed to have a central repository where information was easily accessible about any one of the objects.

Here's a simple example of a class which allows any object to be registered with it, and which keeps track of two items of information about that object, the elapsed time since the object was registered, and a separate counter.


public class ExtraInformation
{
  static Map ExtraInfo = new HashMap();

  long registrationTime;
  int countSomething = 0;

  public static void registerObject(Object o)
  {
    ExtraInfo.put(o, new ExtraInformation())
  }

  public ExtraInformation()
  {
    //record time when first registered
    registrationTime = System.currentTimeMillis();
  }

  public static void incrementCount(Object o)
  {
    ((ExtraInformation) ExtraInfo.get(o)).incrementCount++;
  }

  public static int getCount(Object o)
  {
    return ((ExtraInformation) ExtraInfo.get(o)).incrementCount;
  }

  public static long getElapseTimeMillis(Object o)
  {
    return System.currentTimeMillis() - 
      ((ExtraInformation) ExtraInfo.get(o)).registrationTime;
  }
}

Related Reading

Java Performance TuningJava Performance Tuning
By Jack Shirazi
Table of Contents
Index
Sample Chapter
Full Description
Read Online -- Safari

If you are alert, or thinking about the title of this section, you may have immediately noticed that a deregistration method is missing. Without a deregistration method, the internal Map in the ExtraInformation class keeps on growing, and retains a reference to the object key. This is a classic memory leak; objects never get dereferenced, so the garbage collector can never reclaim them. So lets add the deregistration method:

public static void deregisterObject(Object o)
{
  ExtraInfo.remove(o);
}

Now we can avoid the memory leak. But what if the users of this class forget to call the deregister method? What if there is no particular time when it would be appropriate to call the deregister method? Well, we can put a big warning in the documentation -- "WARNING: MEMORY LEAK IF DEREGISTRATION IS NOT PERFORMED!". Or we can try to be nicer and handle the situation ourselves.

Specifically, we want to handle the situation where the deregisterObject() method is not called, but the registered object is no longer referenced anywhere in the application, except by our internal Map. This situation is precisely what WeakHashMap was designed for. The keys entered into a WeakHashMap are held using WeakReferences. This allows the garbage collector to collect the keys if there are no other strong references to those keys elsewhere in the application (for more details about strong and weak references in Java, see the "Further resources" section at the end of this article). After the keys have been collected, the values are also dereferenced from the WeakHashMap, and can also be garbage collected if they are not referred to elsewhere in the application.

To enable our ExtraInformation class to work with the garbage collector in avoiding memory leaks, the change is simple -- make the internal Map a WeakHashMap:

static Map ExtraInfo = new WeakHashMap();

Comment on this articleSend your questions to Jack.
Post your comments

Also in Java Design and Performance Optimization:

Micro-Tuning Step-by-Step

Tuning JDBC: Measuring JDBC performance

Faster List Iteration with RandomAccess Interface

With this one change, we have now enabled our ExtraInformation class to automatically release objects to the garbage collector once those objects are no longer referenced anywhere else in the application. (There can be other classes using weak references in a similar way without interfering with this functionality. No amount of weak references to an object will prevent that object from being eligible for collection by the garbage collector).

Using WeakHashMap as a Cache

The previous section looked at using WeakHashMap to avoid memory leaks. WeakHashMap can also be used fairly straightforwardly as a cache that can be cleared automatically when JVM memory is low. The only difference is in needing the keys to be objects which can be equal to other keys of the same class. This is most easily explained with some examples:

public class SparseArrayCache {
  protected WeakHashMap map = new WeakHashMap();

  /* This put() will return any previous value already at index i. */
  public Object put(int i, Object o) {
    map.put(new Integer(i), o);
  }

  /* This get() will return the object at index i, or null if no
   object was put there, or if the object was garbage collected. */
  public Object get(int i) {
    return map.get(new Integer(i));
  }

  public void remove(int i, Object o) {
    map.put(new Integer(i), o);
  }
}

Sparse Arrays

When you have a large array where most of the elements are not used, this is often referred to as a sparse array. For example, perhaps only elements at index 232 and 9986 of a 10,000 element array are non-zero, and all other elements are zero:

int[] arr = new int[10000];
arr[232] = 18;
arr[9986] = 17;

Using a full 10,000-element array makes the lookup very fast, but wastes a huge amount of space. Efficient memory implementations for sparse arrays use structures to hold only the non-zero elements. A hash table with integer keys provides one such implementation.

However, using WeakHashMap to implement a cache has a real problem with lack of control. Most builders of caches usually find that they want to control how and when objects are removed from the cache. For example, a least-recently-used algorithm tries to retain elements in the cache that are being used the most often and removes those elements that have not been used recently. But the elements of a WeakHashMap cache could be cleared at any time by the JVM, and there is no selectivity. Normally all the elements which can be cleared, are cleared in one go. This takes control away from the cache-builder. A more useful cache would be signalled by the JVM that memory was needed and would then be allowed to select which elements to clear. It is possible to take more control over cache clearing with your own implementation directly using WeakReferences, but that wouldn't use a WeakHashMap, so I'll leave that discussion to another article.

How WeakHashMap works

You don't need to know the internals of WeakHashMaps to use them. But the implementation is interesting, and studying the it can make you aware of some of the curious performance consequences of working with WeakHashMaps.

The keys in a WeakHashMap are WeakReference objects. The object passed as the key to a WeakHashMap is stored as the referent of the WeakReference object, and the value is the standard Map value. (The object returned by calling Reference.get() is termed the referent of the Reference object.) A comparison with HashMap can help:


HashMap

Map h = new HashMap();

Object key = new Object;
h.put(key, "xyz");

key = null;

WeakHashMap

Map h = new WeakHashMap();

Object key = new Object;
h.put(key, "xyz");

key = null;

Conceptually, this is similar to inserting a line before the put() call like this:

key = new WeakReferenkey(key);

The key is referenced directly by the HashMap.

The key is not referenced directly by the WeakHashMap. Instead, a WeakReference object is referenced directly by the WeakHashMap, and the key is referenced weakly from the WeakReference object.

The value is referenced directly by the HashMap.

The value is referenced directly by the HashMap.

The key is not garbage collectable, since the map contains a strong reference to the key. The key could be obtained by iterating over the keys of the HashMap.

The key is garbage collectable, as nothing else in the application refers to it, and the WeakReference only holds the key weakly. Iterating over the keys of the WeakHashMap might obtain the key, but might not, if the key has been garbage collected.

The value is similarly not garbage collectable.

The value is not directly garbage collectable. However, when the key is collected by the garbage collector, the WeakReference object is subsequently removed from the WeakHashMap as a key, thus making the value garbage collectable too.

The (1.2- and 1.3-version) WeakHashMap implementation wraps a HashMap for its underlying Map implementation, and wraps keys with WeakReferences (actually a WeakReference subclass) before putting the keys into the underlying HashMap. The WeakHashMap uses its own ReferenceQueue object so that it is notified of keys that have been garbage collected, thus allowing the timely removal of the WeakReference objects and the correponding values. The queue is checked whenever the Map is altered. If you have not worked with Reference objects and ReferenceQueues before, this can be a little confusing, so I'll work through an example. The following example adds a key-value pair to the WeakHashMap, assumes that the key is garbage collected, and records the subsequent procedure followed by the WeakHashMap.

  1. A key value pair is added to the Map:

    aWeakHashMap.put(key, value);

    This results in the addition of a WeakReference key added to the WeakHashMap, with the original key held as the referent of the WeakReference object. You could do the equivalent using a HashMap like this:

    ReferenceQueue Queue = new ReferenceQueue();
    MyWeakReference RefKey = new MyWeakReference(key, Queue);
    aHashMap.put(RefKey, value);

    (For the equivalence code, I've used a subclass of WeakReference, as I'll need to override the WeakReference.equals() for equal key access in the subsequent points to work correctly.) Note that at this stage the referent of the WeakReference just created is the original key; i.e., the following statement would output "true":

    System.out.println(RefKey.get() == key);
  2. At this point, you could access the value from the WeakHashMap using the original key, or another key which is equal to the original key; i.e., the following statements would now output "true":

    System.out.println(aWeakHashMap.get(equalKey) == value);
    System.out.println(aWeakHashMap.get(key) == value);

    In our equivalent code using the HashMap, the following statements would now output "true":

    MyWeakReference RefKey2 = new MyWeakReference(equalKey, Queue);
    System.out.println(aHashMap.get(RefKey2) == value);
    System.out.println(aHashMap.get(RefKey) == value);

    Note that in order to get this equivalence, we need to implement equals() and hashcode() in the MyWeakReference class, so that equal referents make equal MyWeakReference objects. This is necessary so that the MyWeakReference wrapped keys evaluate as equal keys in Maps. The code is available from the source (see "Further Resources" at the end of this article). The equals() method returns true if the MyWeakReference objects are identical or if their referents are equal.

  3. We now null out the reference to the original key:

    key = null;

    After some time, the garbage collector detects that the key is no longer referenced anywhere else in the application, and clears the WeakReference key. In the equivalent code using the HashMap from this point on, the WeakReference we created has a null referent; i.e., the following statement would now output "true":

    System.out.println(RefKey.get() == null);

    Maintaining a reference to the WeakReference object (in the RefKey variable) doesn't affect clearing the referent. In the WeakHashMap, the WeakReference object key is also strongly referenced from the map, but its referent, the original key, is cleared.

  4. The garbage collector adds the WeakReference that it recently cleared into its ReferenceQueue; that queue is the ReferenceQueue object, which was passed in to the constructor of the WeakReference.

  5. Trying to retrieve the value using a key equal to the original key would now return null. (To try this it is necessary to use a key equal to the original key; we no longer have access to the original key, otherwise it could not have been garbage collected). The following statement would now output "true":

    System.out.println(aWeakHashMap.get(equalKey) == null);

    In our equivalent code using the HashMap, the following statements would now output "true":

    MyWeakReference RefKey3 = new MyWeakReference(equalKey, Queue);
    System.out.println(aHashMap.get(RefKey3) == null);

  6. However, at the moment the WeakReference and the value objects are still strongly referenced by the Map. That is where the ReferenceQueue comes in. Recall that when the garbage collector clears the WeakReference, it adds the WeakReference into the ReferenceQueue. Now that it is in the ReferenceQueue, we need to have it processed. In the case of the WeakHashMap, the WeakReference stays in the ReferenceQueue until the WeakHashMap is altered in some way, e.g. by calling put(), remove(), or clear(), etc. Once one of the mutator methods have been called, the WeakHashMap runs through its ReferenceQueue, removing all WeakReference objects from the queue and also removing each WeakReference object as a key in its internal map, thus simultaneously dereferencing the value. In the following example, I use a dummy object to force queue processing without making any real changes to the WeakHashMap:

    aWeakHashMap.put(DUMMY_OBJ, DUMMY_OBJ);

    The equivalent code using the HashMap does not need a dummy object, but we need to carry out the equivalent queue processing:

    MyWeakReference aRef;
    while ((aRef = (MyWeakReference) Queue.poll()) != null)
    {
      aHashMap.remove(aRef);
    }

    As you see, we take each WeakReeference out of the queue, and remove it from the Map. This also releases the corresponding value object, and both the WeakReference object and the value object can now be garbage collected if there are no other strong references to them.

Reference objects with String literal referents

Note that if you use a string literal as a key to a WeakHashMap, or the referent to a Reference object, it will not necessarily be garbage collected when the application no longer references it. For example, consider the code:

String s = "hello";
WeakHashMap h = new WeakHashMap();
h.put(s,"xyz");
s = null;

You might expect that the string "hello" can now be garbage collected, since we have nulled the reference to it. However, a string created as a literal is created at compile time, and held in a string pool in the class file. The JVM normally holds these strings internally in its own string pool after the class has been loaded. Consequently, the JVM retains a strong reference to the String object, and it will not be garbage collected until the JVM releases it from the string pool; that may be never, or it may be when the class is garbage collected, or some other time. If you want to use a String object as a key to a WeakHashMap, ensure it is created at runtime, e.g.:

String s1 = new String("hello");
String s2 = (new StringBuffer()).append("hello").toString();

This string does not get put into the JVM string pool, and so can be garbage collected when the application no longer holds strong references to it. Note that calling String.intern() on a string will also put it into the internal JVM string pool, giving rise to the same issues as literal strings. Similarly, other objects which the JVM could retain a strong reference to, such as Class objects, may also not be garbage collected when there are no longer any strong references to them from the application.

Some Consequences of the WeakHashMap Implementation

Further Resources

The source code for testing code equivalent to WeakHashMap is in Test.java.

Strong and weak references:

Java performance tuning

Jack Shirazi is the author of Java Performance Tuning. He was an early adopter of Java, and for the last few years has consulted mainly for the financial sector, focusing on Java performance.


Read more Java Design and Performance Optimization columns.

Return to ONJava.com.

Copyright © 2009 O'Reilly Media, Inc.