Map is a common tool from java programmer’s toolbox that is used to store key-value pairs. When such a data structure is needed in our programs, most of the time the code looks similar to:

We know that the class of the key needs to implement int hashCode() and boolean equals(Object obj) methods in order to store and retrieve objects from the HashMap and that’s because:

  1. hashCode() is used to determine the bucket where the key-value pair will be stored
  2. equals(Object obj) is used to determine the right key-value pair when multiple keys have the same hashCode.

I do not remember where I learned these rules but now I know that not from the java API:

  • Map only mentions that

    Implementations are free to implement optimizations whereby the equals invocation is avoided, for example, by first comparing the hash codes of the two keys.

  • HashMap reminds us about hashCode()

    Note that using many keys with the same hashCode() is a sure way to slow down performance of any hash table.

  • Finally Hashtable states it clearly

    To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode method and the equals method.

Now that we know what to implement for a class that is used as a key in a map, we may came up with:

At first glance this class looks right, it has both equals and hashCode so what could go wrong? If you look more carefully you’ll notice that it is mutable. In this case it is possible that a key is changed after it is added to the map. If this happens, prepare yourself for severe headaches. Let’s look at the following scenario:

After the key was added to the map, I changed its value. Lookup does not work for both keys:
* key1 is not found because the map contains only one key, key2.
* key2 is not found because it has a different hashCode so the lookup is done in another bucket.

Running the example displays:

If no value is returned, we might realize that we have a problem and fix it. Unfortunately things can go really wrong and we can have a situation where both keys have the same hashCode. In this case the value is returned with the wrong key. The simple way to show this is to change the hashCode implementation with:

and run the example again:

In this case we found the value with the wrong key, key2.

The simplest fix for this problem is to make the class of the key immutable. If that is not possible, make sure that keys are not changed.

Your thoughts are welcome

This site uses Akismet to reduce spam. Learn how your comment data is processed.