Photo by Annie Spratt on Unsplash

In part1 and part2 of this series, we looked upon Hashtable and hashcode() and equals() contract in Java. One might ask why just not use Hashtable instead of HashMap, it’s because as of Java1.2 this collection has been retrofitted to implement the Map interface and the Java docs suggest using HashMap if a thread-safe implementation is not needed and ConcurrentHashMap if a thread-safe implementation is needed.

Now coming to HashMap it implements Map interface and is part of the Java Collections framework. HashMap allows retrieval of values based on a given key, e.g: “key1”->value1, “key2”->value2. We first insert the values corresponding to a key and later we can retrieve them using the keys.

In short, HashMap is backed by an associative array data structure. During insertion of key and value, the hashcode of the key is calculated, and Entry (key+value) is inserted in the array (based on hashcode % array’s size). If more keys are added with the same hashcode, LinkedList is formed with previously added keys. A good hashCode() implementation is required so that the keys don’t get poorly distributed across the array. Please read about the equals() and hashcode() in part2 of this series.

HashMap implementation

To create a HashMap we import java.util.HashMap package. For initialization,Map<String, String> hash_map = new HashMap<>();

For insertion of keys, hash_map.put("yourKey", "yourValue");

For retrieval of keys, `hash_map.get("yourKey"); //Output — "yourValue"

Now coming to its internal implementation when we insert one key to the map it calculates the hashCode of the given key and maps it to an array bucket using hashCode() % size_of_array . If hashcode() implementation is poor many collisions will be there i.e multiple keys will be mapped to the same array bucket. When multiple hashCode() values end up in the same bucket, values are placed in an ad-hoc linked list. In the worst case, when all keys are mapped to the same bucket, thus degenerating hash map to the linked list - from O(1) to O(n) lookup time.

In Java 8, HashMap replaces the linked list with a binary tree when the number of elements in a bucket reaches a certain threshold i.e TREEIFY_THRESHOLD . If a bucket contains more than eight items, it should become a tree. Conversion of the list to binary tree allows handling worst-case scenarios from O(n) to O(logn) which is a significant improvement in terms of performance. HashMappromotes list into a binary tree, using hash code as a branching variable. If two hashes are different but ended up in the same bucket, one is considered bigger and goes to the right. If hashes are equal (as in our case), HashMap hopes that the keys are comparable, so that it can establish some order.

If entries are removed from the map, the number of entries in the bucket might reduce such that this tree structure is no longer necessary. The UNTREEIFY_THRESHOLD value allows for this conversion to list if elements number gets less than 6. One more factor which affects the conversion of list to tree is MIN_TREEIFY_CAPACITY ,this constant basically says not to start making buckets into trees if our hash map is very small — it should resize to be larger first instead.

Points to note:

And here ends the HashMap implementation, I highly recommend going through the source code which will allow you to see the implementation of this highly used data structure. One thing we can learn from this implementation is there are always trade-offs between performance and space as in this case using trees instead of linked-list gives us performance improvement over space trade-off. In the next part, we will look into various other map implementations like ConcurrentHashMap, LinkedHashMap till then, Ciao!!!

References:

Computers are all I want to know about.