I was trying to understand the that does the rehashing of hashmap takes place while exceeding the number of buckets occupied or the total number of entries in all buckets. Means, we know that if 12 out of 16 (One entry in each bucket) of buckets are full (Considering default loadfactor and initial capacity), then we know on the next entry the hashmap will be rehashed. But what about that case when suppose only 3 buckets are occupied with 4 entries each (Total 12 entries but only 3 buckets out of 16 in use)?
So I tried to replicate this by making the worst hash function which will put all entries in a single bucket.
Here is my code.
class X {
    public Integer value;
    public X(Integer value) {
        super();
        this.value = value;
    }
    @Override
    public int hashCode() {
        return 1;
    }
    @Override
    public boolean equals(Object obj) {
        X a = (X) obj;
        if(this.value.equals(a.value)) {
            return true;
        }
        return false;
    }
}
Now I started putting values in hashmap.
HashMap<X, Integer> map = new HashMap<>();
    map.put(new X(1), 1);
    map.put(new X(2), 2);
    map.put(new X(3), 3);
    map.put(new X(4), 4);
    map.put(new X(5), 5);
    map.put(new X(6), 6);
    map.put(new X(7), 7);
    map.put(new X(8), 8);
    map.put(new X(9), 9);
    map.put(new X(10), 10);
    map.put(new X(11), 11);
    map.put(new X(12), 12);
    map.put(new X(13), 13);
    System.out.println(map.size());
All the nodes were entering the single bucket as expected, but I noticed that on the 9th entry, the hashmap rehashed and doubled its capacity. Now on the 10th entry, It again doubled its capacity.
 

Can anyone explain how this is happening?
Thanks in advance.
In HashMap, entries will be present in same bucket if their hashcodes are same. If unique Integer objects are placed inside a hashMap, their hashcode will change definitely because they are different objects.
But in your case all the objects are having same hashcode. which means as you specified all entries will be in a single bucket. Each of these buckets follow a specific data structure(linkedList/tree). Here the capacity is changing according to that specific datastructure and hashmap.
I have run JB Nizet's code (https://gist.github.com/jnizet/34ca08ba0314c8e857ea9a161c175f13/revisions) mentioned in the comment with loop limits 64 and 128 (adding 64 and 128 elements):
After increasing capacity to 64, the HashMap works normal.
In summary, bucket uses linked list to a certain length(8 elements). After that it uses tree data structure (where there is fluctuation in capacity). Reason is that accessing tree structure (O(log(n))) is faster than linked list (O(n)).

This picture shows an inner array of a JAVA 8 HashMap with both trees (at bucket 0) and linked lists (at bucket 1,2 and 3). Bucket 0 is a Tree because it has more than 8 nodes (readmore).
Documentation on Hashmap and discussion on bucket in hashmap would be helpful in this regard.
Responding to the comments more than the question itself, since your comments are more relevant in what you want to know actually.
The best and most relevant answer to the where this rehashing on bucket size is explained further is the source code itself. What you observe on the 9-th entry is expected and happens in this portion of the code:
for (int binCount = 0; ; ++binCount) {
    // some irrelevant lines skipped
    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
        treeifyBin(tab, hash);
        break;
    }
where TREEIFY_THRESHOLD = 8 and binCount is the number of bins. 
That treeifyBin method name is a bit misleading since, it might re-size, not treefiy a bin, that's the relevant part of the code from that method:
if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
Notice that it will actually resize (read double it's size) and not make a Tree until MIN_TREEIFY_CAPACITY is reached (64).
Read the source code of hashmap,
/** * The smallest table capacity for which bins may be treeified. * (Otherwise the table is resized if too many nodes in a bin.) * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts * between resizing and treeification thresholds. */ static final int MIN_TREEIFY_CAPACITY = 64;
and you will see
Resize and treeify are two operations which can bring map reorganize, and the above decisions based on different scenarios is also a tradeoff.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With