For an assignment I have to write the code for a generic Hash Table. In an example Put method, there are two lines:
int hash = key.hashCode(); // get the hashcode of the key
int index = compress(hash); // compress it to an index
I was of the understanding that the hashCode method used the key to return an index, and you would place the key/value pair in the array at that index. But here we "compress" the hash code to get the index. What does this method do? How does it "compress" the hash code? Is it necessary and/or preferred?
The hash code can be any integer between -2^31 and 2^31 - 1. That's ~4 billion different possible hash codes. If you have, say, 40 hash table buckets, you need to "compress" those 4 billion integers down to the range 0-39.
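To make the range problem concrete, here is a minimal sketch that checks whether a few raw hash codes could be used directly as an index into a 40-slot array. The class name HashRangeDemo and the helper fitsInTable are hypothetical names chosen for illustration, not part of the assignment code.

```java
class HashRangeDemo {
    // Hypothetical helper: true only if the raw hash code is already a valid array index.
    static boolean fitsInTable(int hash, int numBuckets) {
        return hash >= 0 && hash < numBuckets;
    }

    public static void main(String[] args) {
        int numBuckets = 40; // assumed table size, matching the example above
        Object[] keys = {"hello", Integer.valueOf(-5), Integer.valueOf(7)};
        for (Object key : keys) {
            int hash = key.hashCode();
            // Most hash codes are far too large, and some are negative.
            System.out.println(key + " -> hash " + hash
                    + ", usable as index: " + fitsInTable(hash, numBuckets));
        }
    }
}
```

Only the small non-negative Integer key happens to land inside the array; the others need an extra step to map them into range.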
A common way to do this is with the modulus operator %. a % b returns the remainder after dividing a by b. For example, 7 % 3 == 1.
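int compress(int hash) {
    // floorMod keeps the result in [0, numBuckets) even when hash is negative
    return Math.floorMod(hash, numBuckets);
}
Note: A bare hash % numBuckets is not enough in Java. The sign of the result of % equals the sign of the dividend, so the result is negative whenever hash is negative and not a multiple of numBuckets, and a negative index would throw an ArrayIndexOutOfBoundsException. Math.floorMod (Java 8 and later) always returns a value from 0 to numBuckets - 1 for a positive numBuckets. C (since C99) and C++ (since C++11) truncate toward zero just like Java; in older versions of those standards the sign was implementation-defined, so negative hash values need the same care there.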
See integer modulo operators in various programming languages on Wikipedia for a breakdown of how each programming language handles modulus's sign.
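As a quick check of how Java treats negative operands, the sketch below compares the % operator with Math.floorMod. The class name CompressDemo and the two-argument compress are assumptions made so the example is self-contained; in a real table numBuckets would more likely be a field.

```java
class CompressDemo {
    // Hypothetical two-argument variant of compress, for illustration only.
    static int compress(int hash, int numBuckets) {
        // floorMod takes the sign of the divisor, so the result is in [0, numBuckets).
        return Math.floorMod(hash, numBuckets);
    }

    public static void main(String[] args) {
        int numBuckets = 3;
        System.out.println(7 % numBuckets);            // 1
        System.out.println(-7 % numBuckets);           // -1: sign follows the dividend
        System.out.println(compress(-7, numBuckets));  // 2: always a valid index

        // Math.abs is not a safe fix: Math.abs(Integer.MIN_VALUE) overflows
        // and stays negative, so the remainder is still negative.
        System.out.println(Math.abs(Integer.MIN_VALUE) % numBuckets); // -2
    }
}
```

Another option sometimes seen is clearing the sign bit first, as in (hash & 0x7fffffff) % numBuckets, which also yields a non-negative index.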