Here is my class, which I intentionally wrote so that hash collisions occur:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f'({self.name} - {self.age})'

    def __eq__(self, other):
        print('__eq__ called for', self)
        # Intentionally unequal to everything, including itself.
        return False

    def __hash__(self):
        print('__hash__ called for', self)
        # Constant hash, so every instance collides.
        return 1


obj1 = Person('Soroush', 25)
obj2 = Person('John', 23)

print('\n--------Creation of set---------\n')
s = {obj1, obj2}
print(s)

print('\n-------Membership testing-------\n')
print(obj2 in s)
output:
--------Creation of set---------
__hash__ called for (Soroush - 25)
__hash__ called for (John - 23)
__eq__ called for (Soroush - 25)
{(Soroush - 25), (John - 23)}
-------Membership testing-------
__hash__ called for (John - 23)
__eq__ called for (Soroush - 25)
True
My question is about the "Membership testing" part of the output:
In the obj2 in s statement, Python first computes the hash of obj2 (fine, that is the first line of the output). Then, because obj1 and obj2 have the same hash, Python checks whether they are equal, so __eq__ of obj1 is called (second line of the output). They are not equal! I intentionally return False for all instances. So Python continues along the probe sequence. Here the hashes are equal again! Why doesn't Python call __eq__ of obj2 to see whether those items are equal?
I expected to see __eq__ called for (John - 23) on the third line, and because obj2 is not equal to obj2, the last line of the output should be False, like:
print(obj2 == obj2)
output:
__eq__ called for (John - 23)
False
So why is there no __eq__ called for (John - 23), and why is the result True? The items are not equal (the same comparison returned False a moment ago for obj1).
All built-in container types check object identity before testing (potentially expensive) object equality. This behavior is documented:
For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to
any(x is e or x == e for e in y).
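Of course, that isn't literally how hash-based containers check membership: they only examine the entries found along the probe sequence for the item's hash. They do apply the same identity-then-equality test to each candidate, though. In your example, the second candidate is obj2 itself, so the identity check succeeds and __eq__ is never called. The identity check is an optimization that spares potentially expensive equality operations; consider checking the equality of two large strings.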
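To make the order of checks concrete, here is a rough pure-Python model of such a lookup. It is only a sketch, not CPython's actual implementation: the Loud class and the contains function are hypothetical names made up for illustration, and a plain list stands in for the entries visited along the probe sequence.

```python
class Loud:
    """Hypothetical class: all instances share a hash and never compare equal."""

    def __init__(self, label):
        self.label = label
        self.eq_calls = 0  # counts how often __eq__ runs on this instance

    def __eq__(self, other):
        self.eq_calls += 1
        return False

    def __hash__(self):
        return 1


def contains(candidates, x):
    """Simplified model of a hash-based membership test (an assumption,
    not the real algorithm): `candidates` plays the role of the stored
    entries visited while probing for hash(x)."""
    hx = hash(x)
    for e in candidates:
        if hash(e) != hx:
            continue          # different hash: cannot be a match
        if e is x:
            return True       # identity shortcut, __eq__ is skipped
        if e == x:
            return True       # fall back to the stored entry's __eq__
    return False


a, b = Loud('a'), Loud('b')
found = contains([a, b], b)
print(found)                   # True
print(a.eq_calls, b.eq_calls)  # 1 0
```

The counters mirror the output in the question: __eq__ runs once on the first stored object, and the second stored object is matched by identity alone.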
Here's a built-in value that behaves the same way: the famously problematic floating-point NaN.
>>> nan = float('nan')
>>> nan == nan
False
>>> nan is nan
True
>>> nan is float('nan')
False
>>> nan == float('nan')
False
And this results in the following behavior with sets:
>>> nan = float('nan')
>>> myset = {nan}
>>> nan in myset
True
>>> float('nan') in myset
False
Note that the result is False when we create a new float object: the new NaN is neither identical nor equal to the one stored in the set.
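The same identity shortcut applies to the other containers named in the documentation quote, not only to sets. A small check, using only built-ins and collections.deque (the variable names are just for illustration):

```python
from collections import deque

nan = float('nan')

# The very same object is found in every container by identity.
in_list = nan in [nan]
in_tuple = nan in (nan,)
in_dict = nan in {nan: 'value'}
in_deque = nan in deque([nan])

# A distinct NaN object is neither identical nor equal, so it is not found.
fresh_in_list = float('nan') in [nan]

print(in_list, in_tuple, in_dict, in_deque)  # True True True True
print(fresh_in_list)                         # False
```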