I am trying to check for duplicates.
I use df['name_duplicated'] = df.duplicated('name', keep=False)
However, this treats any row with name = NaN as a duplicate.
Does anyone know how to get around this?
I am trying df[pd.isnull(df['name'])]['name_duplicated'] = False but I get an error.
You could also check for NaNs and combine that with the result of the duplicated call using a boolean AND, so rows with a missing name are never flagged:
df['name_duplicated'] = df.duplicated('name', keep=False) & df['name'].notnull()
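Here is a small sketch of that on made-up data (the sample frame and the `alt` variable are only for illustration, not from the question). It also shows a `.loc` version of what the question was attempting. The chained form `df[mask]['name_duplicated'] = False` most likely assigns into a temporary copy, which is why it does not work as hoped.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two real duplicates and two missing names.
df = pd.DataFrame({"name": ["alice", "bob", "alice", np.nan, np.nan, "carol"]})

# duplicated() treats NaN values as equal to each other, so mask them out
# by AND-ing with notnull().
df["name_duplicated"] = df.duplicated("name", keep=False) & df["name"].notnull()

# Alternative: flag duplicates first, then reset the NaN rows in one .loc
# assignment instead of chained indexing.
alt = df[["name"]].copy()
alt["name_duplicated"] = alt.duplicated("name", keep=False)
alt.loc[alt["name"].isnull(), "name_duplicated"] = False

print(df)
```

Both approaches should give the same column. The one-liner is shorter, while the `.loc` version is handy if the flag column already exists and you only want to patch the NaN rows.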