pandas duplication removing nans

Question

I am trying to check for duplicates. I use

df['name_duplicated'] = df.duplicated('name', keep=False)

However, this flags rows where name is NaN as duplicates of each other.

Does anyone know how to get around this?

I tried

df[pd.isnull(df['name'])]['name_duplicated'] = False

but I get an error.

philngo · Accepted Answer

You could also check for NaNs and combine that check with the result of the duplicated call using a boolean AND (&), so that rows with a missing name are never flagged:

df['name_duplicated'] = df.duplicated('name', keep=False) & df['name'].notnull()
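To see the effect, here is a minimal sketch on made-up data (the sample names and the extra column name_duplicated_loc are assumptions for illustration; only the name and name_duplicated columns come from the question). It also shows one way the attempt from the question could be written so the assignment actually lands on df: chained indexing like df[mask]['col'] = value assigns into a temporary copy, whereas a single .loc call assigns in place.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two real duplicates ("alice") and two missing names.
df = pd.DataFrame({"name": ["alice", "bob", "alice", np.nan, np.nan, "carol"]})

# duplicated() treats NaN values as equal to each other,
# so with keep=False both NaN rows are flagged.
naive = df.duplicated("name", keep=False)

# Approach from the answer: AND the result with a not-null mask.
df["name_duplicated"] = naive & df["name"].notnull()

# Alternative in the spirit of the question's attempt:
# flag first, then reset the NaN rows with one .loc assignment.
df["name_duplicated_loc"] = df.duplicated("name", keep=False)
df.loc[df["name"].isnull(), "name_duplicated_loc"] = False

print(df)
```

Both columns should end up identical; the & version does it in one expression, while the .loc version may be easier to read if the flag column already exists.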

Tags: python, pandas, duplicates, python-2.7

As3adTintin