Say I have this dataframe:
col1 col2
'a' [1,2,3]
'a' [1,2,3]
'b' [4,5,6]
and I want to drop the duplicates (in this case the first two rows). How would I accomplish this in a time-efficient, Pythonic manner? My full dataframe has millions of rows and 7 columns.
You can try converting the list column to something hashable, such as tuples, and then dropping the duplicates.
Note that inplace=True will overwrite your dataframe.
df["col2"] = df["col2"].transform(lambda k: tuple(k))
df.drop_duplicates(inplace=True)
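If you would rather keep col2 as lists and leave the original dataframe untouched, here is a minimal sketch of one variation. It assumes the sample data from the question; the names key, mask and deduped are only illustrative. The tuples are used just to find the duplicate rows, and the original rows are then filtered with a boolean mask.

```python
import pandas as pd

# Sample data from the question; col2 holds (unhashable) lists
df = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "col2": [[1, 2, 3], [1, 2, 3], [4, 5, 6]],
})

# Build a hashable version of col2 without modifying df
key = df["col2"].map(tuple)

# Mark every repeat of a row after its first occurrence
mask = df.assign(col2=key).duplicated(keep="first")

# Keep the non-duplicate rows; col2 in the result still contains lists
deduped = df[~mask]
```

If both copies of a duplicated row should go, keep=False marks every member of a duplicate group instead of only the later ones.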