
Time-efficient way of dropping duplicates in a large dataframe of different types

Say I have this dataframe:

col1  col2
'a'   [1,2,3]
'a'   [1,2,3]
'b'   [4,5,6]

and I want to drop the duplicates (in this case, one of the first two rows). How would I accomplish this in a time-efficient, Pythonic manner? (My full dataframe is millions of rows and 7 columns.)

asked Oct 23 '25 by Hanuman95

1 Answer

Lists are unhashable, so `drop_duplicates` cannot compare them directly. You can convert the list column to something hashable (a tuple) and then drop the duplicates.

Note that `inplace=True` will overwrite your dataframe:

# Convert each list in col2 to a hashable tuple
df["col2"] = df["col2"].map(tuple)
# Remove duplicate rows, modifying df in place
df.drop_duplicates(inplace=True)
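Putting this together, a minimal runnable sketch using the question's column names (the data is the example from the question):

```python
import pandas as pd

# Frame matching the question's example
df = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "col2": [[1, 2, 3], [1, 2, 3], [4, 5, 6]],
})

# Lists are unhashable, so convert col2 to tuples first
df["col2"] = df["col2"].map(tuple)

# Now every column is hashable; keep the first of each duplicate group
deduped = df.drop_duplicates()
print(deduped)  # rows ('a', (1, 2, 3)) and ('b', (4, 5, 6)) remain
```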
answered Oct 26 '25 by woblob
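If you would rather keep the lists in `col2` intact, a variant (a sketch, not from the answer) is to build the hashable view only for duplicate detection via `duplicated()` and use it as a boolean mask:

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "col2": [[1, 2, 3], [1, 2, 3], [4, 5, 6]],
})

# assign() returns a copy with tuples, used only to find duplicates;
# the original df still holds lists in col2
mask = df.assign(col2=df["col2"].map(tuple)).duplicated()
deduped = df[~mask]
print(deduped)  # first 'a' row and the 'b' row, col2 still lists
```

This avoids mutating the original dataframe, at the cost of one temporary copy of the converted column.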


