
Pandas: Check whether a column exists in another column

I am new to Python and pandas. I have a dataset with the following structure, stored in a pandas DataFrame:

city time1              time2
a    [1991, 1992, 1993] [1993,1994,1995]

time1 and time2 represent the coverage of the data in two sources. I would like to create a new column that indicates whether time1 and time2 have any intersection: True if they do, False otherwise. The task sounds very straightforward. I was thinking about using set operations on the two columns, but it did not work as expected. Would anyone help me figure this out?

Thanks!

I appreciate your help.

macintosh81 asked Nov 27 '25 23:11



1 Answer

You can convert the values in every column to sets and then check, row by row, whether the intersection of time1 and time2 is non-empty.

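# Convert every cell to a set; scalars become one-element sets (pandas >= 2.1: use df.map instead of applymap)
df1 = df.applymap(lambda x: set(x) if isinstance(x, list) else {x})
# Row-wise check: True when the two sets share at least one element
df1.apply(lambda x: bool(x.time1 & x.time2), axis=1)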
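To try the idea end to end, here is a minimal sketch that stores the result in a new column. It swaps the `&` operator for `set.isdisjoint` inside a list comprehension; the second row (`b`) and the column name `overlap` are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Row "a" is taken from the question; row "b" is a hypothetical
# non-overlapping case added so both outcomes are visible.
df = pd.DataFrame({
    "city": ["a", "b"],
    "time1": [[1991, 1992, 1993], [2001, 2002]],
    "time2": [[1993, 1994, 1995], [2005, 2006]],
})

# True when the two lists share at least one year.
# "overlap" is an illustrative column name.
df["overlap"] = [
    not set(t1).isdisjoint(t2)
    for t1, t2 in zip(df["time1"], df["time2"])
]

print(df[["city", "overlap"]])
```

`isdisjoint` can stop at the first shared element instead of building the full intersection, which may help when the lists are long.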

A semi-vectorized version should run much faster, because it replaces the row-wise apply with a single element-wise `&` on the underlying object arrays:

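# Only the two time columns are converted to sets here
df1 = df[['time1', 'time2']].applymap(lambda x: set(x) if isinstance(x, list) else {x})
# & on object arrays intersects the sets pairwise; astype(bool) maps empty sets to False
(df1.time1.values & df1.time2.values).astype(bool)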

Mapping each column separately, without building an intermediate DataFrame, should be a bit faster still:

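# Lists become sets; any other value becomes a one-element set
change_to_set = lambda x: set(x) if isinstance(x, list) else {x}
time1_set = df.time1.map(change_to_set).values
time2_set = df.time2.map(change_to_set).values
# Pairwise intersection, then truthiness: non-empty intersection -> True
(time1_set & time2_set).astype(bool)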
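As a runnable sketch of this last variant, the example below builds a small DataFrame and assigns the result to a new column. Rows `b` and `c`, the helper name `to_set`, and the column name `overlap` are assumptions added for illustration; row `c` holds a bare scalar to show why the helper wraps non-list values.

```python
import pandas as pd


def to_set(x):
    # Lists become sets; a bare scalar becomes a one-element set.
    return set(x) if isinstance(x, list) else {x}


# Row "a" comes from the question; rows "b" and "c" are hypothetical.
df = pd.DataFrame({
    "city": ["a", "b", "c"],
    "time1": [[1991, 1992, 1993], [2001, 2002], 1999],
    "time2": [[1993, 1994, 1995], [2005, 2006], [1999, 2000]],
})

# Object arrays whose elements are Python sets.
time1_set = df["time1"].map(to_set).to_numpy()
time2_set = df["time2"].map(to_set).to_numpy()

# & intersects the sets pairwise; astype(bool) turns empty sets into False.
df["overlap"] = (time1_set & time2_set).astype(bool)

print(df[["city", "overlap"]])
```

Because the arrays have object dtype, the `&` still calls Python's set intersection once per row; the gain over `apply(axis=1)` comes from avoiding the per-row Series construction rather than from true vectorization.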
Ted Petrou answered Dec 01 '25 22:12