I have a DataFrame in which one column has lists as entries. For a given value x I want to get a pd.Series of booleans telling me whether x is in each list. For example, given the DataFrame
index lists
0 []
1 [1, 2]
2 [1]
3 [3, 4]
I want to do something like df.lists.contains(1) and get back False, True, True, False.
I am aware I can do this with a Python loop or comprehension, but I would ideally like a Pandas solution analogous to df.mod, df.isin etc.
In [79]: df['lists'].apply(lambda c: 1 in c)
Out[79]:
0 False
1 True
2 True
3 False
Name: lists, dtype: bool
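To reproduce this from scratch, here is a minimal sketch that rebuilds the example DataFrame from the question and applies the same membership test. The column name lists comes from the question; the variable names x and mask are just my own choices for illustration.

```python
import pandas as pd

# Rebuild the example DataFrame from the question: one column of lists.
df = pd.DataFrame({"lists": [[], [1, 2], [1], [3, 4]]})

x = 1  # the value to look for in each list

# apply() calls the lambda once per row, so this is still a Python-level loop.
mask = df["lists"].apply(lambda c: x in c)

print(mask.tolist())  # [False, True, True, False]
```

Note that apply here is a convenience rather than a vectorized operation: the `in` test runs in Python for every row.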
PS: I think a list comprehension might be faster in this case.
Timing for a 40,000-row DataFrame:
In [81]: df = pd.concat([df] * 10**4, ignore_index=True)
In [82]: df.shape
Out[82]: (40000, 2)
In [83]: %timeit df['lists'].apply(lambda c: 1 in c)
22.5 ms ± 87.8 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [84]: %timeit [1 in x for x in df['lists']]
4.87 ms ± 25.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
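The list comprehension returns a plain Python list rather than a pd.Series. If you need a Series aligned with the original index (for example, to use it as a boolean mask), one way is to wrap the result yourself. This is only a sketch: lists_contain is a hypothetical helper name of mine, not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({"lists": [[], [1, 2], [1], [3, 4]]})


def lists_contain(series, x):
    """Hypothetical helper: True where x is in the list stored at each row."""
    # Run the membership test in a plain comprehension, then reattach
    # the original index and name so the result aligns with the DataFrame.
    return pd.Series(
        [x in lst for lst in series],
        index=series.index,
        name=series.name,
    )


mask = lists_contain(df["lists"], 1)

# The result can be used directly as a boolean mask to filter rows.
print(df[mask])
```

Keeping the original index matters if the DataFrame has been filtered or reindexed; a bare list would only line up by position.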