In pandas, I'd like to create a computed column that's a boolean operation on two other columns.
It's easy to add two numerical columns together, and I'd like to do something similar with the logical operator AND. Here's my first try:
    In [1]: d = pandas.DataFrame([{'foo':True, 'bar':True}, {'foo':True, 'bar':False}, {'foo':False, 'bar':False}])

    In [2]: d
    Out[2]:
         bar    foo
    0   True   True
    1  False   True
    2  False  False

    In [3]: d.bar and d.foo   ## can't
    ...
    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

So I guess logical operators don't work quite the same way as numeric operators in pandas. I tried doing what the error message suggests and using bool():
    In [258]: d.bar.bool() and d.foo.bool()  ## spoiler: this doesn't work either
    ...
    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I found a way that works by casting the boolean columns to int, adding them together and evaluating as a boolean.
    In [4]: (d.bar.apply(int) + d.foo.apply(int)) > 0  ## Logical OR
    Out[4]:
    0     True
    1     True
    2    False
    dtype: bool

    In [5]: (d.bar.apply(int) + d.foo.apply(int)) > 1  ## Logical AND
    Out[5]:
    0     True
    1    False
    2    False
    dtype: bool

This is convoluted. Is there a better way?
The operators are: | for or, & for and, and ~ for not. These must be grouped by using parentheses, because & binds more tightly than the comparison operators: by default Python will evaluate an expression such as df.A > 2 & df.B < 3 as df.A > (2 & df.B) < 3.
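As a minimal sketch of why the parentheses matter (the frame `df` and its columns `A` and `B` are made up here purely for illustration):

```python
import pandas as pd

# Hypothetical frame, used only to illustrate operator precedence.
df = pd.DataFrame({"A": [1, 3, 5], "B": [1, 2, 4]})

# Parenthesised: each comparison yields a boolean Series, then & combines them.
mask = (df["A"] > 2) & (df["B"] < 3)

# Unparenthesised, the expression parses as df["A"] > (2 & df["B"]) < 3,
# a chained comparison that needs the truth value of a Series and so raises.
try:
    df["A"] > 2 & df["B"] < 3
    raised = False
except ValueError:
    raised = True
```

With the parentheses, `mask` is an ordinary boolean Series that can be used for filtering or stored as a column.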
To get all combinations of columns, use itertools.product. This function computes the Cartesian product of the input iterables. To compute the product of an iterable with itself, use the optional repeat keyword argument to specify the number of repetitions.
Grouping by Multiple Columns: You can do this by passing a list of column names to groupby instead of a single string value.
Yes, there is a better way! Just use &, the element-wise logical AND operator:
    d.bar & d.foo

    0     True
    1    False
    2    False
    dtype: bool

Alternatively, you could just multiply the columns for AND or add them for OR, without the conversion and extra comparison you had done.
AND operation:

    d.foo * d.bar

OR operation:

    d.foo + d.bar
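To tie this back to the original goal of a computed column, here is a small sketch using the element-wise operators on the question's data. The new column names (`both`, `either`, `not_foo`) are illustrative assumptions, not anything from the question.

```python
import pandas as pd

# Same data as in the question.
d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

# Element-wise operators return a boolean Series aligned on the index,
# so the result can be assigned directly as a new column.
d["both"] = d["foo"] & d["bar"]      # logical AND
d["either"] = d["foo"] | d["bar"]    # logical OR
d["not_foo"] = ~d["foo"]             # logical NOT
```

Compared with multiplying or adding, `&` and `|` state the intent directly and keep the expression readable when it is combined with comparisons.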