
boolean indexing with loc returns NaN

import pandas as pd
numbers = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

df = pd.DataFrame(numbers)
condition = df.loc[:, 1:2] < 4
df[condition]
    0   1   2
0   NaN 2.0 3.0
1   NaN NaN NaN
2   NaN NaN NaN

Why am I getting these wrong results, and what can I do to get the correct results?

newbieeyo asked Dec 16 '25 13:12


1 Answer

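To filter rows, the boolean condition has to be a Series, but here the selected columns return a DataFrame. Indexing with a boolean DataFrame masks element-wise instead of dropping rows, and column 0 is not part of the mask, so it becomes NaN everywhere: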

print(condition)
       1      2
0   True   True
1  False  False
2  False  False
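
To see where the NaN values come from, it can help to compare `df[condition]` with `DataFrame.where`. The following is a minimal sketch, assuming a reasonably recent pandas version; the name `masked` is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
condition = df.loc[:, 1:2] < 4  # boolean DataFrame with columns 1 and 2 only

# Indexing with a boolean DataFrame keeps the shape and masks element-wise,
# which matches what DataFrame.where does with the same mask.
masked = df[condition]
print(masked.equals(df.where(condition)))

# Column 0 is absent from the mask, so every value in it is masked out.
print(masked[0].isna().all())
```

No rows are removed in either case; only a one-dimensional boolean mask selects rows.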

So to convert the boolean DataFrame to a row mask, use DataFrame.all to test whether all values in each row are True, or DataFrame.any to test whether at least one value in each row is True:

print(condition.any(axis=1))
0     True
1    False
2    False
dtype: bool

print(condition.all(axis=1))
0     True
1    False
2    False
dtype: bool
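
With the threshold `4`, `any` and `all` happen to produce the same mask on this data. As an illustration only, the sketch below uses a different threshold (`6`, an assumption not taken from the question) so that the two reductions differ; `any_mask` and `all_mask` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
condition = df.loc[:, 1:2] < 6  # hypothetical threshold, chosen so the masks differ

any_mask = condition.any(axis=1)  # True where at least one selected column is < 6
all_mask = condition.all(axis=1)  # True where every selected column is < 6

print(df[any_mask].index.tolist())  # rows kept by the "any" mask
print(df[all_mask].index.tolist())  # rows kept by the "all" mask
```

Row 1 holds 5 and 6 in the selected columns, so it passes the `any` test but not the `all` test.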

Or select only one column for the condition, which already gives a Series:

print(df.loc[:, 1] < 4)
0     True
1    False
2    False
Name: 1, dtype: bool

Then pass the Series mask to filter the rows:

print(df[condition.any(axis=1)])
   0  1  2
0  1  2  3

jezrael answered Dec 19 '25 05:12