
When to use iloc and loc for boolean indexing

I'm a bit confused about using a boolean Series to index a pandas DataFrame. Should I use iloc or loc, or is there a better solution? For example:

import numpy as np
import pandas as pd
t1 = pd.DataFrame(np.ones([3, 4]))
t1.iloc[1:3, 0] = 3

This line gives the correct answer:

t1.loc[:,(t1>2).any()]

but the same line with iloc raises an error:

t1.iloc[:,(t1>2).any()]

I checked https://pandas.pydata.org/pandas-docs/stable/indexing.html, and the page says both iloc and loc accept a boolean array. Why does iloc not work in my example? When should I use iloc and when loc, and are there any better alternatives?

AAA asked Sep 18 '25 17:09


1 Answer

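The nuance is that iloc requires a Boolean array, while loc works with either a Boolean Series or a Boolean array. Here (t1>2).any() returns a Boolean Series indexed by column label: loc aligns that Series on the labels, whereas iloc is purely positional and rejects a mask that carries its own index. The documentation is technically correct in stating that a Boolean array works in either case.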

So, for iloc, extracting the NumPy Boolean array via pd.Series.values will work:

t1.iloc[:, (t1>2).any().values]
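
To make the difference concrete, here is a minimal sketch built on the question's own DataFrame. The variable names (mask, by_label, by_position, iloc_error) are illustrative only, and Series.to_numpy() is used here as a substitute for .values; it should return the same Boolean array. The exact exception iloc raises is not asserted, since it may depend on the index type and the pandas version.

```python
import numpy as np
import pandas as pd

# Same setup as in the question.
t1 = pd.DataFrame(np.ones([3, 4]))
t1.iloc[1:3, 0] = 3

# Boolean Series, indexed by column label (0, 1, 2, 3).
mask = (t1 > 2).any()

# loc accepts the Series and aligns it on the column labels.
by_label = t1.loc[:, mask]

# iloc is positional and refuses a mask that carries an index.
iloc_error = None
try:
    t1.iloc[:, mask]
except Exception as exc:  # exception type may vary; kept broad on purpose
    iloc_error = type(exc).__name__

# Stripping the index (to_numpy() or .values) makes the mask positional.
by_position = t1.iloc[:, mask.to_numpy()]

print(iloc_error)
print(by_position.equals(by_label))
```

If the mask is always built from the same DataFrame it is applied to, loc with the Series is probably the simpler choice; converting to an array matters mainly when the code has to stay positional.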

jpp answered Sep 20 '25 06:09