Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why can't I filter the second index of a multi-index dataframe?

Tags:

python

pandas

I know similar questions have been asked but I can't seem to find a solution to this one.

With the following code I can filter out using the columns and the first index but not the second.

import pandas as pd
import numpy as np
ix = pd.MultiIndex.from_product([ ['foo', 'bar'], ['baz', 
'can']], names=['a', 'b'])
data = np.arange(len(ix))
df = pd.DataFrame(data, index=ix, columns=['values'])
df['values2']=[1,4,5,6]
print(df)

the resulting output is as follows:

enter image description here

Notice how the last line, does not work

df.loc['foo','can']['values2']   # works
df.loc['foo']['values2']         # works
df.loc['foo','can'][:]           # works
df.loc['foo',:][:]               # works
df.loc[:,'can'][:]               # does not work. 
like image 935
Chris L Avatar asked Oct 26 '25 04:10

Chris L


1 Answers

Use slicers for more complicated selections:

idx = pd.IndexSlice

print (df.loc[idx['foo', 'can'], 'values'])
1

print (df.loc[idx['foo'], 'values'])
b
baz    0
can    1
Name: values, dtype: int32

print (df.loc[idx['foo',:], 'values'])
a    b  
foo  baz    0
     can    1
Name: values, dtype: int32

print (df.loc[idx['foo','can'], :])
values     1
values2    4
Name: (foo, can), dtype: int64

print (df.loc[idx['foo',:], :])
         values  values2
a   b                   
foo baz       0        1
    can       1        4

print (df.loc[idx[:, 'can'], :])
         values  values2
a   b                   
foo can       1        4
bar can       3        6
like image 54
jezrael Avatar answered Oct 28 '25 20:10

jezrael