I have a data frame with 4 columns (a, b, c, d are column names):
df =
a b c d
1 2 3 4
5 2 7 8
Is it possible to use df.pivot() to get 2 columns into the column multiindex? The following doesn't work:
df.pivot('a', ['b', 'c'])
I want
b    2
c    3    7
a
1    4   NA
5   NA    8
I know I can use pivot_table to get this done easily (pd.pivot_table(df, index='a', columns=['b', 'c'])) but I'm curious about the flexibility of pivot as the documentation isn't clear.
There are obviously missing bits of implementation, and I think you've found one. We have workarounds, but you are correct: the documentation says that the columns parameter can be an object, yet nothing seems to work. I trust @MaxU and @jezrael gave it a good try, and none of us seems able to get it to work the way the documentation says it should. I call it a bug! I may report it if someone else hasn't already or doesn't before I get to it.
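For reference, one such workaround might look like the sketch below. It assumes the frame is built exactly as in the question and that d is the only value column. The idea is to move a, b and c into the index and then unstack b and c into the columns; the names df and wide are just illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (assumed to match the OP's data).
df = pd.DataFrame({"a": [1, 5], "b": [2, 2], "c": [3, 7], "d": [4, 8]})

# Put a, b, c in the index, keep d as the values,
# then move the b and c levels from the index into the columns.
wide = df.set_index(["a", "b", "c"])["d"].unstack(["b", "c"])

print(wide)
```

Missing combinations come out as NaN, so the values are upcast to float. Like pivot, this raises if the same (a, b, c) combination appears more than once.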
That said, I found this, which is bizarre. I planned on passing a list to the index parameter instead and then transpose. But instead, the strings 'c' and 'b' are used as index values... that isn't at all what I wanted.
What's stranger is this
df.pivot(['c', 'b'], 'a', 'd')
a    1    5
b  NaN  8.0
c  4.0  NaN
Also, this looks fine:
df.pivot('a', 'b', 'd')
b 2
a
1 4
5 8
But the error here is confusing
print(df.pivot('a', ['b'], 'd'))
KeyError: 'Level b not found'
The quest continues...
OP's Own Answer
disregard
Using pivot_table
df.pivot_table(values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')
df.pivot_table('d', 'a', ['b', 'c'])
b    2
c    3    7
a
1  4.0  NaN
5  NaN  8.0
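A self-contained version of the pivot_table call, as a sketch that assumes the two-row frame from the question. Keyword arguments are used here instead of the positional ones above purely for readability; the variable name result is illustrative.

```python
import pandas as pd

# The example frame from the question.
df = pd.DataFrame({"a": [1, 5], "b": [2, 2], "c": [3, 7], "d": [4, 8]})

# index -> row labels, columns -> the two levels of the column MultiIndex,
# values -> the column whose entries fill the table.
result = df.pivot_table(values="d", index="a", columns=["b", "c"])

print(result)
```

Keep in mind that pivot_table aggregates (aggfunc='mean' by default), so duplicate (a, b, c) combinations would be silently averaged, whereas pivot raises an error on duplicates.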