I am trying to run analysis on a merged dataframe. Currently, my data is something like this:
data2 = pd.DataFrame({'county':['A', 'B', 'C', 'D'],
'1990pop': [100, 200, 150, 400],
'1990price': [35, 20, 10, 60],
'1991pop': [110, 210, 160, 410],
'1991price': [7, 9, 8, 15]})
data2.set_index('county')
I want to create a MultiIndex so for for each row, you see the county, and then each county has an entry for a 'population' and a 'price' variable, and only one column for each for each year 1990 and 1991.
Here's what I'd like the dataframe to look like
I've tried various merging and MultiIndex guides but this is a bit beyond my ability. I'd greatly appreciate any help!
Is this what you need ? wide_to_long + unstack + stack
yourdf=pd.wide_to_long(data2,['1990','1991'],i=['county'],j='year',suffix='\w+').unstack().stack(1)
yourdf
Out[57]:
1990 1991
county year
A pop 100 110
price 35 7
B pop 200 210
price 20 9
C pop 150 160
price 10 8
D pop 400 410
price 60 15
columns.str.extract with regexdata2.set_axis(
[*data2.columns.str.extract('(\d{4})(.*)').values.T],
1, inplace=False
).stack().rename_axis(['country', 'type'])
1990 1991
country type
A pop 100 110
price 35 7
B pop 200 210
price 20 9
C pop 150 160
price 10 8
D pop 400 410
price 60 15
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With