Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pivot a DataFrameGroupBy panadas object

I have a DataFrameGroupBy object called 'grouped' that looks like this:

for key, item in grouped:
    print('key: {0}, value: {1}'.format(key, grouped.get_group(key)))

key: 9909, value:                    date  quantity
0  2018-01-28 00:00:00+00:00       2.3
1  2018-01-29 00:00:00+00:00       3.0
key: 1151, value:                  date_period  quantity
2 2018-01-28 00:00:00+00:00       5.0
3 2018-01-29 00:00:00+00:00       9.0

I am trying to turn this into a dataframe that looks like this:

id,   day1,   day2
9909  2.3     3.0
1151  5.5     9.0

When I run the following:

item_df = grouped.to_frame()

I get the following error:

'DataFrameGroupBy' object has no attribute 'to_frame'

When I run the following

df = grouped.reset_index(inplace=True) 

I get the following:

Cannot access callable attribute 'reset_index' of 'DataFrameGroupBy' objects, try using the 'apply' method.

The DataFrameGroupBy object that I am trying to reformat was created from a dataframe like this:

data = {'created_date': ['2018-01-22 12:40:03', '2018-01-22 13:40:03', '2018-01-23 15:00:05', '2018-01-26 14:30:04'], 
     'quantity': [11, 21, 23, 12], 'id': ['543', '543', '842', '543']}
    df = pd.DataFrame(data, columns = ['created_date', 'quantity' , 'id']) 
df.index = df['created_date']
df.index = pd.to_datetime(df.index)
g = df.groupby('id').resample('D')['quantity'].sum()
df = g.to_frame()
dates = pd.date_range(df.index.levels[1].min(), g.index.levels[1].max()) 
idx = pd.MultiIndex.from_product([df.index.levels[0], dates])
df= df.reindex(idx, fill_value=0)
df = df.fillna(0)
df.reset_index(inplace=True) 
df.rename(columns={'level_0': 'id', 'level_1': 'date'}, inplace=True)
grouped = df.groupby('id')
#now to pivot this groupby object...
like image 402
proximacentauri Avatar asked Jan 25 '26 23:01

proximacentauri


1 Answers

It is possible by apply:

def f(x):
    return (x.pivot('id','date','quantity'))

grouped = df.groupby('id', group_keys=False).apply(f)
print (grouped)
date  2018-01-22  2018-01-23  2018-01-24  2018-01-25  2018-01-26
id                                                              
543         32.0         0.0         0.0         0.0        12.0
842          0.0        23.0         0.0         0.0         0.0

but it is same output as:

print (df.pivot('id','date','quantity'))
date  2018-01-22  2018-01-23  2018-01-24  2018-01-25  2018-01-26
id                                                              
543         32.0         0.0         0.0         0.0        12.0
842          0.0        23.0         0.0         0.0         0.0
like image 177
jezrael Avatar answered Jan 27 '26 11:01

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!