I have a dataset, df, that I would like to group by two columns, taking the sum and count of another column and listing the strings from a separate column.
Data
id  date    pwr type
aa  q321    10  hey
aa  q321    1   hello
aa  q425    20  hi
aa  q425    20  no
bb  q122    2   ok
bb  q122    1   cool
bb  q422    5   sure
bb  q422    5   sure
bb  q422    5   ok
Desired
id  date    pwr count   type
aa  q321    11  2       hey
                        hello
aa  q425    40  2       hi
                        no
bb  q122    3   2       ok
                        cool
bb  q422    15  3       sure
                        sure
                        ok
Doing
g = df.groupby(['id', 'date'])['pwr'].sum().reset_index()
g['count'] = g['id'].map(df['id'].value_counts())
This works OK, except that I am not sure how to display the string output of the column 'type'. Any suggestion is appreciated.
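One thing to watch in the attempt above: df['id'].value_counts() counts rows per id only, so it would not give the per-group counts shown in the desired table. The sketch below rebuilds the sample data and counts rows per (id, date) pair instead. The names sizes and per_id are hypothetical helpers introduced here for illustration, and this is only one possible way to get the count.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id":   ["aa", "aa", "aa", "aa", "bb", "bb", "bb", "bb", "bb"],
    "date": ["q321", "q321", "q425", "q425", "q122", "q122", "q422", "q422", "q422"],
    "pwr":  [10, 1, 20, 20, 2, 1, 5, 5, 5],
    "type": ["hey", "hello", "hi", "no", "ok", "cool", "sure", "sure", "ok"],
})

# Sum of pwr per (id, date), as in the question
g = df.groupby(["id", "date"])["pwr"].sum().reset_index()

# Row count per (id, date); `sizes` is a hypothetical helper name
sizes = df.groupby(["id", "date"]).size().reset_index(name="count")
g = g.merge(sizes, on=["id", "date"])

# For comparison: the mapping used in the question counts per id only
per_id = g["id"].map(df["id"].value_counts())

print(g)
print(per_id.tolist())
```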
You can use GroupBy.transform() to compute the pwr sum and the row count for each (id, date) group while keeping one row per original record. Then call .set_index() on the four columns other than type to get a layout similar to the desired output:
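# Replace pwr with the group total, broadcast back to every row of the group
df['pwr'] = df.groupby(['id', 'date'])['pwr'].transform('sum')
# Number of rows in each (id, date) group
df['count'] = df.groupby(['id', 'date'])['pwr'].transform('count')
# Move everything except type into the index so repeated labels are collapsed on display
df.set_index(['id', 'date', 'pwr', 'count'])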
Output:
                    type
id date pwr count       
aa q321 11  2        hey
            2      hello
   q425 40  2         hi
            2         no
bb q122 3   2         ok
            2       cool
   q422 15  3       sure
            3       sure
            3         ok
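If one row per group is preferred over the indexed layout, a different technique is named aggregation, collecting the type strings into a list. This is a minimal sketch rather than the answer's method. It assumes the sample data from the question, and the name summary is hypothetical.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id":   ["aa", "aa", "aa", "aa", "bb", "bb", "bb", "bb", "bb"],
    "date": ["q321", "q321", "q425", "q425", "q122", "q122", "q422", "q422", "q422"],
    "pwr":  [10, 1, 20, 20, 2, 1, 5, 5, 5],
    "type": ["hey", "hello", "hi", "no", "ok", "cool", "sure", "sure", "ok"],
})

# One row per (id, date): total pwr, number of rows, and all type strings as a list
summary = (
    df.groupby(["id", "date"])
      .agg(pwr=("pwr", "sum"), count=("pwr", "count"), type=("type", list))
      .reset_index()
)

print(summary)
```

Unlike the transform() approach, this leaves df unchanged and keeps the strings of each group together in a single cell.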