
How to calculate average weekly spend with groupby, with week being Monday to Sunday?

Tags: python, pandas

I have a customer dataframe with purchase amounts and dates. In this example there are two customers, A and B:

import pandas as pd

df1 = pd.DataFrame(index=pd.date_range('2015-04-24', periods=50)).assign(purchase=list(range(51, 101)))
df2 = pd.DataFrame(index=pd.date_range('2015-04-28', periods=50)).assign(purchase=list(range(0, 50)))

df3 = pd.concat([df1,df2], keys=['A','B'])

df3 = df3.rename_axis(['user','date']).reset_index()
print(df3.head())

  user       date  purchase
0    A 2015-04-24        51
1    A 2015-04-25        52
2    A 2015-04-26        53
3    A 2015-04-27        54
4    A 2015-04-28        55

I would like to know each user's mean weekly spend, with a week running from Monday to Sunday. Expected output format (the numbers are placeholders):

  user       average_weekly_spend 
0    A       51
1    B       60

However, I can't figure out how to make the week run Monday to Sunday. For now I am using resample with 7D. I believe this counts 7-day windows starting from each customer's first purchase, so every customer ends up with a different start date and therefore a different definition of a week.

df3.groupby('user').resample('7D', on='date')['purchase'].mean().groupby(level='user').mean()


user
A    78.125
B    27.125
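
The suspicion about 7D can be checked directly. The following is a minimal sketch, assuming the same df3 as above and resampling each user separately; the `first_bin` dictionary is a name introduced here for illustration only. It prints the label of each user's first 7-day bin.

```python
import pandas as pd

# Rebuild the example data from the question.
df1 = pd.DataFrame(index=pd.date_range('2015-04-24', periods=50)).assign(purchase=list(range(51, 101)))
df2 = pd.DataFrame(index=pd.date_range('2015-04-28', periods=50)).assign(purchase=list(range(0, 50)))
df3 = pd.concat([df1, df2], keys=['A', 'B']).rename_axis(['user', 'date']).reset_index()

# first_bin is a hypothetical helper dict: user -> label of the first 7D bin.
first_bin = {}
for user, grp in df3.groupby('user'):
    weekly = grp.resample('7D', on='date')['purchase'].mean()
    first_bin[user] = weekly.index[0]
    print(user, weekly.index[0].date(), weekly.index[0].day_name())
# A 2015-04-24 Friday
# B 2015-04-28 Tuesday
```

Each user's 7-day windows start on that user's first purchase date (a Friday for A, a Tuesday for B), so the bins are not aligned to calendar weeks.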

Is it possible to define my own week as Monday to Sunday, for all customers?

SCool asked Dec 29 '25 22:12


2 Answers

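You can resample with an anchored weekly frequency. The code below uses W-Mon, whose weeks end on Monday (so each week runs Tuesday to Monday):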

df = (df3.groupby('user')
         .resample('W-Mon', on='date')['purchase']
         .mean()
         .groupby(level=0)   # .mean(level=0) was removed in pandas 2.0
         .mean()
         .reset_index())
print(df)
  user  purchase
0    A      75.5
1    B      28.7
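
For weeks that run Monday to Sunday, as the question asks, the anchor would be W-SUN (weeks ending on Sunday) rather than W-MON. The following is a sketch under that assumption, using pd.Grouper in place of groupby().resample(); the names `weekly` and `result` are introduced here for illustration.

```python
import pandas as pd

# Rebuild the example data from the question.
df1 = pd.DataFrame(index=pd.date_range('2015-04-24', periods=50)).assign(purchase=list(range(51, 101)))
df2 = pd.DataFrame(index=pd.date_range('2015-04-28', periods=50)).assign(purchase=list(range(0, 50)))
df3 = pd.concat([df1, df2], keys=['A', 'B']).rename_axis(['user', 'date']).reset_index()

# 'W-SUN' bins end on Sunday, so each bin covers Monday through Sunday.
weekly = df3.groupby(['user', pd.Grouper(key='date', freq='W-SUN')])['purchase'].mean()

# Mean of the weekly means, per user.
result = weekly.groupby(level='user').mean()
print(result)
# user
# A    74.625
# B    26.250
```

Because these bins are tied to the calendar rather than to each user's first purchase, both users share the same week boundaries. Partial weeks at the start and end of each user's history still count as full weeks in the mean of means.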

I'm not sure a mean of weekly means is the right statistic here. Alternatively, you can get the counts and sums with resample and then compute the mean by definition, as the total sum divided by the total count:

df = (df3.groupby('user')
         .resample('W-Mon', on='date')['purchase']
         .agg(['size','sum'])
         .groupby(level=0)   # .sum(level=0) was removed in pandas 2.0
         .sum())
df['mean'] = df.pop('sum') / df.pop('size')
print(df)
      mean
user      
A     75.5
B     24.5
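
One thing worth noting about the sum-over-count approach: once the weekly sums and counts are added back up per user, the week boundaries cancel out, so the result is each user's overall mean purchase. A small sketch checking this on the example data (the name `overall` is introduced here for illustration):

```python
import pandas as pd

# Rebuild the example data from the question.
df1 = pd.DataFrame(index=pd.date_range('2015-04-24', periods=50)).assign(purchase=list(range(51, 101)))
df2 = pd.DataFrame(index=pd.date_range('2015-04-28', periods=50)).assign(purchase=list(range(0, 50)))
df3 = pd.concat([df1, df2], keys=['A', 'B']).rename_axis(['user', 'date']).reset_index()

# Plain per-user mean, with no weekly binning at all.
overall = df3.groupby('user')['purchase'].mean()
print(overall)
# user
# A    75.5
# B    24.5
```

This matches the values above, which suggests the figure is a mean per purchase record rather than a mean per week.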
jezrael answered Dec 31 '25 10:12


Another solution, using to_period, interestingly gives a different answer for user B:

df3.groupby(['user', df3['date'].dt.to_period('W-MON')])[['purchase']].mean().groupby(level='user').mean()

Output:

      purchase
user          
A       75.500
B       27.125
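
To see which days a weekly period actually covers, you can inspect its start and end. This is a minimal sketch using an arbitrary date inside the sample range; the variable names are introduced here for illustration.

```python
import pandas as pd

d = pd.Timestamp('2015-04-29')  # a Wednesday

p_mon = d.to_period('W-MON')  # week anchored to end on Monday
p_sun = d.to_period('W-SUN')  # week anchored to end on Sunday

print(p_mon.start_time.day_name(), p_mon.end_time.day_name())  # Tuesday Monday
print(p_sun.start_time.day_name(), p_sun.end_time.day_name())  # Monday Sunday
```

So W-MON periods run Tuesday to Monday, while W-SUN periods run Monday to Sunday.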
Quang Hoang answered Dec 31 '25 11:12


