Emulate Excel AverageIFs with Pandas

Question

I am trying to emulate Excel's AVERAGEIFs function in Pandas on a date range, and so far have been unsuccessful. I understand that I need to use apply and groupby, but I obviously don't have the syntax correct as I receive this error:

TypeError: can only concatenate list (not "Timedelta") to list

I am using test data at the moment while I try to figure out the syntax, and the data is below:

enter image description here

For each 'Avg' column I am trying to return the average quantity for the previous 180 days as grouped by 'A' and 'B'. Therefore, I would expect the 'Avg' column in row 1 to be 1.5 ( (1+2)/2, while leaving out row 5 which is a match but is > 180 days prior).

Here is the code I have at present, which does not work: import pandas as pd

#Importing the dataset
df = pd.read_excel('Test.xlsx', sheet_name='Sheet1')

df = pd.concat([df, pd.DataFrame(columns=['Avg Qty'])], axis=1)
df['Avg Qty'] = df.apply(df.groupby([(['Date'] <= (['Date']+pd.Timedelta(-1, 
    unit='d')) >= (['Date']+pd.Timedelta(-180, unit='d'))), 'A', 'B']))['Qty'].mean()

print(df.head)

Any help would be greatly appreciated.

Scott Boston · Accepted Answer

IIUC, I think you want something like this:

df['Avg Qty'] = (df.groupby([pd.Grouper(freq='180D', key='Date'),'A','B'])['Qty']
                   .transform('mean'))

Output:

        Date       A         B  Qty  Cost  Avg Qty
0 2017-12-11  Cancer      Golf    1   100      1.5
1 2017-11-11  Cancer      Golf    2   200      1.5
2 2017-11-11  Cardio      Golf    2   300      2.0
3 2017-10-11  Cardio  Baseball    3   600      3.0
4 2017-04-11  Cancer      Golf    4   150      4.0
5 2016-01-01  Cancer  Football    5   200      5.0

Edit:

df = df.set_index('Date')
df.groupby(['A','B']).apply(lambda x: x.sort_index().rolling('180D')['Qty'].mean()).reset_index()\
   .merge(df.reset_index(), on=['Date','A','B'], suffixes=('_avg',''))

Output:

        A         B       Date  Qty_avg  Qty  Cost
0  Cancer  Football 2016-01-01      5.0    5   200
1  Cancer      Golf 2017-04-11      4.0    4   150
2  Cancer      Golf 2017-11-11      2.0    2   200
3  Cancer      Golf 2017-12-11      1.5    1   100
4  Cardio  Baseball 2017-10-11      3.0    3   600
5  Cardio      Golf 2017-11-11      2.0    2   300

Emulate Excel AverageIFs with Pandas

Tags:

python

pandas

pandas-groupby

battery514

1 Answers

Edit:

Scott Boston

Recent Activity

Donate For Us

Emulate Excel AverageIFs with Pandas

Tags:

python

pandas

pandas-groupby

battery514

1 Answers

Edit:

Scott Boston

Related questions

Recent Activity

Donate For Us