Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas - divide each row by a group average

I am trying so solve a somewhat simple task but it is not clear how to achieve it in pandas.

So I have a pandas dataframe which has a set of columns I am interested in. Set of columns is stored in factors list:

#get df
df = pd.read_sql(sql=sqlString, con = engine)

#shuffle
df = df.sample(frac=1, random_state=123).reset_index(drop=True)

#list of fields we want
factors = ['GRP_RANK', 'BK_YIELD', 'SALES_YIELD', 'EARNINGS_YIELD_LTM', 'CASHFLOW_YIELD', 'ROE', 'ROIC',
           'ROA', 'GROSS_MGN', '12MVT', '1MVT', 'BETA_3Y', 'BETA_1Y', 'P_TOTAL_RETURN(-1,0,USD)']

Now, there is column DATE in dataframe. For each of the factors for each record I want to divide the value of factor by the average of the factor value on particular date.

I managed to obtain the averages by day for each factor:

dfGroup = df[factors + ["DATE"]].groupby('DATE')[factors].mean()

But I am not sure how to proceed. Only thing which comes to my mind is get new big dataframe by left joining df and dfGroup by DATE field and then doing some ugle column by column division, but maybe there is a way to do it easier?

like image 532
Maksim Khaitovich Avatar asked Oct 23 '25 15:10

Maksim Khaitovich


1 Answers

Let's look at using groupby and transform with div:

MVCE:

df = pd.DataFrame({'Date':pd.date_range('2018-02-10','2018-02-12',freq='H'),'A':np.random.randint(0,100,49),'B':np.random.randint(100,200,49),'C':np.random.random(49)})

df = df.set_index('Date')

print(df.head())

Output:

                      A    B         C
Date                                  
2018-02-10 00:00:00  11  131  0.474226
2018-02-10 01:00:00  35  188  0.998742
2018-02-10 02:00:00  97  182  0.683685
2018-02-10 03:00:00   0  134  0.845094
2018-02-10 04:00:00  24  173  0.238379

Use groupby, transfrom and div:

df[['A','B','C']].div(df.groupby(df.index.floor('D')).transform('mean'))

Output head():

                        A         B         C
Date                                             
2018-02-10 00:00:00  0.362637  0.866593  0.931739
2018-02-10 01:00:00  1.153846  1.243660  1.962284
2018-02-10 02:00:00  3.197802  1.203969  1.343275
2018-02-10 03:00:00  0.000000  0.886439  1.660404
2018-02-10 04:00:00  0.791209  1.144432  0.468357
like image 60
Scott Boston Avatar answered Oct 25 '25 03:10

Scott Boston