
How do I use the pandas groupby function to apply a formula based on the groupby value?

My question may be a little confusing, so let me explain. I have a dataframe that I would like to group by the unique order id (ordrefno) to produce the following columns:

sum qty = the total quantity that was executed per order id.
csv = the sum of the csv column per order id divided by the total executed quantity for that order id.

The first column is easy to create with groupby; it's the second column that I am having issues with. Here is the sample data that I am working with:

    qty     sym     price   ordrefno    ord_bidprice    ord_askprice    csv
0   -25000  TEST    0.044   984842      0.0435          0.044          12.5
1   100     TEST    0.0443  984702      0.0435          0.044          0.03
2   -10000  TEST    0.0405  983375      0.039           0.0405         15
3   -100    TEST    0.0443  984842      0.0435          0.044          0.03

This is my code:

cs1 = lambda x: np.sum(test.csv / test.qty)
f2 = {'qty' : ['sum'], 'csv' : {'es' : cs1}}

agg_td = trades.groupby('ordrefno').agg(f2)

Chris asked Oct 25 '25 15:10


1 Answer

Writing a named function and using apply works:

def func(group):
    sum_ = group.qty.sum()
    es = (group.csv / group.qty).sum()
    return pd.Series([sum_, es], index=['qty', 'es'])

trades.groupby('ordrefno').apply(func)

Result:

            qty     es
ordrefno               
983375   -10000 -0.0015
984702      100  0.0003
984842   -25100 -0.0008
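
Note that func sums the per-row ratios csv / qty, whereas the question's wording (sum of csv divided by sum of qty) can also be read as a ratio of the group sums. The two differ for orders with more than one fill, such as 984842. As a minimal sketch, assuming the sample rows from the question and pandas 0.25 or later for named aggregation, both can be computed without apply. The column names ratio, csv_sum, es_sum_of_ratios and es_ratio_of_sums are made up for this example:

```python
import pandas as pd

# Sample rows from the question, limited to the columns the calculation uses.
trades = pd.DataFrame({
    "qty": [-25000, 100, -10000, -100],
    "ordrefno": [984842, 984702, 983375, 984842],
    "csv": [12.5, 0.03, 15, 0.03],
})

# Per-row ratio, computed before grouping ("ratio" is a helper column name chosen here).
with_ratio = trades.assign(ratio=trades["csv"] / trades["qty"])

# Named aggregation: one output column per (input column, function) pair.
totals = with_ratio.groupby("ordrefno").agg(
    qty=("qty", "sum"),
    csv_sum=("csv", "sum"),
    es_sum_of_ratios=("ratio", "sum"),
)

# Ratio of the group sums, the other possible reading of the question's wording.
totals["es_ratio_of_sums"] = totals["csv_sum"] / totals["qty"]

print(totals)
```

For order 984842 the sum of ratios is about -0.0008, matching the result above, while the ratio of sums is 12.53 / -25100, roughly -0.0005. Which one is wanted depends on how the csv formula is meant to be read.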

Mike Müller answered Oct 27 '25 04:10