Multi level pandas groupby

Question

I need to maintain the position of each scrip per team per account, so I think pandas groupby might be the right tool. I have also tried to describe the problem in the comment column of the dataframe below.

Initial df I have

             time account scrip  buy_price  sell_price  qty   team  
0  06/07/17 09:36      A1  FUT1       50.0         NaN    2  team1         
1  06/07/17 09:46      A2  FUT1      100.0         NaN    2  team1         
2  06/07/17 09:56      A3  FUT1       10.0         NaN    2  team2         
3  06/07/17 09:57      A3  FUT1        NaN        10.0    2  team2         
4  06/07/17 09:58      A1  FUT1        NaN        50.0    1  team1         
5  06/07/17 09:59      A3  FUT1        NaN        50.0    1  team2

I need to add a position column. In the version below I filled in position by hand, and the comment column explains how each value is calculated.

             time account scrip  buy_price  sell_price  qty   team  position                                    comment
0  06/07/17 09:36      A1  FUT1       50.0         NaN    2  team1         2  this can only be sold by team1 account A1
1  06/07/17 09:46      A2  FUT1      100.0         NaN    2  team1         2  this can only be sold by team1 account A2
2  06/07/17 09:56      A3  FUT1       10.0         NaN    2  team2         2  this can only be sold by team2 account A3
3  06/07/17 09:57      A3  FUT1        NaN        10.0    2  team2         0                 sold by team 2 in  acc A3 
4  06/07/17 09:58      A1  FUT1        NaN        50.0    1  team1         1                  sold by team 1 in acc A1 
5  06/07/17 09:59      A3  FUT1        NaN        50.0    1  team2        -1                   sold by team 2 in acc A3

The example above covers just one scrip, FUT1; in practice there would be many scrips. My end result would look something like this:

Team Account Scrip Position
team1 A1      FUT1 1 
      A2      FUT1 2
team2 A3      FUT1 -1

The end result can be worked on later, once position is calculated successfully. My approach: build a unique key in each row so I know when to add to or subtract from the position, e.g. A1_FUT1_team1 for the first row and A2_FUT1_team1 for the second, then add or subtract quantities for rows with matching keys. Is this a good approach?

Rayhane Mama · Accepted Answer

Your problem can be solved easily in two steps:

First Step:

import math
df['some_stuff'] = df.apply(lambda x: -x.qty if math.isnan(x.buy_price) else x.qty,axis=1)

This line creates a new column, some_stuff, holding a signed quantity: +qty for a buy and -qty for a sell (a row is treated as a sell when buy_price is NaN). It introduces the notion of gain and loss into your data.

If you don't want a new column and you like the idea, replace the qty column instead:

df['qty'] = df.apply(lambda x: -x.qty if math.isnan(x.buy_price) else x.qty,axis=1)
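
The same sign logic can also be written column-wise instead of row by row. The following is a minimal sketch, not part of the original answer: it uses numpy.where, assumes a row is a sell exactly when buy_price is NaN, and the column name signed_qty is chosen here purely for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (time column omitted for brevity).
df = pd.DataFrame({
    "account": ["A1", "A2", "A3", "A3", "A1", "A3"],
    "scrip": ["FUT1"] * 6,
    "buy_price": [50.0, 100.0, 10.0, np.nan, np.nan, np.nan],
    "sell_price": [np.nan, np.nan, np.nan, 10.0, 50.0, 50.0],
    "qty": [2, 2, 2, 2, 1, 1],
    "team": ["team1", "team1", "team2", "team2", "team1", "team2"],
})

# Assumption: a missing buy_price marks a sell.
# Buys keep +qty, sells become -qty, computed on whole columns without apply.
df["signed_qty"] = np.where(df["buy_price"].isna(), -df["qty"], df["qty"])

print(df[["account", "team", "qty", "signed_qty"]])
```

On large frames this usually runs faster than apply with axis=1, because it avoids calling a Python function once per row.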

Next, I use this new column to create your position column as follows:

df['position'] = df.groupby(['team','account','scrip'])['some_stuff'].cumsum()

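which generates the position column with the values 2, 2, 2, 0, 1, -1 for rows 0 to 5, matching the column you filled in by hand.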

Bonus: if you want to delete the extra column some_stuff, just use:

del df['some_stuff']
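
Putting the first step together, here is a minimal, self-contained sketch that rebuilds the sample frame from the question and computes position without keeping a helper column. It assumes the rows are already in chronological order (the running total follows row order) and that a missing buy_price marks a sell; the variable name signed is illustrative only.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "time": ["06/07/17 09:36", "06/07/17 09:46", "06/07/17 09:56",
             "06/07/17 09:57", "06/07/17 09:58", "06/07/17 09:59"],
    "account": ["A1", "A2", "A3", "A3", "A1", "A3"],
    "scrip": ["FUT1"] * 6,
    "buy_price": [50.0, 100.0, 10.0, np.nan, np.nan, np.nan],
    "sell_price": [np.nan, np.nan, np.nan, 10.0, 50.0, 50.0],
    "qty": [2, 2, 2, 2, 1, 1],
    "team": ["team1", "team1", "team2", "team2", "team1", "team2"],
})

# Signed quantity: keep qty where buy_price is present, otherwise negate it.
signed = df["qty"].where(df["buy_price"].notna(), -df["qty"])

# Running total within each (team, account, scrip) group, in row order.
df["position"] = signed.groupby([df["team"], df["account"], df["scrip"]]).cumsum()

print(df[["time", "account", "team", "qty", "position"]])
```

Because signed is a standalone Series rather than a column of df, there is nothing to delete afterwards.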

Second Step:

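This step produces the final grouped table. Note that min() takes the per-column minimum within each group, so position matches the latest value here only because every running position in this sample ends at its lowest point: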

print(df.groupby(['team', 'account', 'scrip']).min())

final output:

                               time  buy_price  sell_price  qty  position
team  account scrip                                                      
team1 A1      FUT1   06/07/17 09:36       50.0        50.0    1         1
      A2      FUT1   06/07/17 09:46      100.0         NaN    2         2
team2 A3      FUT1   06/07/17 09:56       10.0        10.0    1        -1
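
For data where a position can rise again after falling, taking the last row of each group may be more robust than taking the minimum. The following is a minimal sketch under that assumption, not the original answer's method: it uses the GroupBy last() method on the position column only, with the position values copied from the question's table, and the name summary is illustrative.

```python
import pandas as pd

# Grouping keys plus the running position from the first step
# (values copied from the question's hand-filled table).
df = pd.DataFrame({
    "team": ["team1", "team1", "team2", "team2", "team1", "team2"],
    "account": ["A1", "A2", "A3", "A3", "A1", "A3"],
    "scrip": ["FUT1"] * 6,
    "position": [2, 2, 2, 0, 1, -1],
})

# last() returns the final non-null value of each group in row order,
# which is the current position if rows are sorted chronologically.
summary = df.groupby(["team", "account", "scrip"])["position"].last()

print(summary)
```

The result is a Series indexed by team, account and scrip, which corresponds to the Team / Account / Scrip / Position layout asked for in the question.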

I believe this answers your questions.

Documentation:

pandas.DataFrame.apply

pandas.DataFrame.groupby

pandas.DataFrame.cumsum

pandas.DataFrame.min

Tags:

python

pandas

pandas-groupby

pythonRcpp
