Say I have the following dataframe:
import pandas as pd
df = pd.DataFrame()
df['A'] = ('1/05/2019','2/05/2019','3/05/2019','4/05/2019','5/05/2019','6/05/2019','7/05/2019','8/05/2019','9/05/2019','10/05/2019','11/05/2019','12/05/2019','13/05/2019','14/05/2019','15/05/2019','16/05/2019','17/05/2019','18/05/2019','19/05/2019','20/05/2019')
df['B'] = ('SIT','SCLOSE', 'SHODL', 'SHODL', 'SHODL', 'SHODL', 'SHODL', 'SELL','SIT','SIT','BCLOSE', 'BHODL', 'BHODL', 'BHODL', 'BHODL', 'BHODL', 'BHODL','BUY','SIT','SIT')
df['C'] = (0.00,1.00,10.00, 5.00,6.00,-6.00, 6.00, 0.00,0.00,0.00,-8.00,33.00,-15.00,6.00,-1.00,5.00,10.00,0.00,0.00,0.00)
df.loc[19, 'D'] = 100.0000
As can be seen I am starting column D with 100 at the last row.
I am trying to code a calculation for column D so starting from the bottom row (row 19) when a BUY or SELL is shown on column B then the number on column D is locked (eg the 100) and used for a calculation based on col C for each SHODL or BHODL until the row after a BCLOSE or an SCLOSE is shown.
The locked number is used to calculate a running balance based on the percentages that are in column C. As you can see on row 16 column C has '10' representing 10%. As 10% of 100 = 10 the new runnning balance is 110.
Row 15 column C has 5% as such 5 is added to the running balance to result in 115.
The next row 14 column C has a -1% change as such 1% of 100 is = 1 and therefore the new running balance is 114 and so on.
The following are the results that should be returned in col D of the dataframe once the right code is run
df['D'] = ('158.60','158.60', '157.30', '144.30', '137.80', '130.00', '137.80', '130.00','130.00','130.00','130.00', '138.00', '105.00', '120.00', '114.00', '115.00', '110.00','100.00','100.00','100.00')
This continues until after a SCLOSE or a BCLOSE is shown as a BCLOSE or SCLOSE row is the final row where the running balance is calculated.
As you can see this process is restarted when either a new BUY or SELL is shown.
Next starting value depends on the last value of previous group, so I think it can't be vectorized. It requires some kind of iterative process. I came up with solution doing iteratively on groups of groupby. Reverse df and assign to df1. Working on each group of df1 and assign the final list of groups to the original df
df1 = df[::-1]
s = df1.B.isin(['BCLOSE','SCLOSE']).shift(fill_value=False).cumsum()
grps = df1.groupby(s)
init_val= 100
l = []
for _, grp in grps:
s = grp.C * 0.01 * init_val
s.iloc[0] = init_val
s = s.cumsum()
init_val = s.iloc[-1]
l.append(s)
df['D'] = pd.concat(l)
Out[50]:
A B C D
0 1/05/2019 SIT 0.0 158.6
1 2/05/2019 SCLOSE 1.0 158.6
2 3/05/2019 SHODL 10.0 157.3
3 4/05/2019 SHODL 5.0 144.3
4 5/05/2019 SHODL 6.0 137.8
5 6/05/2019 SHODL -6.0 130.0
6 7/05/2019 SHODL 6.0 137.8
7 8/05/2019 SELL 0.0 130.0
8 9/05/2019 SIT 0.0 130.0
9 10/05/2019 SIT 0.0 130.0
10 11/05/2019 BCLOSE -8.0 130.0
11 12/05/2019 BHODL 33.0 138.0
12 13/05/2019 BHODL -15.0 105.0
13 14/05/2019 BHODL 6.0 120.0
14 15/05/2019 BHODL -1.0 114.0
15 16/05/2019 BHODL 5.0 115.0
16 17/05/2019 BHODL 10.0 110.0
17 18/05/2019 BUY 0.0 100.0
18 19/05/2019 SIT 0.0 100.0
19 20/05/2019 SIT 0.0 100.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With