I've been beating my head against a wall trying to figure out the best approach on this question. I have a SLOW backtester in Excel that I'm trying to port to pandas. I thought I'd be able to leverage the power of python rather than merely recreate the Excel functions in python, but I'm stuck!
The key challenge is in portfolio P&L calculation for multiple and dynamically changing instruments over time. For example, based on criteria, I want to buy instrument A and C for the first N periods. Then I want to use B and C for the next N periods. And so on. I want to buy proportional amounts of each instrument based on current equity (as opposed to buying 100 shares each time or something).
The Excel, row-based function calculates shares first based on initial equity, but then based on equity available "in the previous row". Something like: IF(RebalancePeriod = TRUE, EquityFromPrevRow / CurrentSharePrice, PreviousRowShares)
Attempting panda-ize it. Here are example inputs:
import datetime as dt
import pandas as pd
from pandas import *
dates = date_range('1/1/2000', periods=6)
index=dates
data = {'A': pd.Series([20, 30, 10, 0, 0, 0], index=index)
, 'B': pd.Series([0, 0, 0, 50, 51, 52], index=index)
, 'C': pd.Series([11, 12, 20, 18, 17, 19], index=index)}
initial_capital = 5000.0
prices = pd.DataFrame(data, index=dates)
That doesn't get you very far, but here are the desired outputs. In this example, I want to rebalance instruments by swapping from A to B at the 4th row, 2000-01-04, where A=0 thereafter.
Prices
A B C
2000-01-01 20 0 11
2000-01-02 30 0 12
2000-01-03 10 0 20
2000-01-04 0 50 18
2000-01-05 0 51 17
2000-01-06 0 52 19
Shares (initial or current equity / price)
A B C
2000-01-01 250.0 0.0 454.5
2000-01-02 250.0 0.0 454.5
2000-01-03 250.0 0.0 454.5
2000-01-04 0.0 115.9 322.0
2000-01-05 0.0 115.9 322.0
2000-01-06 0.0 115.9 322.0
Equity (shares * price)
A B C Total
2000-01-01 5,000.0 0.0 5,000.0 10,000.0
2000-01-02 7,500.0 0.0 5,454.5 12,954.5
2000-01-03 2,500.0 0.0 9,090.9 11,590.9
2000-01-04 0.0 5,795.5 5,795.5 11,590.9
2000-01-05 0.0 5,911.4 5,473.5 11,384.8
2000-01-06 0.0 6,027.3 6,117.4 12,144.7
I realize there's a lot in this one. I appreciate any help. Thank you.
Just follow the same idea that you do in the Excel, the only difference is that the excel function applied to each row is now expressed as a loop:
In [113]:
print prices
A B C
2000-01-01 20 0 11
2000-01-02 30 0 12
2000-01-03 10 0 20
2000-01-04 0 50 18
2000-01-05 0 51 17
2000-01-06 0 52 19
In [114]:
swap=pd.Series([False,]*len(dates),index=dates, name='sawp')
swap[3]=True
total=pd.DataFrame({'A':0, 'B':0, 'C':0},index=dates)
total.ix[0]=[5000,5000,5000]
shares=total/prices
shares['swap']=swap
In [115]:
#the loop
for idx in range(1, len(shares)):
if shares.ix[idx, 'swap']:
shares.ix[idx, ['A','B','C']]=(shares.ix[idx-1, ['A','B','C']]*prices.ix[idx-1]).sum()/2/prices.ix[idx]
else:
shares.ix[idx, ['A','B','C']]=shares.ix[idx-1, ['A','B','C']]
In [116]:
#Get rid of the infinite numbers
shares[shares==np.inf]=0
In [117]:
print shares
A B C swap
2000-01-01 250 0.000000 454.545455 False
2000-01-02 250 0.000000 454.545455 False
2000-01-03 250 0.000000 454.545455 False
2000-01-04 0 115.909091 321.969697 True
2000-01-05 0 115.909091 321.969697 False
2000-01-06 0 115.909091 321.969697 False
In [118]:
print shares*prices
A B C swap
2000-01-01 5000 0.000000 5000.000000 NaN
2000-01-02 7500 0.000000 5454.545455 NaN
2000-01-03 2500 0.000000 9090.909091 NaN
2000-01-04 0 5795.454545 5795.454545 NaN
2000-01-05 0 5911.363636 5473.484848 NaN
2000-01-06 0 6027.272727 6117.424242 NaN
In [119]:
#Total asserts
print (shares*prices).sum(1)
2000-01-01 10000.000000
2000-01-02 12954.545455
2000-01-03 11590.909091
2000-01-04 11590.909091
2000-01-05 11384.848485
2000-01-06 12144.696970
Freq: D, dtype: float64
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With