I have an indicator variable in my dataframe that takes on the values 1, 0, or -1. I'd like to create a new variable that replaces each 0 with the most recent nonzero value, so that a 1 or -1 is repeated until the indicator next changes to 1 or -1.
I tried various constructions using np.where, but I could not get any of them to work.
Here is the original dataframe:
import pandas as pd
df = pd.DataFrame(
{'Date': [1,2,3,4,5,6,7,8,9,10],
'Ind': [1,0,0,-1,0,0,0,1,0,0]})
df

I am hoping to get a dataframe that looks like the following:
df2 = pd.DataFrame(
{'Date': [1,2,3,4,5,6,7,8,9,10],
'Ind': [1,0,0,-1,0,0,0,1,0,0],
'NewVar':[1,1,1,-1,-1,-1,-1,1,1,1]})

Use mask to turn the zeros into NaN, then ffill to carry the last nonzero value forward:
df['Ind'].mask(df['Ind'] == 0).ffill()
0 1.0
1 1.0
2 1.0
3 -1.0
4 -1.0
5 -1.0
6 -1.0
7 1.0
8 1.0
9 1.0
Name: Ind, dtype: float64
The NaN step makes the result float64. To get integers back, pass downcast='infer':
df['Ind'].mask(df['Ind'] == 0).ffill(downcast='infer')
0 1
1 1
2 1
3 -1
4 -1
5 -1
6 -1
7 1
8 1
9 1
Name: Ind, dtype: int64
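For completeness, here is a minimal sketch that assigns the result back to the dataframe as NewVar. It casts with astype(int) instead of the downcast keyword shown above, since that keyword has been deprecated in recent pandas releases as far as I know. The cast assumes the first row of Ind is nonzero; a leading 0 would stay NaN and make the cast fail.

```python
import pandas as pd

df = pd.DataFrame({'Date': list(range(1, 11)),
                   'Ind': [1, 0, 0, -1, 0, 0, 0, 1, 0, 0]})

# Turn zeros into NaN, carry the last nonzero value forward, then cast back to int.
# Assumption: the first row is nonzero, so no NaN is left when astype(int) runs.
df['NewVar'] = df['Ind'].mask(df['Ind'] == 0).ffill().astype(int)

print(df['NewVar'].tolist())  # [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]
```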
Another option is groupby with transform, using a grouper built from cumsum. Each nonzero value starts a new group, and 'first' broadcasts that value over the whole group:
df.groupby(df['Ind'].ne(0).cumsum())['Ind'].transform('first')
0 1
1 1
2 1
3 -1
4 -1
5 -1
6 -1
7 1
8 1
9 1
Name: Ind, dtype: int64
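To see why the grouper works, it may help to print the intermediate run labels. This is only an illustrative sketch; the names ind, run_id and new_var are mine, not part of the answer above.

```python
import pandas as pd

ind = pd.Series([1, 0, 0, -1, 0, 0, 0, 1, 0, 0], name='Ind')

# Each nonzero entry starts a new run, so the cumulative sum labels the runs.
run_id = ind.ne(0).cumsum()
print(run_id.tolist())   # [1, 1, 1, 2, 2, 2, 2, 3, 3, 3]

# The first value of each run is the nonzero entry that started it.
new_var = ind.groupby(run_id).transform('first')
print(new_var.tolist())  # [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]
```

Note that if the series started with zeros, those rows would land in run 0 and keep the value 0, since there is no earlier nonzero value to repeat.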
Another option is to keep only the nonzero rows and reindex them back onto the full index with method='ffill':
df.Ind[df.Ind!=0].reindex(df.index,method='ffill')
0 1
1 1
2 1
3 -1
4 -1
5 -1
6 -1
7 1
8 1
9 1
Name: Ind, dtype: int64
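As a quick sanity check, the sketch below compares the reindex result against the NewVar column the question asks for and against the mask/ffill approach. The variable names are illustrative only. Keep in mind that reindex with method='ffill' needs a monotonic index, which holds for the default RangeIndex used here.

```python
import pandas as pd

df = pd.DataFrame({'Date': list(range(1, 11)),
                   'Ind': [1, 0, 0, -1, 0, 0, 0, 1, 0, 0]})
expected = [1, 1, 1, -1, -1, -1, -1, 1, 1, 1]

# Keep only the nonzero rows, then spread them back over the full index.
nonzero = df.loc[df['Ind'] != 0, 'Ind']
via_reindex = nonzero.reindex(df.index, method='ffill')

# Same result via mask + ffill (float64, but the values compare equal).
via_mask = df['Ind'].mask(df['Ind'] == 0).ffill()

print(via_reindex.tolist() == expected)   # True
print((via_reindex == via_mask).all())    # True
```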