I have the following column B that I would like to use to get columns C and D (where B + C = D):
B C D
1 0 1
2 0 2
3 0 3
0 3 3
0 3 3
0 3 3
1 3 4
2 3 5
3 3 6
0 6 6
1 6 7
Unfortunately, I'm having some trouble figuring out the best way to go about this.
I've tried using various combinations of .groupby(), rolling apply, .ffill() and .cumsum() to no avail.
I've been able to split up B using
x = (df['B'].ne(0).diff() & df['B'].shift().ne())
but otherwise I'm pretty stumped.
Any help would be greatly appreciated!
Assuming you really want the results, you can achieve your dataframe using this logic. You can go straight from B to D using cumsum on a boolean series:
df = pd.DataFrame({'B':[1,2,3,0,0,0,1,2,3,0,1]})
df['D'] = df['B'].ne(0).cumsum()
df['C'] = df['D'] - df['B']
df[['B', 'C', 'D']]
Output:
B C D
0 1 0 1
1 2 0 2
2 3 0 3
3 0 3 3
4 0 3 3
5 0 3 3
6 1 3 4
7 2 3 5
8 3 3 6
9 0 6 6
10 1 6 7
I'm assuming you want to "group" the dataframe if B is 0 or not. I don't think there's nice vectorized solution to this, but you can try (if performance is problem, you can optimize it via numba):
out_C, out_D = [], []
flag, current_value = None, 0
for B in df["B"]:
match [flag, fl := B > 0]:
case [None, _]:
flag = fl
case [True, False] | [False, True]:
current_value = out_D[-1]
flag = fl
out_C.append(current_value)
out_D.append(B + current_value)
df["new_C"] = out_C
df["new_D"] = out_D
print(df)
Prints:
B C D new_C new_D
0 1 0 1 0 1
1 2 0 2 0 2
2 3 0 3 0 3
3 0 3 3 3 3
4 0 3 3 3 3
5 0 3 3 3 3
6 1 3 4 3 4
7 2 3 5 3 5
8 3 3 6 3 6
9 0 6 6 6 6
10 1 6 7 6 7
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With