I have the following data frame:
import pandas as pd
d = pd.DataFrame({'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]})
d
   one  two
0    0    0
1    1    0
2    1    1
3    1    0
4    0    1
5    1    1
I want a cumulative sum for each column that resets to zero whenever a zero is encountered.
The desired output is:
pd.DataFrame({'one' : [0,1,2,3,0,1],'two' : [0,0,1,0,1,2]})
   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2
I have tried using groupby, but it does not work for the entire table.
df2 = d.apply(lambda x: x.groupby((~x.astype(bool)).cumsum()).cumsum())
print(df2)
Output:
   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2
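To see why the groupby one-liner resets at zeros, it may help to split it into steps for a single column. This is only an illustrative sketch; the names `col`, `is_zero`, `group_id` and `result` are introduced here and are not part of the original answer.

```python
import pandas as pd

d = pd.DataFrame({'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]})

col = d['one']
is_zero = ~col.astype(bool)              # True wherever the column holds a zero
group_id = is_zero.cumsum()              # increases at every zero, so each zero opens a new group
result = col.groupby(group_id).cumsum()  # cumulative sum restarted inside each group

print(group_id.tolist())
print(result.tolist())
```

Applying the same steps to every column is what `apply` with the lambda does.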
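pandas
def cum_reset_pd(df):
    csum = df.cumsum()
    # Subtract the running total recorded at the most recent zero;
    # fillna(0) covers rows before the first zero in a column.
    return (csum - csum.where(df == 0).ffill().fillna(0)).astype(df.dtypes)
cum_reset_pd(d)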
   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2
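The pandas approach can be read as "running total minus the running total at the last zero". A rough step-by-step sketch follows; the intermediate names (`at_zero`, `baseline`, `reset`) are made up for illustration, and the `fillna(0)` is an added assumption to cope with a column whose first value is nonzero.

```python
import pandas as pd

d = pd.DataFrame({'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]})

csum = d.cumsum()                     # plain running total, never reset
at_zero = csum.where(d == 0)          # running total kept only on zero rows, NaN elsewhere
baseline = at_zero.ffill().fillna(0)  # carry the total seen at the latest zero forward
reset = (csum - baseline).astype(int) # what has accumulated since that zero

print(baseline)
print(reset)
```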
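numpy
import numpy as np
def cum_reset_np(df):
    v = df.values
    z = np.zeros_like(v)
    # (column, row) positions of the nonzero entries, in column-major order
    j, i = np.where(v.T)
    r = np.arange(1, i.size + 1)
    # Positions where a new run starts: the row is not consecutive or the column changes
    p = np.where(
        np.append(False, (np.diff(i) != 1) | (np.diff(j) != 0))
    )[0]
    b = np.append(0, np.append(p, r.size))
    # Position within each run; this equals the cumulative sum only for 0/1 data
    z[i, j] = r - b[:-1].repeat(np.diff(b))
    return pd.DataFrame(z, df.index, df.columns)
cum_reset_np(d)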
   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2
Why go through the trouble of the numpy version?
Because it's quicker!
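If you want to check the speed claim on your own machine, something like the following could be used. It is only a sketch: the frame size, the random 0/1 data and the repeat count are arbitrary assumptions, the `fillna(0)` in the pandas function is an added safeguard, and actual timings will vary, so none are shown here.

```python
import timeit

import numpy as np
import pandas as pd


def cum_reset_pd(df):
    csum = df.cumsum()
    return (csum - csum.where(df == 0).ffill().fillna(0)).astype(df.dtypes)


def cum_reset_np(df):
    v = df.values
    z = np.zeros_like(v)
    j, i = np.where(v.T)
    r = np.arange(1, i.size + 1)
    p = np.where(
        np.append(False, (np.diff(i) != 1) | (np.diff(j) != 0))
    )[0]
    b = np.append(0, np.append(p, r.size))
    z[i, j] = r - b[:-1].repeat(np.diff(b))
    return pd.DataFrame(z, df.index, df.columns)


# Hypothetical test data: 50,000 rows by 10 columns of random zeros and ones
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 2, size=(50_000, 10)))

res_pd = cum_reset_pd(big)
res_np = cum_reset_np(big)
print("same result:", bool((res_pd.values == res_np.values).all()))

# Time a few repetitions of each version
t_pd = timeit.timeit(lambda: cum_reset_pd(big), number=3)
t_np = timeit.timeit(lambda: cum_reset_np(big), number=3)
print(f"pandas: {t_pd:.4f} s, numpy: {t_np:.4f} s")
```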
