
Cumsum entire table and reset at zero

I have the following data frame.

import pandas as pd
d = pd.DataFrame({'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]})

d

   one  two
0    0    0
1    1    0
2    1    1
3    1    0
4    0    1
5    1    1

I want a cumulative sum for every column that resets to zero whenever the value is zero.

The desired output is:

pd.DataFrame({'one' : [0,1,2,3,0,1],'two' : [0,0,1,0,1,2]})

   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2

I have tried using groupby, but it does not work for the entire table.

onkar asked Oct 14 '25 03:10


2 Answers

# For each column, start a new group at every zero, then take the cumulative sum within each group.
df2 = d.apply(lambda x: x.groupby((~x.astype(bool)).cumsum()).cumsum())
print(df2)

Output:

   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2
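
To see why the grouping key works, it may help to unpack the one-liner for a single column. This is only an illustrative sketch; the intermediate names `is_zero`, `group_id` and `result` are made up for the example and do not appear in the answer.

```python
import pandas as pd

d = pd.DataFrame({'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]})

x = d['one']

# True wherever the column is zero, i.e. where a new run should start.
is_zero = ~x.astype(bool)

# Running count of zeros seen so far. It stays constant between zeros,
# so it can serve as a group label for each run.
group_id = is_zero.cumsum()

# Cumulative sum within each group, so the total restarts at every zero.
result = x.groupby(group_id).cumsum()

print(pd.DataFrame({'x': x, 'is_zero': is_zero,
                    'group_id': group_id, 'result': result}))
```

Each zero opens its own group and contributes 0 to that group's sum, which is why the output drops back to 0 on those rows.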
Scott Boston answered Oct 17 '25 07:10


pandas

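def cum_reset_pd(df):
    csum = df.cumsum()
    # Subtract the running total recorded at the most recent zero.
    # fillna(0) covers rows before the first zero in a column.
    return (csum - csum.where(df == 0).ffill().fillna(0)).astype(df.dtypes)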

cum_reset_pd(d)

   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2
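
The subtraction trick is easier to follow with the intermediate frames printed out. The sketch below splits the expression into steps; the names `at_zero`, `baseline` and `out` are hypothetical and only used here. It assumes every column starts with a zero, as in the question's data, so no NaN remains after the forward fill.

```python
import pandas as pd

d = pd.DataFrame({'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]})

# Plain running total, ignoring resets.
csum = d.cumsum()

# Keep the running total only on rows where the original value is zero;
# every other cell becomes NaN.
at_zero = csum.where(d == 0)

# Carry the total seen at the most recent zero forward to the following rows.
baseline = at_zero.ffill()

# Running total minus the total at the last reset gives the sum since that reset.
out = (csum - baseline).astype(int)

print(baseline)
print(out)
```

For column `two` the running total is 0, 0, 1, 1, 2, 3 and the baseline is 0, 0, 0, 1, 1, 1, so the difference is 0, 0, 1, 0, 1, 2.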

numpy

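import numpy as np

def cum_reset_np(df):
    v = df.values
    z = np.zeros_like(v)
    # Column and row positions of the nonzero entries, listed column by column.
    j, i = np.where(v.T)
    r = np.arange(1, i.size + 1)
    # A new run starts where the row index is not consecutive or the column changes.
    p = np.where(
        np.append(False, (np.diff(i) != 1) | (np.diff(j) != 0))
    )[0]
    b = np.append(0, np.append(p, r.size))
    # Position within each run; this equals the cumulative sum only for 0/1 data.
    z[i, j] = r - b[:-1].repeat(np.diff(b))
    return pd.DataFrame(z, df.index, df.columns)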

cum_reset_np(d)

   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2

Why go through this trouble?
Because it's quicker!
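
To check the speed claim on your own machine, something like the following rough sketch could be used. The test frame size (10,000 rows by 5 columns), the seed and the repetition count are arbitrary assumptions, and the name `cum_reset_groupby` is a hypothetical wrapper around the groupby one-liner from the other answer. The pandas version uses `df.dtypes` and an added `fillna(0)` guard so it also handles columns that do not start with a zero. Timings will vary with data shape and library versions, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd


def cum_reset_groupby(df):
    # Per column: label runs by counting zeros, then cumsum within each run.
    return df.apply(lambda x: x.groupby((~x.astype(bool)).cumsum()).cumsum())


def cum_reset_pd(df):
    csum = df.cumsum()
    # fillna(0) is an added guard for columns that do not start with a zero.
    return (csum - csum.where(df == 0).ffill().fillna(0)).astype(df.dtypes)


def cum_reset_np(df):
    v = df.values
    z = np.zeros_like(v)
    j, i = np.where(v.T)
    r = np.arange(1, i.size + 1)
    p = np.where(np.append(False, (np.diff(i) != 1) | (np.diff(j) != 0)))[0]
    b = np.append(0, np.append(p, r.size))
    z[i, j] = r - b[:-1].repeat(np.diff(b))
    return pd.DataFrame(z, df.index, df.columns)


# Hypothetical test data: 10,000 rows x 5 columns of random 0/1 values.
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 2, size=(10_000, 5)), columns=list('abcde'))

funcs = [cum_reset_groupby, cum_reset_pd, cum_reset_np]

# Sanity check before timing: all three approaches must agree on this data.
results = [f(big) for f in funcs]
assert all((res.values == results[0].values).all() for res in results)

for f in funcs:
    seconds = timeit.timeit(lambda: f(big), number=5)
    print(f"{f.__name__}: {seconds / 5:.4f} s per call")
```

Note that the three functions only agree on 0/1 data, because the numpy version counts positions within a run rather than summing values.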


piRSquared answered Oct 17 '25 06:10


