I have a data set with a column filled with 1s and 0s as such:
column 1:
1
1
1
1
0
0
0
1
1
0
1
1
I am currently using np.where() to create a new column that segments the data from column 1. The issue is that whenever it comes across a new segment of 1s, I want the value in column 2 to increase by 1.
n = 1
df['column2'] = np.where(df['column1'] == 1 , n + 1 , 0)
The results I am getting:
column 1: column 2:
1 1
1 1
1 1
1 1
0 0
0 0
0 0
1 1
1 1
0 0
1 1
1 1
What I am trying to achieve:
column 1: column 2:
1 1
1 1
1 1
1 1
0 0
0 0
0 0
1 2
1 2
0 0
1 3
1 3
Cheeky one-liner:
df["column2"] = df['column1'].diff().ne(0).cumsum().add(1).floordiv(2).where(df['column1'].astype(bool), other=0)
df:
column1 column2
0 1 1
1 1 1
2 1 1
3 1 1
4 0 0
5 0 0
6 0 0
7 1 2
8 1 2
9 0 0
10 1 3
11 1 3
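For readers who want to see why the chain works, here is the same logic split into named steps. This is only a sketch of the one-liner above: the sample frame is rebuilt from the question's data, and the intermediate names (changed, run_id, counter) are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"column1": [1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1]})

# True wherever the value differs from the previous row.
# The first row's diff is NaN, which also counts as "not equal to 0".
changed = df["column1"].diff().ne(0)

# Running count of changes: every run of identical values gets its own id.
run_id = changed.cumsum()

# Runs of 1s and runs of 0s alternate, so adding 1 and floor-dividing by 2
# maps each pair of neighbouring run ids onto one counter value.
counter = run_id.add(1).floordiv(2)

# Keep the counter on rows that hold 1 and write 0 everywhere else.
df["column2"] = counter.where(df["column1"].astype(bool), other=0)

print(df)
```

Because the 0 rows are masked at the end, the counter still starts at 1 for the first segment of 1s when the column begins with 0.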
You can also do it like this:
df['column 2:'] = (df.assign(grp = (df['column 1:'].diff() != 0).cumsum())
.where(df['column 1:'].astype(bool))
.groupby('grp').ngroup().add(1).fillna(0))
Output:
column 1: column 2:
0 1 1.0
1 1 1.0
2 1 1.0
3 1 1.0
4 0 0.0
5 0 0.0
6 0 0.0
7 1 2.0
8 1 2.0
9 0 0.0
10 1 3.0
11 1 3.0
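Note that the result above is a float column (1.0, 2.0, ...). If integers are preferred, another possible approach, not taken from either answer, is to flag the first row of each segment of 1s and take a cumulative sum of those flags. The helper name label_runs and the sample frame are hypothetical, used only for this sketch.

```python
import pandas as pd


def label_runs(col: pd.Series) -> pd.Series:
    """Number each consecutive run of 1s (1, 2, 3, ...); other rows get 0."""
    is_one = col.eq(1)
    # A run starts where the current row is 1 and the previous row is not.
    # fill_value=False treats the row before the first one as "not 1".
    starts = is_one & ~is_one.shift(fill_value=False)
    # Counting the starts seen so far gives the run number; mask the 0 rows.
    return starts.cumsum().where(is_one, 0)


# Sample data copied from the question.
df = pd.DataFrame({"column1": [1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1]})
df["column2"] = label_runs(df["column1"])
print(df)
```

The cumulative sum of a boolean Series is an integer Series, so column2 keeps an integer dtype without needing fillna().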