I have a data frame df with columns ID and N1, and I would like to calculate a column N2. For each ID, the first value of N2 should equal N1, and every later value should be N1 divided by the previous row's N1 (for example, 0.888/0.999 for the second row). The same logic restarts at the next ID. Can this be done in pandas WITHOUT using a for loop? Expected output:
ID N1 N2
1111 0.999 0.999
1111 0.888 0.888888889
1111 0.777 0.875
1111 0.666 0.857142857
1111 0.555 0.833333333
1111 0.444 0.8
1111 0.333 0.75
2222 0.998 0.998
2222 0.887 0.888777555
2222 0.776 0.874859076
2222 0.665 0.856958763
2222 0.554 0.833082707
2222 0.443 0.799638989
2222 0.332 0.749435666
2222 0.221 0.665662651
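Within each ID, N1 divided by the previous N1 is 1 plus the percentage change. The first row of each group has no previous value, so its NaN is filled with N1: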
df.assign(N2=df.groupby('ID').N1.pct_change().add(1).fillna(df.N1))
ID N1 N2
0 1111 0.999 0.999000
1 1111 0.888 0.888889
2 1111 0.777 0.875000
3 1111 0.666 0.857143
4 1111 0.555 0.833333
5 1111 0.444 0.800000
6 1111 0.333 0.750000
7 2222 0.998 0.998000
8 2222 0.887 0.888778
9 2222 0.776 0.874859
10 2222 0.665 0.856959
11 2222 0.554 0.833083
12 2222 0.443 0.799639
13 2222 0.332 0.749436
14 2222 0.221 0.665663
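To see why "1 plus the percentage change" gives the requested ratio, here is a small sketch that rebuilds the question's data and compares the two computations. The names prev, ratio, via_pct, same and n2 are only illustrative and do not appear in the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1111] * 7 + [2222] * 8,
    "N1": [0.999, 0.888, 0.777, 0.666, 0.555, 0.444, 0.333,
           0.998, 0.887, 0.776, 0.665, 0.554, 0.443, 0.332, 0.221],
})

# Previous N1 within each ID; NaN on the first row of every group
prev = df.groupby("ID")["N1"].shift()

# Direct ratio: current / previous
ratio = df["N1"] / prev

# pct_change is (current - previous) / previous, so adding 1 gives current / previous
via_pct = df.groupby("ID")["N1"].pct_change().add(1)

# Both are NaN on the first row of each group and agree elsewhere
same = np.allclose(ratio, via_pct, equal_nan=True)

# Fill the first row of each group with its own N1, as in the one-liner above
n2 = via_pct.fillna(df["N1"])
```

Since the sample data has no missing values, pct_change() is called here with its defaults.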
Yes. You can use groupby() with transform() and shift(): shift(1) gives the previous N1 within each ID, and fillna(1) makes the first row of each group divide by 1, so it keeps its N1 value.
df['N2'] = df.groupby("ID")['N1'].transform(lambda x: x/x.shift(1).fillna(1))
df
ID N1 N2
0 1111 0.999 0.999000
1 1111 0.888 0.888889
2 1111 0.777 0.875000
3 1111 0.666 0.857143
4 1111 0.555 0.833333
5 1111 0.444 0.800000
6 1111 0.333 0.750000
7 2222 0.998 0.998000
8 2222 0.887 0.888778
9 2222 0.776 0.874859
10 2222 0.665 0.856959
11 2222 0.554 0.833083
12 2222 0.443 0.799639
13 2222 0.332 0.749436
14 2222 0.221 0.665663
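As a possible variation, the same result can probably be had without a lambda, because groupby(...).shift() already returns a Series aligned to the original index. This is a sketch under that assumption, using the question's data; the expected list is only there to check the result against the posted output.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1111] * 7 + [2222] * 8,
    "N1": [0.999, 0.888, 0.777, 0.666, 0.555, 0.444, 0.333,
           0.998, 0.887, 0.776, 0.665, 0.554, 0.443, 0.332, 0.221],
})

# Previous N1 per ID; the first row of each group becomes 1 so N2 == N1 there
df["N2"] = df["N1"] / df.groupby("ID")["N1"].shift().fillna(1)

# N2 values as printed in the answers (rounded to 6 decimals)
expected = [0.999, 0.888889, 0.875, 0.857143, 0.833333, 0.8, 0.75,
            0.998, 0.888778, 0.874859, 0.856959, 0.833083, 0.799639,
            0.749436, 0.665663]
```

This avoids calling a Python function once per group, which may matter when there are many IDs.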