I am trying to number the rows within each group of my data frame.
For example, in this data frame
Col1
A
A
A
B
B
C
D
D
D
I would like the output to be the following:
Col1 idx
A 1
A 2
A 3
B 1
B 2
C 1
D 1
D 2
D 3
In R, I could simply do the following using data.table: df[, idx:=seq_len(.N), by=Col1]. I am having trouble finding the equivalent in Python. So far, I know I can use the linspace or arange functions in the numpy package, but I am not quite sure how to do it by groups.
Thank you in advance.
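Use groupby with cumcount. cumcount numbers the rows within each group starting from 0, so add 1 to get a 1-based index: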
In [289]: df['idx'] = df.groupby('Col1').cumcount().add(1)
In [290]: df
Out[290]:
Col1 idx
0 A 1
1 A 2
2 A 3
3 B 1
4 B 2
5 C 1
6 D 1
7 D 2
8 D 3
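For anyone who wants to try this outside an IPython session, here is a minimal self-contained sketch. It rebuilds the sample frame from the question; the second frame, with the hypothetical name `shuffled`, is only an illustration that the rows do not need to be sorted by group.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"Col1": ["A", "A", "A", "B", "B", "C", "D", "D", "D"]})

# cumcount() numbers the rows within each group starting at 0;
# adding 1 gives a 1-based index like seq_len(.N) in data.table
df["idx"] = df.groupby("Col1").cumcount() + 1
print(df)

# Illustration only: groups do not have to be contiguous.
# Each row gets its position among the rows of its own group,
# in order of appearance.
shuffled = pd.DataFrame({"Col1": ["B", "A", "B", "A"]})
shuffled["idx"] = shuffled.groupby("Col1").cumcount() + 1
print(shuffled)
```

If a 0-based index is acceptable, drop the `+ 1`.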