Assign an ID to consecutive groups in a column in R

I would like to add a column to a data.frame that numbers the consecutive runs of each value in a grouping column (the s column in dummy_df below): 1 for the first run of a value, 2 for its second run, and so on.

library(data.table)  # provides rleid()

dummy_df = data.frame(s = c("a", "a", "b", "b", "b", "c", "c", "a", "a", "c", "c", "a", "a"),
                      desired_output = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3))
dummy_df$rleid_output = rleid(dummy_df$s)
dummy_df
   s desired_output rleid_output
1  a              1            1
2  a              1            1
3  b              1            2
4  b              1            2
5  b              1            2
6  c              1            3
7  c              1            3
8  a              2            4
9  a              2            4
10 c              2            5
11 c              2            5
12 a              3            6
13 a              3            6

This is similar to what rleid() does, except that the count should restart for each distinct value instead of running across the whole column. However, I can't find a straightforward way to do it. Thanks.

Vicky Ruiz asked Jan 20 '26 00:01



2 Answers

You can do this in base R by combining rle() and ave():

dummy_df$out <- with(rle(dummy_df$s), rep(ave(lengths, values, FUN = seq_along), lengths))

Result:

   s desired_output out
1  a              1   1
2  a              1   1
3  b              1   1
4  b              1   1
5  b              1   1
6  c              1   1
7  c              1   1
8  a              2   2
9  a              2   2
10 c              2   2
11 c              2   2
12 a              3   3
13 a              3   3
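
For readers who find the one-liner dense, here is a rough step-by-step sketch of the same idea. The intermediate names runs and run_id are only illustrative and do not appear in the answer, and stringsAsFactors = FALSE is an added precaution because rle() does not accept factors (the default before R 4.0).

```r
# Same data as in the question; keep `s` as a character vector for rle().
dummy_df <- data.frame(
  s = c("a", "a", "b", "b", "b", "c", "c", "a", "a", "c", "c", "a", "a"),
  stringsAsFactors = FALSE
)

# Step 1: collapse the column into runs of identical consecutive values.
runs <- rle(dummy_df$s)
runs$values   # one entry per run: "a" "b" "c" "a" "c" "a"
runs$lengths  # rows in each run:  2 3 2 2 2 2

# Step 2: number the runs separately for each distinct value.
run_id <- ave(runs$lengths, runs$values, FUN = seq_along)
run_id        # 1 1 1 2 2 3

# Step 3: expand the per-run ids back to one id per row.
dummy_df$out <- rep(run_id, runs$lengths)
```

The key point is that ave() works on the runs rather than on the rows, so each distinct value gets its own counter, and rep() then stretches each run's id over the rows of that run.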
Ritchie Sacramento answered Jan 21 '26 14:01



If you are willing to use data.table (rleid is part of the package), you can do it in two steps as follows:

library(data.table)
dummy_df = data.frame(s = c("a", "a", "b", "b", "b", "c", "c", "a", "a", "c", "c", "a", "a"))
# convert the data.frame to a data.table by reference
setDT(dummy_df)
# create auxiliary variable
dummy_df[, rleid_output := rleid(s)]
# obtain desired output
dummy_df[, desired_output := rleid(rleid_output), by = "s"]
# end result
dummy_df
#>     s rleid_output desired_output
#>  1: a            1              1
#>  2: a            1              1
#>  3: b            2              1
#>  4: b            2              1
#>  5: b            2              1
#>  6: c            3              1
#>  7: c            3              1
#>  8: a            4              2
#>  9: a            4              2
#> 10: c            5              2
#> 11: c            5              2
#> 12: a            6              3
#> 13: a            6              3

Created on 2020-10-16 by the reprex package (v0.3.0)
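
If the auxiliary column is not wanted, a possible one-step variant is sketched below. It is not part of the answer above: it swaps the second rleid() call for a cumsum() over gaps in the row numbers, and the names dt, one_step and two_step are made up for the illustration.

```r
library(data.table)

dt <- data.table(s = c("a", "a", "b", "b", "b", "c", "c", "a", "a", "c", "c", "a", "a"))

# Within each value of s, .I holds the original row numbers of that group.
# A gap other than 1 between successive row numbers means a new run started,
# so the cumulative count of such gaps (starting at 1) is the run id.
dt[, one_step := cumsum(c(1L, diff(.I) != 1L)), by = s]

# Two-step version from the answer, kept here only for comparison.
dt[, rleid_output := rleid(s)]
dt[, two_step := rleid(rleid_output), by = s]

# Check that both approaches agree on this example.
all(dt$one_step == dt$two_step)
```

Both versions rely on the same observation: inside a single value of s, a new run begins exactly where the rows stop being adjacent in the original table.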

Jon Nagra answered Jan 21 '26 13:01



