
How to number groups of randomly repeated values into sets of n unique values?

Tags: r, group-by

How can I create a new variable grp that, within each type, assigns rows to groups of n unique row values each? For example, below, for n = 4, the first four distinct values of row (each possibly repeated) form grp 1 (red), the next four distinct values form grp 2 (blue), and so on. Ideally this would be a function that lets n be changed as desired, but that is not essential.

Within type, row is always in ascending order, though not necessarily consecutive, and the number of repetitions of each value varies randomly.

Edit: in addition, as a safeguard for the case where the number of unique row values for a given type is not an exact multiple of n (see the exchanges under the answers provided), grp would ideally be NA for that entire type.

Note: this is a small example; my actual data has thousands of rows and types, with many more repetitions.

Initial and desired data: (image not included)

Initial data:

dat0 <-
structure(list(type = c("a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b"), row = c(5, 
5, 6, 8, 8, 8, 10, 11, 11, 11, 13, 13, 14, 14, 18, 18, 18, 3, 
4, 4, 4, 6, 6, 7, 7, 7, 9, 9, 10, 10, 10, 12, 16, 16, 21, 22, 
22, 22, 23, 23, 28, 28, 28, 28)), row.names = c(NA, -44L), class = c("tbl_df", 
"tbl", "data.frame"))
Asked by denis on Oct 26 '25 at 13:10


1 Answer

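library(dplyr)

dat0 %>% 
  # keep one row per unique (type, row) pair
  distinct() %>% 
  # label every 4 consecutive unique pairs 1, 2, ...; the numbering runs
  # across types and assumes the total count is a multiple of 4
  mutate(id = gl(n()/4, 4)) %>% 
  # map the labels back onto the original, repeated rows
  right_join(dat0, by = c("type", "row"))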

#> # A tibble: 44 × 3
#>    type    row id   
#>    <chr> <dbl> <fct>
#>  1 a         5 1    
#>  2 a         5 1    
#>  3 a         6 1    
#>  4 a         8 1    
#>  5 a         8 1    
#>  6 a         8 1    
#>  7 a        10 1    
#>  8 a        11 2    
#>  9 a        11 2    
#> 10 a        11 2    
#> # ℹ 34 more rows

Created on 2025-02-11 with reprex v2.1.1
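
The question also asks for a variable n, numbering that restarts within each type, and NA for any type whose count of unique row values is not a multiple of n. The following is a minimal base R sketch of one way to do that; it is not part of the original answer. The helper name add_grp and the small demo data frame are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper: number the distinct `row` values of each `type`
# into groups of n; a type whose count of distinct values is not a
# multiple of n gets NA throughout.
add_grp <- function(dat, n = 4) {
  grp_one_type <- function(row) {
    u <- unique(row)
    # Position of each value among the distinct values: 1, 2, 3, ...
    idx <- match(row, u)
    if (length(u) %% n != 0) {
      return(rep(NA_integer_, length(row)))
    }
    (idx - 1L) %/% n + 1L
  }
  # ave() applies the function separately to each type and keeps row order
  dat$grp <- as.integer(ave(dat$row, dat$type, FUN = grp_one_type))
  dat
}

# Small made-up data set (assumption, not the asker's data)
demo <- data.frame(
  type = c("a", "a", "a", "a", "a", "a", "b", "b", "b"),
  row  = c(5, 5, 6, 8, 10, 10, 3, 4, 4)
)

res2 <- add_grp(demo, n = 2)  # both types have a multiple of 2 distinct values
res4 <- add_grp(demo, n = 4)  # type "b" has only 2 distinct values, so NA
print(res2)
print(res4)
```

Applied to the question's data as add_grp(dat0, n = 4), this should give grp 1 and 2 for type a (8 distinct values) and grp 1 to 3 for type b (12 distinct values).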

Answered by M-- on Oct 29 '25 at 05:10