What is an efficient way to create a sequence of numbers that increments each time a group variable changes? As a toy example, for the data frame below I would like a new variable, "Value", to take the values c(1,1,1,2,2,3,3,4). Note that even though 48 appears again later, "Value" still increases, because I am only concerned with changes between consecutive rows.
df <- read.table(textConnection(
'Group
48
48
48
56
56
48
48
14'), header = TRUE)
One way to do this is
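df$Value <- 1
for (i in 2:nrow(df)) {
  # Same group as the previous row: carry the counter forward; otherwise increment it
  if (df[i, ]$Group == df[i - 1, ]$Group) {
    df[i, ]$Value <- df[i - 1, ]$Value
  } else {
    df[i, ]$Value <- df[i - 1, ]$Value + 1
  }
}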
but this is very slow. My actual dataset has several million observations.
Note: I had a difficult time wording the title of this question so please change it if you'd like.
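We could also hack the result of rle(): replace each run's value with the run's index, then expand it back to full length with inverse.rle().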
r <- rle(df$Group)
r$values <- seq_along(r$lengths)
inverse.rle(r)
# [1] 1 1 1 2 2 3 3 4
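To make the trick above easier to follow, here is a minimal sketch that prints the intermediate pieces of the rle object and wraps the three steps in a small function. The name run_id() is a hypothetical helper chosen for illustration, not a base R function, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Toy data from the question
df <- data.frame(Group = c(48L, 48L, 48L, 56L, 56L, 48L, 48L, 14L))

# rle() compresses the vector into runs of consecutive equal values
r <- rle(df$Group)
r$lengths  # 3 2 2 1
r$values   # 48 56 48 14

# run_id() is a hypothetical helper name used only for this sketch
run_id <- function(x) {
  r <- rle(x)
  r$values <- seq_along(r$lengths)  # number the runs 1, 2, 3, ...
  inverse.rle(r)                    # repeat each run number by its run length
}

df$Value <- run_id(df$Group)
df$Value
# [1] 1 1 1 2 2 3 3 4
```

Because the second run of 48 is a separate run from the first, it gets its own number, which is the behaviour asked for. Note that rle() treats every missing value as a run of its own, so NAs in Group would each receive a new number.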
Data
df <- structure(list(Group = c(48L, 48L, 48L, 56L, 56L, 48L, 48L, 14L
)), class = "data.frame", row.names = c(NA, -8L))
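As a possible alternative to the rle approach (a different technique, not the one used above), the same counter can be sketched by flagging each row whose Group differs from the previous row and taking a cumulative sum of the flags. This sketch assumes the data frame has at least one row and that Group contains no missing values; the variable names n and changed are illustrative.

```r
# Toy data from the question
df <- data.frame(Group = c(48L, 48L, 48L, 56L, 56L, 48L, 48L, 14L))

n <- nrow(df)

# TRUE for the first row and for every row whose Group differs from the row before it
changed <- c(TRUE, df$Group[-1] != df$Group[-n])

# Each TRUE adds 1 to the running total, so the counter steps up at every change
df$Value <- cumsum(changed)
df$Value
# [1] 1 1 1 2 2 3 3 4
```

Both versions are vectorised, so they avoid the row-by-row loop from the question. If the data.table package is already in use, its rleid() function computes this kind of run id directly.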