
Create a sequence of numbers that increments for every change in another variable

Tags:

r

What is an efficient way to create a sequence of numbers that increments for each change in a group variable? As a toy example, using the data frame below, I would like a new variable, "Value", to take on the values c(1,1,1,2,2,3,3,4). Note that even though 48 repeats itself, "Value" still increases as I'm only concerned with a change in the sequence.

df <- read.table(textConnection(
  'Group 
  48 
  48
  48
  56
  56
  48
  48
  14'), header = TRUE)

One way to do this is

df$Value <- 1
for (i in 2:nrow(df)) {
  if (df[i, ]$Group == df[i - 1, ]$Group) {
    df[i, ]$Value <- df[i - 1, ]$Value
  } else {
    df[i, ]$Value <- df[i - 1, ]$Value + 1
  }
}

but this is very slow. My actual dataset has several million observations.

Note: I had a difficult time wording the title of this question so please change it if you'd like.

Remy M asked Oct 21 '25 17:10



1 Answer

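We could also hack the result of rle(): compute the run-length encoding of Group, replace each run's value with its position in the sequence of runs, and expand it back to full length with inverse.rle().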

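r <- rle(df$Group)                # runs of consecutive equal values
r$values <- seq_along(r$lengths)  # replace each run's value with its index
inverse.rle(r)                    # expand back to one value per row
# [1] 1 1 1 2 2 3 3 4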

Data

df <- structure(list(Group = c(48L, 48L, 48L, 56L, 56L, 48L, 48L, 14L
)), class = "data.frame", row.names = c(NA, -8L))
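
For comparison, here is a minimal sketch of another vectorized way to get the same ids in base R, by counting the positions where Group differs from the previous row. This is not part of the answer above; the helper name `changed` is made up for illustration, and the sketch assumes Group has at least one row and contains no NA values (an NA would propagate through cumsum()).

```r
# Same toy data as above.
df <- data.frame(Group = c(48L, 48L, 48L, 56L, 56L, 48L, 48L, 14L))

# TRUE wherever a row's Group differs from the row before it.
# The first row has no predecessor, so it is marked FALSE.
changed <- c(FALSE, df$Group[-1] != df$Group[-nrow(df)])

# Running count of changes, offset by one so the first run is numbered 1.
df$Value <- cumsum(changed) + 1L
df$Value
# [1] 1 1 1 2 2 3 3 4
```

Like the rle() approach, this avoids the row-by-row loop, so it should scale to several million rows. If the data.table package is available, its rleid() function computes the same kind of run id.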
jay.sf answered Oct 24 '25 08:10
