Count "gaps" between observations in R

Question

I am having trouble replicating in R a project that was originally done in Stata. One of the key snags is that I need to generate a variable that counts the number of years since the previous observation. Here's a simple recreation of what the data might look like:

data <- cbind(1960:1970, c(NA, NA, 22, NA, NA, NA, 24, NA, NA, NA, 22), c(NA, NA, NA, NA, NA, NA, 4, NA, NA, NA, 4))

      [,1] [,2] [,3]
 [1,] 1960   NA   NA
 [2,] 1961   NA   NA
 [3,] 1962   22   NA
 [4,] 1963   NA   NA
 [5,] 1964   NA   NA
 [6,] 1965   NA   NA
 [7,] 1966   24    4
 [8,] 1967   NA   NA
 [9,] 1968   NA   NA
[10,] 1969   NA   NA
[11,] 1970   22    4

I currently have the first two columns of data and I'm trying to automate the creation of column three with a function.

You can see that the third column is the number of years since the previous non-NA value in the second column. It is filled in only on rows where the second column has a value, and only from the second such row onward, because the first occurrence of the intervention has no earlier value to measure from.

If it's any help, here is the Stata code that does the trick, where since is the third column in my simplified example. Basically, the code creates a new variable since, defined as the number of years since the variable redist (the second column in my example) last had a value, for every year after the first one in which redist has a value.

gen since=.
foreach n of numlist 1(1)10 {
    replace since = year - year[_n-`n'] if redist!=. & redist[_n-`n']!=. & since==.
}

Thanks for the help in advance!

Rich Scriven · Accepted Answer

You can add a column of NA values, then fill in the year differences on the rows selected by a logical vector that marks where column two is not NA. This assumes we begin with only the first two columns.

data <- cbind(data, NA)
nona <- !is.na(data[,2])
data[,3][nona] <- c(NA, diff(data[,1][nona]))

data
#      [,1] [,2] [,3]
# [1,] 1960   NA   NA
# [2,] 1961   NA   NA
# [3,] 1962   22   NA
# [4,] 1963   NA   NA
# [5,] 1964   NA   NA
# [6,] 1965   NA   NA
# [7,] 1966   24    4
# [8,] 1967   NA   NA
# [9,] 1968   NA   NA
#[10,] 1969   NA   NA
#[11,] 1970   22    4
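
Since the question asks for a function, the same idea can be wrapped up as a rough sketch. The name years_since and its arguments are made up for illustration, and the sketch assumes the rows are already sorted by year. Unlike the Stata loop, which looks back at most 10 rows, diff() has no limit on how far back the previous value can be.

```r
# Hypothetical helper: the name and arguments are illustrative only.
# For each non-missing `value`, returns the number of years since the
# previous non-missing `value`; every other position is NA.
# Assumes `year` is sorted in ascending order.
years_since <- function(year, value) {
  since <- rep(NA_real_, length(year))
  observed <- !is.na(value)
  # The first observed row has no predecessor, so it stays NA.
  since[observed] <- c(NA, diff(year[observed]))
  since
}

# Same toy data as in the question, stored as a data frame.
df <- data.frame(
  year   = 1960:1970,
  redist = c(NA, NA, 22, NA, NA, NA, 24, NA, NA, NA, 22)
)
df$since <- years_since(df$year, df$redist)
```

With the example data, df$since should be 4 in the 1966 and 1970 rows and NA everywhere else, matching the third column above.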

Tags: r, count

Julian
