I have this function for calculating the difference between consecutive date/times in seconds. It works fine, but I would like to understand why I need the first line:
padded.diff <- function(x) c(0L, diff(x))
df2=within(df, {
  date <- strptime(Last.Modified.Date, format="%d.%m.%Y %H:%M:%S")
  date.diff <- padded.diff(as.numeric(date))
})
Why does it give me an error when I call diff directly, like this?
df2=within(df, {
  date <- strptime(Last.Modified.Date, format="%d.%m.%Y %H:%M:%S")
  date.diff <- diff(as.numeric(date))
})
The error is as follows:
Error in `[<-.data.frame`(`*tmp*`, nl, value = list(date.diff = c(3, 56, :
replacement element 1 has 25584 rows, need 25585
If you are taking differences d_i = x_i - x_(i-1) of an n-length input vector, the result will be a vector with length n-1; or more generally, diff(x, lag = k) results in a vector with length equal to length(x)-k. The error message you got,
replacement element 1 has 25584 rows, need 25585
means you were trying to fill a column of a 25585-row data frame with a vector of only 25584 elements. padded.diff just prepends a single integer value (0L, which is pretty conventional) to make up this difference in length. You might consider a more general version of padded.diff, though, in case you want lag > 1:
pad.diff <- function(x, n = 1) c(rep(0L,n), diff(x, lag = n))
##
x <- (1:5)**2
##
R> diff(x)
#[1] 3 5 7 9
##
R> pad.diff(x)
#[1] 0 3 5 7 9
##
R> pad.diff(x, 2)
#[1] 0 0 8 12 16
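To tie this back to the data frame case, here is a minimal sketch of the length mismatch and the fix. The three-row df and its timestamps are made up for illustration (only the column name Last.Modified.Date and the date format come from the question), and tz = "UTC" is an assumption added so the result does not depend on the local time zone.

```r
# Hypothetical stand-in for the asker's df: three timestamps, 3 s and 56 s apart.
df <- data.frame(
  Last.Modified.Date = c("01.01.2015 10:00:00",
                         "01.01.2015 10:00:03",
                         "01.01.2015 10:00:59"),
  stringsAsFactors = FALSE
)

# Generalised padding helper from the answer above.
pad.diff <- function(x, n = 1) c(rep(0L, n), diff(x, lag = n))

# Parse the timestamps and convert to seconds since the epoch.
secs <- as.numeric(as.POSIXct(df$Last.Modified.Date,
                              format = "%d.%m.%Y %H:%M:%S",
                              tz = "UTC"))

length(diff(secs))      # 2: one shorter than nrow(df), so it cannot be a column
length(pad.diff(secs))  # 3: matches nrow(df)

# The padded vector fits the data frame; the first row gets 0.
df$date.diff <- pad.diff(secs)
df$date.diff            # 0 3 56
```

Padding with 0 treats the first row as if no time had elapsed. If that first value should instead be marked as unknown, replacing 0L with NA in pad.diff is a reasonable alternative.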