Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Date difference between end date to start date

I have a data alooks like below.

id   from data   to date
1   2015-03-09   2015-03-14
2   2015-02-22   2015-02-24
2   2015-05-06   2015-05-17
3   2015-02-12   2015-02-16
4   2015-03-10   2015-03-16
4   2015-03-22  2015-04-07
4   2015-06-07  2015-07-07
4   2015-07-06   2015-07-07
4   2015-08-02   2015-08-07

I want to create a seperate variable which is the difference between the to date and the next from date grouped by id. So the first time of the id will be NA.I tried the below method based on the other answer in stackoverflow and I could not achieve that.

library(data.table)
chf1 = data.table(id = chf$id,from date = chf$f.date,to_date = chf$t.date)
setkey(chf1,id)
chf1[,diff:=c(NA,difftime(from_date, to_date, units = "days")),by=id]

The output look like

id   from_date   to_date      difference
1   2015-03-09   2015-03-14     NA
2   2015-02-22   2015-02-24     NA
2   2015-05-06   2015-05-17     71
3   2015-02-12   2015-02-16     NA
4   2015-03-10   2015-03-16     NA
4   2015-03-22  2015-04-07      6
4   2015-06-07  2015-06-10      64
4   2015-07-06   2015-07-07     26
4   2015-08-02   2015-08-07     26
like image 873
ssan Avatar asked Sep 13 '25 04:09

ssan


1 Answers

There are three issues in the code

1) chf1$from_date, chf1$to_date gets the whole column, so there is no effect of grouping by 'id'

2) difftime gives output with the same length as the initial column length.

3) As difftime takes the difference between each element of 'from_date' with corresponding element of 'to_date', there is no need for by = id

Therefore, the code can be

chf1[, diff1:=difftime(from_date, to_date, units = "days")]
chf1
#   id  from_date    to_date    diff1
#1:  1 2015-03-09 2015-03-14  -5 days
##2:  2 2015-02-22 2015-02-24  -2 days
#3:  2 2015-05-06 2015-05-17 -11 days
#4:  3 2015-02-12 2015-02-16  -4 days
#5:  4 2015-03-10 2015-03-16  -6 days
#6:  4 2015-03-22 2015-04-07 -16 days
#7:  4 2015-06-07 2015-07-07 -30 days
#8:  4 2015-07-06 2015-07-07  -1 days
#9:  4 2015-08-02 2015-08-07  -5 days

Based on the description in OP's code, if we need to get the difference between the next value of 'from_date', after grouping by 'id', use the difftime on the shifted 'from_date' with that of 'to_date' and assign (:= it to 'diff1'.

chf1[,  diff1 := difftime(shift(from_date, type = "lead"), to_date, 
                        units = "days") , by = id]
chf1
#  id  from_date    to_date   diff1
#1:  1 2015-03-09 2015-03-14 NA days
#2:  2 2015-02-22 2015-02-24 71 days
#3:  2 2015-05-06 2015-05-17 NA days
#4:  3 2015-02-12 2015-02-16 NA days
#5:  4 2015-03-10 2015-03-16  6 days
#6:  4 2015-03-22 2015-04-07 61 days
#7:  4 2015-06-07 2015-07-07 -1 days
#8:  4 2015-07-06 2015-07-07 26 days
#9:  4 2015-08-02 2015-08-07 NA days

Or it could be

chf1[, diff1 := difftime(from_date, shift(to_date), units = "days"), by = id]

data

chf <- structure(list(id = c(1L, 2L, 2L, 3L, 4L, 4L, 4L, 4L, 4L), 
f.date = structure(c(16503, 
16488, 16561, 16478, 16504, 16516, 16593, 16622, 16649), class = "Date"), 
t.date = structure(c(16508, 16490, 16572, 16482, 16510, 16532, 
16623, 16623, 16654), class = "Date")), .Names = c("id", 
 "f.date", "t.date"), row.names = c(NA, -9L), class = "data.frame")

 chf1 = data.table(id = chf$id,from_date = chf$f.date,to_date = chf$t.date)
like image 126
akrun Avatar answered Sep 15 '25 19:09

akrun