Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get the Date Difference in Data.table

I'd like to know how to get the date difference of two column in data.table using the lapply in data.table?

library(data.table)
  dt <- fread(" ID          Date        ME_Mes     DOB
 A     2017-02-20  0.0000 2016-08-19
             B      2017-02-06  2.3030 2016-03-11
             C     2017-03-20  0.4135 2016-08-19
             D      2017-03-06  0.0480 2016-10-09
             E     2017-04-20  2.4445 2016-05-04")
> dt
   ID       Date ME_Mes        DOB
1:  A 2017-02-20 0.0000 2016-08-19
2:  B 2017-02-06 2.3030 2016-03-11
3:  C 2017-03-20 0.4135 2016-08-19
4:  D 2017-03-06 0.0480 2016-10-09
5:  E 2017-04-20 2.4445 2016-05-04

###I'd like to calculate the difference in weeks for every ID by comparing the DOB-Date. 

I tired the following:

dt[,lapply(.SD, diff.Date), .SDcols = c(4,2), ID] # but did not work!
like image 794
Daniel Avatar asked Oct 26 '25 07:10

Daniel


2 Answers

You can use difftime to get the difference in weeks. Although, you need to convert your columns to POSIXct.

In case you want to keep class of your columns as they are, this works:

dt[, "DOB_Date" := difftime(strptime(dt$Date, format = "%Y-%m-%d"),
                        strptime(dt$DOB,  format = "%Y-%m-%d"), units = "weeks")]

dt
##    ID       Date ME_Mes        DOB       DOB_Date
## 1:  A 2017-02-20 0.0000 2016-08-19 26.43452 weeks
## 2:  B 2017-02-06 2.3030 2016-03-11 47.42857 weeks
## 3:  C 2017-03-20 0.4135 2016-08-19 30.42857 weeks
## 4:  D 2017-03-06 0.0480 2016-10-09 21.14881 weeks
## 5:  E 2017-04-20 2.4445 2016-05-04 50.14286 weeks

However, as @Frank suggested it's better to convert ("overwrite") your date-columns to POSIXct class first.

like image 109
M-- Avatar answered Oct 27 '25 21:10

M--


My hunch (and I will let others correct me) is that the following is faster on large datasets:

dt[,Date:=as.Date(Date)]
dt[,DOB:=as.Date(DOB)]
dt[,datediff:=as.integer(Date)-as.integer(DOB)]

datediff will contain date differences in days.

If you have a truly large data.table, you may consider fastPOSIXct from fasttime for string conversion.

like image 24
BBB Avatar answered Oct 27 '25 22:10

BBB