The same question was asked here before without a solution: How to avoid fread() importing date info as IDate?
The old question was not very specific and mixed in another issue. It also lacked the reprex that I present here, so I am improving the question. Please do not mark it as a duplicate :-)
The question is: whenever I read a CSV file containing a date column using data.table::fread, it changes the class of the column from Date to IDate. How do I avoid this and keep the column as Date?
library(data.table)
library(magrittr)
dt <- data.table(datecol = seq(Sys.Date(),by = "1 day",length.out = 3))
# let's confirm the format of the column is Date
str(dt)
#> Classes 'data.table' and 'data.frame': 3 obs. of 1 variable:
#> $ datecol: Date, format: "2022-05-10" "2022-05-11" ...
#> - attr(*, ".internal.selfref")=<externalptr>
# Now we write it into a file and read back using fwrite and fread
fwrite(dt,"tmpoutput.csv")
fread("tmpoutput.csv") %>% str
#> Classes 'data.table' and 'data.frame': 3 obs. of 1 variable:
#> $ datecol: IDate, format: "2022-05-10" "2022-05-11" ...
#> - attr(*, ".internal.selfref")=<externalptr>
# as you see the date format changes to IDate
Created on 2022-05-10 by the reprex package (v2.0.1)
This is not a huge problem, but it needs one extra line of code every time after a file is read, i.e. dt[, datecol := as.Date(datecol)], so that rbind with a similar DT does not fail.
Is there a simpler way to avoid this? Forgetting the type conversion later is a potential source of bugs.
Building on @Wimpel's answer, you can simply specify the column classes using the colClasses argument:
fread("tmpoutput.csv",colClasses=c(datecol='Date')) %>% str
# Classes ‘data.table’ and 'data.frame': 3 obs. of 1 variable:
# $ datecol: Date, format: "2022-05-10" "2022-05-11" "2022-05-12"
# - attr(*, ".internal.selfref")=<externalptr>
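If the date column names are not known in advance, one possible workaround is to convert after reading instead. The following is only a minimal sketch: fread_dates is a hypothetical helper name (not part of data.table), and it assumes that every column fread parsed as IDate should become a plain Date.

```r
library(data.table)

# Hypothetical helper (not part of data.table): read a file with fread(),
# then convert every IDate column to Date by reference.
fread_dates <- function(...) {
  dt <- fread(...)
  # Names of the columns that fread() parsed as IDate
  idate_cols <- names(dt)[vapply(dt, inherits, logical(1), what = "IDate")]
  if (length(idate_cols) > 0) {
    dt[, (idate_cols) := lapply(.SD, as.Date), .SDcols = idate_cols]
  }
  dt[]
}

# Round trip a small table through a temporary file
tmp <- tempfile(fileext = ".csv")
fwrite(data.table(datecol = as.Date("2022-05-10") + 0:2, x = 1:3), tmp)
res <- fread_dates(tmp)
str(res)
```

Compared with naming each column in colClasses, this trades an extra pass over the date columns for not having to list them by hand.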
You can create a new class (here: importDate) and refer to it in the colClasses argument of fread. This forces the given columns to be read in as Date (and not the default IDate).
setClass("importDate")
# conversion
setAs("character", "importDate", function(from) as.Date(from))
# Now read; use a named vector in colClasses so that only the columns you explicitly name are converted to Date
fread("tmpoutput.csv", colClasses = c(datecol = "importDate")) %>% str
# Classes ‘data.table’ and 'data.frame': 3 obs. of 1 variable:
# $ datecol: Date, format: "2022-05-10" "2022-05-11" "2022-05-12"
# - attr(*, ".internal.selfref")=<externalptr>