
R remove rows where an entry is not a date

Tags: date, r

I have a data frame where one column is meant to be a date in the form 00:00:00.0 yyyy-mm-dd. Most of the entries are valid dates, but some are not. Is there a way to delete the rows that contain non-dates? Something like this (if the column is "DATE"):

data <- data[is.Date(DATE)==TRUE,]

For example, given:

Fruit  Date
apple  00:00:00.0 2005-02-01
pear   00:00:00.0 2006-02-01
orange 00:00:00.0 -8-2-402145
rhino  00:00:00.0 2003-04-21

I want:

Fruit  Date
apple  00:00:00.0 2005-02-01
pear   00:00:00.0 2006-02-01
rhino  00:00:00.0 2003-04-21
Hugh asked Sep 06 '25 03:09


1 Answer

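Following joran's reasoning: parse the column with strptime, which returns NA for any entry that does not match the expected format, and keep only the rows that are not NA.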

# get the test data
test <- data.frame(
    Fruit=c("apple","pear","orange","rhino"),
    Date=c("00:00:00.0 2005-02-01",
           "00:00:00.0 2006-02-01",
           "00:00:00.0 -8-2-402145",
           "00:00:00.0 2003-04-21")
)

# remove the rows by checking if not (!) an NA due to not meeting the date format
test[!is.na(strptime(test$Date,format="00:00:00.0 %Y-%m-%d")),]

Result:

  Fruit                  Date
1 apple 00:00:00.0 2005-02-01
2  pear 00:00:00.0 2006-02-01
4 rhino 00:00:00.0 2003-04-21
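
The expression above only prints the filtered rows. If you want to keep them, something along these lines might work. This is a minimal sketch, not the only way to do it: the helper name is_valid_date, the variable clean, and the column name Parsed are made up for illustration, and it swaps strptime for as.Date, which likewise gives NA when a string does not fit the format.

```r
# Same test data as above (stringsAsFactors = FALSE keeps Date as character)
test <- data.frame(
    Fruit = c("apple", "pear", "orange", "rhino"),
    Date  = c("00:00:00.0 2005-02-01",
              "00:00:00.0 2006-02-01",
              "00:00:00.0 -8-2-402145",
              "00:00:00.0 2003-04-21"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where the string parses under the given format
is_valid_date <- function(x, format = "00:00:00.0 %Y-%m-%d") {
    !is.na(as.Date(as.character(x), format = format))
}

# Keep only the rows whose Date parses, and store the result
clean <- test[is_valid_date(test$Date), ]

# Optionally add a real Date column for later date arithmetic
clean$Parsed <- as.Date(clean$Date, format = "00:00:00.0 %Y-%m-%d")
```

Assigning back (for example test <- test[is_valid_date(test$Date), ]) would overwrite the original instead. Note that the original row names (1, 2, 4) are kept after subsetting.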
thelatemail answered Sep 07 '25 21:09