I have run into some unexpected behaviour in some fairly old code running in the latest release of R. The underlying issue seems to be that as.POSIXct does not correctly parse date/time strings in vectors under some circumstances: in this case, the presence of a POSIXct value at midnight in the input vector seems to corrupt the output of as.POSIXct for the whole vector.
To avoid possible issues with formatting, the example below compares the numeric values of the POSIXct variables rather than their string renderings. Am I missing something here? Is there some reason for this behaviour, such as as.POSIXct not supporting vectors?
The comments below are the output from running the script, edited into the script. d1 is the value at midnight; d2 and d3 are different times during that same day.
d1 = as.POSIXct("2025-07-15 00:00:00")
d2 = as.POSIXct("2025-07-15 15:00:00")
d3 = as.POSIXct("2025-07-15 19:00:00")
str(d1)
# POSIXct[1:1], format: "2025-07-15"
str(d2)
# POSIXct[1:1], format: "2025-07-15 15:00:00"
str(d3)
# POSIXct[1:1], format: "2025-07-15 19:00:00"
numericd1 = as.numeric(d1)
numericd2 = as.numeric(d2)
numericd3 = as.numeric(d3)
print(paste(numericd1, numericd2, numericd3))
# [1] "1752501600 1752555600 1752570000"
print("### Mixed Values including midnight result in unexpected values")
# [1] "### Mixed Values including midnight result in unexpected values"
mixedValues = as.numeric(as.POSIXct(c(as.character(d2), as.character(d1), as.character(d3))))
print (mixedValues)
# [1] 1752501600 1752501600 1752501600
if(mixedValues[1]!= numericd2)
print(paste("mismatch", mixedValues[1], numericd2))
# [1] "mismatch 1752501600 1752555600"
if(mixedValues[2]!= numericd1)
print(paste("mismatch", mixedValues[2], numericd1))
if(mixedValues[3]!= numericd3)
print(paste("mismatch", mixedValues[3], numericd3))
# [1] "mismatch 1752501600 1752570000"
print("### Changing the order of the mixed Values does not change the behaviour")
# [1] "### Changing the order of the mixed Values does not change the behaviour"
mixedValues = as.numeric(as.POSIXct(c(as.character(d2), as.character(d3), as.character(d1))))
print (mixedValues)
# [1] 1752501600 1752501600 1752501600
if(mixedValues[1]!= numericd2)
print(paste("mismatch", mixedValues[1], numericd2))
# [1] "mismatch 1752501600 1752555600"
if(mixedValues[2]!= numericd3)
print(paste("mismatch", mixedValues[2], numericd3))
# [1] "mismatch 1752501600 1752570000"
if(mixedValues[3]!= numericd1)
print(paste("mismatch", mixedValues[3], numericd1))
print("### A list of all the same values however converts d2 correctly")
# [1] "### A list of all the same values however converts d2 correctly"
mixedValues = as.numeric(as.POSIXct(c(as.character(d2), as.character(d2), as.character(d2))))
print (mixedValues)
# [1] 1752555600 1752555600 1752555600
if(mixedValues[1]!= numericd2)
print(paste("mismatch", mixedValues[1], numericd2))
if(mixedValues[2]!= numericd2)
print(paste("mismatch", mixedValues[2], numericd2))
if(mixedValues[3]!= numericd2)
print(paste("mismatch", mixedValues[3], numericd2))
print("### A list with a single value produces a correct result")
# [1] "### A list with a single value produces a correct result"
mixedValues = as.numeric(as.POSIXct(c(as.character(d2))))
if(mixedValues != numericd2)
print(paste("mismatch", mixedValues, numericd2))
print("### Accessing the variable directly produces a correct result")
# [1] "### Accessing the variable directly produces a correct result"
mixedValues = as.numeric(as.POSIXct(as.character(d2)))
if(mixedValues != numericd2)
print(paste("mismatch", mixedValues, numericd2))
print("converting POSIXct to POSIXct directly does not change the values")
# [1] "converting POSIXct to POSIXct directly does not change the values"
mixedValues = as.numeric(as.POSIXct(c(d2, d1, d3)))
if(mixedValues[1]!= numericd2)
print(paste("mismatch", mixedValues[1], numericd2))
if(mixedValues[2]!= numericd1)
print(paste("mismatch", mixedValues[2], numericd1))
if(mixedValues[3]!= numericd3)
print(paste("mismatch", mixedValues[3], numericd3))
print("### A list without the midnight value however converts d2 correctly")
# [1] "### A list without the midnight value however converts d2 correctly"
mixedValues = as.numeric(as.POSIXct(c(as.character(d3), as.character(d2), as.character(d3))))
print (mixedValues)
# [1] 1752570000 1752555600 1752570000
if(mixedValues[1]!= numericd3)
print(paste("mismatch", mixedValues[1], numericd3))
if(mixedValues[2]!= numericd2)
print(paste("mismatch", mixedValues[2], numericd2))
if(mixedValues[3]!= numericd3)
print(paste("mismatch", mixedValues[3], numericd3))
BLUF: your attempt to "compare numeric values" of parsed values falls prey to the phenomenon below: by the time you get to the as.numeric, the damage is done. You cannot rely on round-trip POSIXt -> character -> "anything" equality.
The issue is that beneath it all, as.POSIXlt.character (yes, lt, even though you're using ct) works through its tryFormats, trying each candidate format in turn. The relevant portion of the function:
function (x, tz = "", format, tryFormats = c("%Y-%m-%d %H:%M:%OS",
    "%Y/%m/%d %H:%M:%OS", "%Y-%m-%d %H:%M", "%Y/%m/%d %H:%M",
    "%Y-%m-%d", "%Y/%m/%d"), optional = FALSE, ...)
{
    x <- unclass(x)
    if (!missing(format)) {
        # ... not relevant here
    }
    xx <- x[!is.na(x)]
    if (!length(xx)) {
        # ... not relevant here
    } else for (f in tryFormats) if (all(!is.na(strptime(xx, f, tz = tz)))) {
        res <- strptime(x, f, tz = tz)
        if (nzchar(tz)) attr(res, "tzone") <- tz
        return(res)
    }
    # ...
}
(Side observation: really? Calling strptime twice is okay here? I may submit a PR to remove the double computation there.)
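For illustration only, here is a rough sketch of the same selection rule with a single strptime call per candidate format. The helper name first_format_parse is made up for this example and is not part of base R; it also skips the tzone handling and the other branches of the real function.

```r
# Hypothetical helper, not part of base R: same selection rule as the loop
# above, but each candidate format is parsed only once.
first_format_parse <- function(x, tryFormats, tz = "") {
  keep <- !is.na(x)                    # NA inputs must not veto a format
  for (f in tryFormats) {
    res <- strptime(x, f, tz = tz)     # parse the full vector once
    if (all(!is.na(res[keep]))) {      # did every non-NA input parse?
      return(res)
    }
  }
  stop("character string is not in a standard unambiguous format")
}

fmts <- c("%Y-%m-%d %H:%M:%OS", "%Y/%m/%d %H:%M:%OS", "%Y-%m-%d %H:%M",
          "%Y/%m/%d %H:%M", "%Y-%m-%d", "%Y/%m/%d")

# tz = "UTC" is an assumption made here so the result does not depend on locale
res <- first_format_parse(c("2025-07-15", NA, "2025-07-15 15:00:00"),
                          fmts, tz = "UTC")
format(res, "%H:%M:%S")
```

The date-only string still forces the date-only format for the whole vector, so the sketch changes the cost of the loop, not its outcome.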
The key takeaway is the use of all(!is.na(strptime(xx, f, tz=tz))).
If you look at your as.character(d#) values, one of them is just the date component (R's default rendering of midnight drops the time), so the ideal "%Y-%m-%d %H:%M:%S" format will not work for it.
as.character(d1)
# [1] "2025-07-15"
strptime(as.character(d1), format="%Y-%m-%d %H:%M:%OS")
# [1] NA
Because the function uses all(!is.na(.)), and the first string does not parse, the preferred date+time format is discarded. This holds true for all of the other formats that include %H, since as.character(d1) has no time part. We end up with "%Y-%m-%d", and strings that do include a time still pass this step (the time is simply ignored):
as.character(d2)
# [1] "2025-07-15 15:00:00"
strptime(as.character(d2), format="%Y-%m-%d")
# [1] "2025-07-15 EDT"
For the curious, we can see how all three as.character(d#) values do against all of the internal tryFormats (extracted from as.POSIXlt.character above):
outer(
setNames(nm=c(as.character(d1), as.character(d2), as.character(d3))),
setNames(nm=c("%Y-%m-%d %H:%M:%OS", "%Y/%m/%d %H:%M:%OS", "%Y-%m-%d %H:%M", "%Y/%m/%d %H:%M", "%Y-%m-%d", "%Y/%m/%d")),
function(tm, fmt) Map(strptime, tm, fmt)
)
#                      %Y-%m-%d %H:%M:%OS   %Y/%m/%d %H:%M:%OS  %Y-%m-%d %H:%M       %Y/%m/%d %H:%M  %Y-%m-%d    %Y/%m/%d
# 2025-07-15           NA                   NA                  NA                   NA              2025-07-15  NA
# 2025-07-15 15:00:00  2025-07-15 15:00:00  NA                  2025-07-15 15:00:00  NA              2025-07-15  NA
# 2025-07-15 19:00:00  2025-07-15 19:00:00  NA                  2025-07-15 19:00:00  NA              2025-07-15  NA
Because of the all(!is.na(.)) component, we need to find a column with no NA values, and the first such column is the date-only %Y-%m-%d (5th) one.
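The same column search can be sketched programmatically. This is only an approximation of what the function does internally: the strings are typed in literally, try_formats is a copy of the defaults shown above, and tz = "UTC" is an assumption made so the result does not depend on the local timezone.

```r
# The three renderings from the question, typed in literally
xx <- c("2025-07-15", "2025-07-15 15:00:00", "2025-07-15 19:00:00")

# Copy of the default tryFormats shown above
try_formats <- c("%Y-%m-%d %H:%M:%OS", "%Y/%m/%d %H:%M:%OS", "%Y-%m-%d %H:%M",
                 "%Y/%m/%d %H:%M", "%Y-%m-%d", "%Y/%m/%d")

# For each format: does every string parse?
parses_all <- vapply(
  try_formats,
  function(f) all(!is.na(strptime(xx, f, tz = "UTC"))),
  logical(1)
)

# The first format that parses everything is the one that gets used
chosen <- names(which(parses_all))[1]
chosen
# [1] "%Y-%m-%d"

# Without the midnight string, the full date+time format wins
no_midnight <- vapply(
  try_formats,
  function(f) all(!is.na(strptime(xx[-1], f, tz = "UTC"))),
  logical(1)
)
names(which(no_midnight))[1]
# [1] "%Y-%m-%d %H:%M:%OS"
```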
You can see more obvious results of this all(!is.na(.)) behavior if you pass a clearly wrong string:
as.POSIXct(c("quux", as.character(d1), as.character(d2), as.character(d3)))
# Error in as.POSIXlt.character(x, tz, ...) :
# character string is not in a standard unambiguous format
### applying our "outer" view from above, all columns have at least one NA
outer(
setNames(nm=c("quux", as.character(d1), as.character(d2), as.character(d3))),
setNames(nm=c("%Y-%m-%d %H:%M:%OS", "%Y/%m/%d %H:%M:%OS", "%Y-%m-%d %H:%M", "%Y/%m/%d %H:%M", "%Y-%m-%d", "%Y/%m/%d")),
function(tm, fmt) Map(strptime, tm, fmt)
)
#                      %Y-%m-%d %H:%M:%OS   %Y/%m/%d %H:%M:%OS  %Y-%m-%d %H:%M       %Y/%m/%d %H:%M  %Y-%m-%d    %Y/%m/%d
# quux                 NA                   NA                  NA                   NA              NA          NA
# 2025-07-15           NA                   NA                  NA                   NA              2025-07-15  NA
# 2025-07-15 15:00:00  2025-07-15 15:00:00  NA                  2025-07-15 15:00:00  NA              2025-07-15  NA
# 2025-07-15 19:00:00  2025-07-15 19:00:00  NA                  2025-07-15 19:00:00  NA              2025-07-15  NA
Because none of the columns are free of NA values, we get an error.
Suggestion: never rely on R's rendering of POSIXt objects being reversible, similar in notion (though not in cause) to why you cannot assume that the rendering of pi is accurate and reversible through as.character.
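A small sketch of both halves of that analogy. The fractional-second timestamp is an example value chosen for illustration (it is not from the question), and tz = "UTC" is assumed so the numbers are reproducible.

```r
# Numbers: as.character() keeps 15 significant digits, so pi does not survive
identical(as.numeric(as.character(pi)), pi)
# [1] FALSE

# Date-times: %S drops fractional seconds, so the text form loses half a second
x <- .POSIXct(1752537600.5, tz = "UTC")   # 2025-07-15 00:00:00.5 UTC
fmt <- "%Y-%m-%d %H:%M:%S"
back <- as.POSIXct(format(x, fmt), format = fmt, tz = "UTC")
as.numeric(x) - as.numeric(back)
# [1] 0.5
```

If exact equality matters, comparing or storing the underlying numeric values directly avoids the text round trip altogether.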
I think this has to do with how a POSIXct at midnight, like d1, renders as a date without the time component. When as.POSIXct then receives a character vector with (apparently) mixed formats, it parses all of them with the one format they have in common, the date-only one, thus losing all the time components.
One workaround could be to use as.character2 <- \(x) format(x, "%Y-%m-%d %H:%M:%S") and substitute it wherever you currently have as.character.
> as.numeric(as.POSIXct(c(as.character(d1), as.character(d2), as.character(d3))))
[1] 1752562800 1752562800 1752562800
> as.numeric(as.POSIXct(c(as.character2(d1), as.character2(d2), as.character2(d3))))
[1] 1752562800 1752616800 1752631200
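As a complement to the workaround, the parsing side can be pinned down too by passing an explicit format (and timezone) to as.POSIXct, which bypasses the tryFormats search. A minimal sketch, assuming UTC and the same three times as in the question:

```r
fmt <- "%Y-%m-%d %H:%M:%S"
d <- as.POSIXct(c("2025-07-15 00:00:00", "2025-07-15 15:00:00",
                  "2025-07-15 19:00:00"), tz = "UTC")

# Render each element separately, as the question does, but with a fixed format
txt <- vapply(seq_along(d), function(i) format(d[i], fmt), character(1))

# Parse with the same explicit format: no format guessing takes place
back <- as.POSIXct(txt, format = fmt, tz = "UTC")
all(as.numeric(back) == as.numeric(d))
# [1] TRUE

# A date-only string now becomes NA rather than changing how the others parse
strict <- as.POSIXct(c("2025-07-15", "2025-07-15 15:00:00"),
                     format = fmt, tz = "UTC")
is.na(strict)
# [1]  TRUE FALSE
```

With an explicit format, a malformed element shows up as an NA that can be checked for, rather than silently truncating every time in the vector.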