I know this is a long-standing, deeply embedded issue, but it's something I come up against so regularly, and that I see beginners to R struggle with so regularly, that I'd love to have a satisfactory solution. My Google and SO searches have come up empty so far, but please point me in the right direction if this is duplicated elsewhere.
TL;DR: Is there a way to use something like the POSIXct class without a timezone? I generally use tz="UTC" regardless of the actual timezone of the dataset, but it's a messy hack IMO, and I don't particularly like it. What I want is something like tz=NULL, which would behave the same way as UTC, but without actually adding "UTC" as a tzone attribute.
I'll start with an example (there are plenty) of typical timezone issues. Creating an object with POSIXct values:
df <- data.frame( timestamp = as.POSIXct( c( "2018-01-01 03:00:00",
                                             "2018-01-01 12:00:00" ) ),
                  a = 1:2 )
df
#             timestamp a
# 1 2018-01-01 03:00:00 1
# 2 2018-01-01 12:00:00 2
That's all fine, but then I try to convert the timestamps to dates:
df$date <- as.Date( df$timestamp )
df
#             timestamp a       date
# 1 2018-01-01 03:00:00 1 2017-12-31
# 2 2018-01-01 12:00:00 2 2018-01-01
The dates have converted incorrectly, because my computer locale is in Australian Eastern Time, meaning that the numeric values of the timestamps have been shifted by the offset relevant to my locale (in this case -11hrs). We can see this by forcing the timezone to UTC, then comparing the values before and after:
df$timestamp[1]
# [1] "2018-01-01 03:00:00 AEDT"
x <- lubridate::force_tz( df$timestamp[1], "UTC" ); x
# [1] "2018-01-01 03:00:00 UTC"
difftime( df$timestamp[1], x )
# Time difference of -11 hours
That's just one example of the issues caused by timezones. There are others, but I won't go into them here.
I don't want that behaviour, so I need to convince as.POSIXct not to mess with my timestamps. I generally do this by using tz="UTC", which works fine, except that I'm adding information to the data that isn't real. These times are NOT in UTC, I'm just saying that to avoid time-shift issues. It's a hack, and any time I give my data to someone else, they could be forgiven for thinking that the timestamps are in UTC when they're not. To avoid this, I generally add the actual timezone to the object/column name, and hope that anyone I pass my data on to will understand why someone would label an object with a timezone different to the one in the object itself:
df <- data.frame( timestamp.AET = as.POSIXct( c( "2018-01-01 03:00:00",
                                                 "2018-01-01 12:00:00" ),
                                              tz = "UTC" ),
                  a = 1:2 )
df$date <- as.Date( df$timestamp.AET )
df
#         timestamp.AET a       date
# 1 2018-01-01 03:00:00 1 2018-01-01
# 2 2018-01-01 12:00:00 2 2018-01-01
What I really want is a way to use POSIXct without having to specify a timezone. I don't want the times messed with in any way. Do everything as though the values were in UTC, and leave any timezone details like offsets, daylight savings, etc to the user. Just don't pretend they actually ARE in UTC. Here's my ideal:
x <- as.POSIXct( "2018-01-01 03:00:00" ); x
# [1] "2018-01-01 03:00:00"
attr( x, "tzone" )
# [1] NULL
shifted <- lubridate::force_tz( x, "UTC" )
shifted == x
# [1] TRUE
as.numeric( shifted ) == as.numeric( x )
# [1] TRUE
as.Date( x )
# [1] "2018-01-01"
So there's no timezone attribute on the object at all. The date conversion works as one would expect from the printed value. If there are daylight savings time-shifts, or any other locale-specific issues, the user (me or someone else) needs to deal with that themselves.
I believe something similar to this is possible in POSIXlt, but I really don't want to shift to that. chron or another timeseries-oriented package might be another solution, but I think POSIXct is more widely used and accepted, and this seems like something that should be possible within base::. A POSIXct object with tz="UTC" is exactly what I need, I just don't want to have to lie about timezones in order to get it to behave the way I want (and I believe most beginners to R expect).
So what do others do here? Is there an easy way to use POSIXct without a timezone that I've missed? Is there a better work-around than tz="UTC"? Is that what others are doing?
Having (re-)read your post and ensuing comments, I see your point.
To summarise:
as.POSIXct determines tz from your system. as.Date has default tz = "UTC" for class POSIXct. So unless you're in tz = "UTC", dates may change; the solution is to pass tz to as.Date, or to change the behaviour of as.Date.POSIXct (see update below).
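The mismatch can be reproduced on any machine by naming the zone explicitly instead of relying on the system setting. The following is a minimal sketch, assuming the "Australia/Sydney" zone is available in the local tz database. It passes tz to as.Date explicitly in both calls, so the result does not depend on the method's default.

```r
# Explicit zone so the result does not depend on the machine's locale
x <- as.POSIXct("2018-01-01 03:00:00", tz = "Australia/Sydney")

# The same instant is 2017-12-31 16:00:00 in UTC, so the UTC calendar date is a day earlier
date_utc <- as.Date(x, tz = "UTC")

# Converting in the object's own zone keeps the printed calendar date
date_local <- as.Date(x, tz = "Australia/Sydney")

format(date_utc)    # "2017-12-31"
format(date_local)  # "2018-01-01"
```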
If you don't specify an explicit tz with as.POSIXct, you can simply specify tz = "" with as.Date to enforce a system-specific timezone. 
df <- data.frame(
    timestamp = as.POSIXct(c("2018-01-01 03:00:00", "2018-01-01 12:00:00")),
    a = 1:2)
df$date <- as.Date(df$timestamp, tz = "")
df
#            timestamp a       date
#1 2018-01-01 03:00:00 1 2018-01-01
#2 2018-01-01 12:00:00 2 2018-01-01
If you do set an explicit tz with as.POSIXct, you can extract tz from the POSIXct object, and pass it on to as.Date
df <- data.frame(
    timestamp = as.POSIXct(c("2018-01-01 03:00:00", "2018-01-01 12:00:00"), tz = "UTC"),
    a = 1:2)
tz <- attr(df$timestamp, "tzone")
tz
#[1] "UTC"
df$date <- as.Date(df$timestamp, tz = tz)
df
#            timestamp a       date
#1 2018-01-01 03:00:00 1 2018-01-01
#2 2018-01-01 12:00:00 2 2018-01-01
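The extract-and-pass-on step can be wrapped in a small helper so it does not have to be repeated for every column. This is only a sketch: date_in_own_tz is a hypothetical name, not a base R or lubridate function, and it assumes that a missing tzone attribute should mean "use the system time zone".

```r
# Hypothetical helper (not part of base R): convert to Date in the object's own zone
date_in_own_tz <- function(x) {
    stopifnot(inherits(x, "POSIXct"))
    tz <- attr(x, "tzone")
    # No tzone attribute (e.g. Sys.time()) is treated as the system time zone
    if (is.null(tz)) tz <- ""
    as.Date(x, tz = tz[1L])
}

x <- as.POSIXct("2018-01-01 03:00:00", tz = "Australia/Sydney")
format(date_in_own_tz(x))  # "2018-01-01"
```

Unlike redefining a base method, a separately named helper leaves the behaviour of as.Date unchanged for other code running in the same session.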
There exists a related discussion on Dirk Eddelbuettel's anytime GitHub project site. The discussion turns out somewhat circular, so I'm afraid it does not offer too much in terms of understanding why as.Date.POSIXct does not inherit tz from POSIXct. I would probably call this a base R idiosyncrasy (or as Dirk calls it: "[T]hese are known quirks in Base R").
As for a solution: I would change the behaviour of as.Date.POSIXct rather than the default behaviour of as.POSIXct.
We could simply redefine as.Date.POSIXct to inherit tz from the POSIXct object.
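as.Date.POSIXct <- function(x, tz = attr(x, "tzone"), ...) {
    # Objects without a tzone attribute (e.g. Sys.time()) use the system time zone
    if (is.null(tz)) tz <- ""
    as.Date(as.POSIXlt(x, tz = tz[1L]))
}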
Then you get consistent results for your sample case:
df <- data.frame(
    timestamp = as.POSIXct(c("2018-01-01 03:00:00", "2018-01-01 12:00:00")),
    a = 1:2)
df$date <- as.Date(df$timestamp)
df
#            timestamp a       date
#1 2018-01-01 03:00:00 1 2018-01-01
#2 2018-01-01 12:00:00 2 2018-01-01
You basically want a different default for as.POSIXct than what is provided. You don't really want to modify anything except as.POSIXct.default, which is the function that will eventually handle character values. It wouldn't make much sense to modify as.POSIXct.numeric, since numeric input is always an offset from UTC. The tz argument only determines what format.POSIXct will display. So you can modify the formals list of the one you've been given. Put this in your .Rprofile:
 formals(as.POSIXct.default) <- alist(x=, ...=, tz="UTC")
Then it passes your tests:
> x <- as.POSIXct( "2018-01-01 03:00:00" ); x
[1] "2018-01-01 03:00:00 UTC"
> attr( x, "tzone" )
[1] "UTC"
> shifted <- lubridate::force_tz( x, "UTC" )
> shifted == x
[1] TRUE
> as.numeric( shifted ) == as.numeric( x )
[1] TRUE
> as.Date( x )
[1] "2018-01-01"
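On the point that tz only affects display for numeric input, here is a short sketch. It assumes the "Australia/Sydney" zone is available, and it supplies origin explicitly so that it also runs on older R versions that require it.

```r
secs <- 1514775600  # seconds since 1970-01-01 00:00:00 UTC

a <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
b <- as.POSIXct(secs, origin = "1970-01-01", tz = "Australia/Sydney")

as.numeric(a) == as.numeric(b)  # TRUE: both represent the same instant
format(a)  # "2018-01-01 03:00:00"
format(b)  # "2018-01-01 14:00:00"
```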
The alternative would be to define an entirely new class, but that would require much more extensive efforts.
A further point regards the specification of time zones. With the prevalence of daylight saving time, it may be less ambiguous to use the %z format on input (when possible) and on output:
dtm <- format( Sys.time(), format="%Y-%m-%d %H:%M:%S %z")
# output
format( Sys.time(), format="%Y-%m-%d %H:%M:%S %z")
# [1] "2018-07-06 17:18:27 -0700"
# input and output without the formals change
as.POSIXct(dtm, format="%Y-%m-%d %H:%M:%S %z")
# [1] "2018-07-06 17:21:41 PDT"
# after the formals change
as.POSIXct(dtm, format="%Y-%m-%d %H:%M:%S %z")
# [1] "2018-07-07 00:21:41 UTC"
So when tz information is present as an offset, it can be handled correctly.
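A fixed input string makes the same behaviour reproducible without Sys.time(). In this sketch the timestamp string is taken from the output above, and tz = "UTC" is passed explicitly so the example does not rely on the formals change.

```r
stamp <- "2018-07-06 17:18:27 -0700"

# %z reads the offset on input; tz only selects the zone used for display
x <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S %z", tz = "UTC")

format(x, "%Y-%m-%d %H:%M:%S %z")  # "2018-07-07 00:18:27 +0000"
```

Because the offset travels with the string, the parsed instant is the same regardless of the time zone of the machine that reads it.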