-
Notifications
You must be signed in to change notification settings - Fork 977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possible bug / misfeature in POSIXct creation #1619
Comments
I'm able to reproduce this only on Linux (tested only on Ubuntu 14.04). On my Mac, and Windows 10 (run virtually), I get print(dt[4, tzTime])
# [1] "2016-03-23 09:30:00 CDT" The problem (in Linux) seems to come from this line in as.POSIXct(as.POSIXlt(date), tz = tz) + x Specifically, as.POSIXct(as.POSIXlt(dt$date), tz = "America/Chicago")
# On Ubuntu
# [1] "2016-03-02 00:00:00 CST" "2016-03-09 00:00:00 CST"
# [3] "2016-03-16 01:00:00 CDT" "2016-03-23 01:00:00 CDT"
# On Mac
# [1] "2016-03-02 CST" "2016-03-09 CST" "2016-03-16 CDT" "2016-03-23 CDT" This is likely the source of the issue. Not sure how to fix it though. Will have to understand why this difference now. Let me know if I'm completely off point here. Haven't worked much with dates/times. |
Paging @joshuaulrich ... It looks like we mess up a DST correction for no good reason. I can't recall when I last ran into that. Hm. |
That last example also happens in plain base R without as.POSIXct(as.POSIXlt(as.Date("2016-03-02") + (0:3)*7), tz = "America/Chicago") I think @joshuaulrich once concluded to never use dates as |
Well, especially if time zones are important for the task at hand |
@eddelbuettel It was the other way around. If you only have daily data, you should avoid This looks like it may be a much deeper issue. |
My second example does indeed suggest that something may be rotten in the state of Denmark. Somehow I feel I encountered this before but I cannot think of a fix. Then again I pretty religiously stuck with dt[, goodTime := as.POSIXct(paste(date, time))] is as good as it gets here. Hmpf. |
I think I might have it. Looking more closely at
and
I am still not quite sure I understand exactly what they try to say here but it seems that we want flag value different from zero for And indeed: R> plt <- as.POSIXlt(as.Date("2016-03-02") + (0:3)*7, TZ = "America/Chicago")
R> as.POSIXct(plt, tz = "America/Chicago") ## as before
[1] "2016-03-02 00:00:00 CST" "2016-03-09 00:00:00 CST" "2016-03-16 01:00:00 CDT" "2016-03-23 01:00:00 CDT"
R>
R> plt[3]$isdst <- 1 ## now for one of the two 'wrong ones' set the flag
R> as.POSIXct(plt, tz = "America/Chicago")
[1] "2016-03-02 00:00:00 CST" "2016-03-09 00:00:00 CST" "2016-03-16 00:00:00 CDT" "2016-03-23 01:00:00 CDT"
R> Now we get midnight local rather than the DST change. |
So when I set this, things work out: as.POSIXct.ITime <- function (x, tz = "UTC", date = as.Date(Sys.time()), ...) {
if (missing(date) && any(class(tz) %in% c("Date", "IDate", "POSIXt", "dates"))) {
date <- tz
tz <- "UTC"
}
d <- as.POSIXlt(date)
d$isdst <- 1 # maybe on Linux only?
as.POSIXct(d, tz = tz) + x
} New code: Sys.setenv("TZ"="America/Chicago")
library(data.table)
dt <- data.table(date=as.IDate("2016-03-02") + (0:3)*7, # four Wednesdays
time=as.ITime("09:30:00"))
dt[, pTime := as.POSIXct(x=time, date=date)]
dt[, tzTime := as.POSIXct(x=time, date=date, tz="America/Chicago")]
dt[, goodTime := as.POSIXct(paste(date, time))] # tz implicit from R session
as.POSIXct.ITime <- function (x, tz = "UTC", date = as.Date(Sys.time()), ...) {
if (missing(date) && any(class(tz) %in% c("Date", "IDate", "POSIXt", "dates"))) {
date <- tz
tz <- "UTC"
}
d <- as.POSIXlt(date)
d$isdst <- 1 # maybe on Linux only?
as.POSIXct(d, tz = tz) + x
}
## data.table conversion, now with dst correction
dt[, newGoodTime := as.POSIXct(x=time, date=date, tz="America/Chicago")]
print(dt)
## BUT
print(dt[4, pTime]) # how can I have this be local time?
print(dt[4, tzTime]) # how can I have this be correct?
##
print(dt[4, goodTime])
print(dt[4, newGoodTime]) # better! which produces edd@max:~/src/progs/R(master)$ Rscript datatableBug.R
date time pTime tzTime goodTime newGoodTime
1: 2016-03-02 09:30:00 2016-03-02 09:30:00 2016-03-02 09:30:00 2016-03-02 09:30:00 2016-03-02 09:30:00
2: 2016-03-09 09:30:00 2016-03-09 09:30:00 2016-03-09 09:30:00 2016-03-09 09:30:00 2016-03-09 09:30:00
3: 2016-03-16 09:30:00 2016-03-16 09:30:00 2016-03-16 10:30:00 2016-03-16 09:30:00 2016-03-16 09:30:00
4: 2016-03-23 09:30:00 2016-03-23 09:30:00 2016-03-23 10:30:00 2016-03-23 09:30:00 2016-03-23 09:30:00
[1] "2016-03-23 09:30:00 UTC"
[1] "2016-03-23 10:30:00 CDT"
[1] "2016-03-23 09:30:00 CDT"
[1] "2016-03-23 09:30:00 CDT"
edd@max:~/src/progs/R(master)$ |
@eddelbuettel great! Could try this patch for Linux. Thanks! I don't follow the explanation in |
I settled for this now: toPOSIXct <- function(date, time, tz=Sys.getenv("TZ", unset="America/Chicago")) {
lt <- as.POSIXlt(date)
lt$isdst <- -1 # maybe on Linux only?
as.POSIXct(lt, tz=tz) + time
} Can you play on Windows and OS X (where I have access to neither) to see if setting the |
Already tried on OS X. |
I wouldn't change the underlying elements of a |
@arunsrinivasan Yes yes yes -- I mentioned to Matt that maybe we want complement @joshuaulrich Did you catch my post earlier in this thread demonstrating a bug in conversion that happens with base R at least on Linux. It is about avoiding that base R bug... |
@eddelbuettel I was the one who said it was a deeper issue, either in base R or something in the system functions. My comment was that it seems fragile to avoid that bug by creating and converting an invalid datetime object. The safest thing to do is convert the date and time to string, concatenate, then call |
@joshuaulrich Gosh no. Avoid the conversion to/from text at all times. At the time of construction we have valid |
I guess the other part is whether enforcing So there may be a In short, I am really not sure what the specification would be. But the conversion from having the |
The issue is the difference in how that object is coerced to |
I am with on you that for the first part. As for the second part shouldn't |
Also from
Maybe having |
@arunsrinivasan What is your thinking whether this is an R bug for not setting Shall we prepare something for R-devel highlighting inconsistent behaviour across OSs which cannot really be the desired goal? |
@eddelbuettel yes (I believe it's a bug) and yes. Here's the summary: Sys.setenv("TZ"="America/Chicago")
dates = as.Date("2016-03-02") + (0:3)*7 # four Wednesdays
# on OS X and windows 10 -- expected result
as.POSIXct(as.POSIXlt(dates), tz = "America/Chicago")
# [1] "2016-03-02 CST" "2016-03-09 CST" "2016-03-16 CDT" "2016-03-23 CDT"
# On Linux (tested on Ubuntu 14.04) -- not as expected
as.POSIXct(as.POSIXlt(dates), tz = "America/Chicago")
# [1] "2016-03-02 00:00:00 CST" "2016-03-09 00:00:00 CST" "2016-03-16 01:00:00 CDT" "2016-03-23 01:00:00 CDT"
# isdst attribute is identical on OS X / Windows / Ubuntu
lt = as.POSIXlt(dates)
lt$isdst
# [1] 0 0 0 0
# However, setting isdst to -1 on Ubuntu returns expected results
lt$isdst = -1
as.POSIXct(lt, tz = "America/Chicago")
# [1] "2016-03-02 CST" "2016-03-09 CST" "2016-03-16 CDT" "2016-03-23 CDT"
# setting isdst to -1 on OS X / Windows has no effect, i.e., the result is always as expected. |
@arunsrinivasan 100% on board. Do you want to bring this to r-devel? |
@eddelbuettel great. Yes, I'll post this summary on r-devel. |
I can confirm that this is indeed a bug in base R. What day and time is it? Depends on the operating system! init_date <- as.POSIXct("2017-11-01", tz="America/New_York")
init_date
init_date_lt <- as.POSIXlt(init_date)
init_date_lt$mon <- init_date_lt$mon + 1 ## add 1 month
init_date_lt
as.POSIXct(init_date_lt) ## the answer depends on the OS
## [1] "2017-11-30 23:00:00 EST" RHEL 7; R 3.3.3
## [1] "2017-12-01 EST" WINDOWS 7; R 3.6.1 |
thanks @crossxwill, could you please add this to the bugzilla report /linked above? |
I don't have a Bugzilla account. R-Core closed it down for n00bs like me. |
You get one by asking on the mailing list. The step requires a human as the previous setup was outsmarted by spam bots -- see https://www.r-project.org/bugs.html |
Assuming that fixing this bug in base R will not require any extra adjustment in data.table I believe we are safe to close this issue? |
I am using data.table to hold/slide/dice some data which I then combine with data help in xts objects. Those work pretty reliably in local time with their timezone.
Now when comparing some time intervals with what I get in data.table, I think I arrived at a "misfeature".
Here is simple example script:
When I run this, I get the following output:
The
ptime
column looks right, but when I compare to an xts object I am off by hours because ... silently UTC persisted. Not good.The
tzTime
column is correct, but only half the time. After DST we are off an hour. Also not good.The third option works but that is AFAIK not the you document it, and possibly the least performant as we need to reparse from string.
Am I being silly or is this something that needs a nudge and correction?
The text was updated successfully, but these errors were encountered: