-
Notifications
You must be signed in to change notification settings - Fork 991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Joining tables based on time with different time-zones - no warning #4773
Comments
Thank you for reporting. It could be useful to know how data.frame behaves on that matter as well. |
FWIW, do recall that timezones are somewhat "artificial" in the sense that they are "only" a printing attribute. Once the class of the object is So IINM the join will be done correctly (in that observations at identical points in time [i.e., identical displacements from epoch time] will be matched even if they occurred in different time zones). Coercion / ignoring attributes is a bit more of a general issue for |
The trouble here is that the R> dt_b <- data.table(time = seq.POSIXt(from = ymd_hms("2020-10-20 06:00:00"), to = ymd_hms("2020-10-20 10:00:00"), by = "1 hour"), value = 1:5)
R> dt_b[,time]
[1] "2020-10-20 06:00:00 UTC" "2020-10-20 07:00:00 UTC" "2020-10-20 08:00:00 UTC" "2020-10-20 09:00:00 UTC" "2020-10-20 10:00:00 UTC"
R> as.numeric(dt_b[, time])
[1] 1603173600 1603177200 1603180800 1603184400 1603188000
R> dt_b[, time := lubridate::force_tz(time, tzone = "CET", roll = F)]
R> dt_b[,time]
[1] "2020-10-20 06:00:00 CEST" "2020-10-20 07:00:00 CEST" "2020-10-20 08:00:00 CEST" "2020-10-20 09:00:00 CEST" "2020-10-20 10:00:00 CEST"
R> as.numeric(dt_b[, time])
[1] 1603166400 1603170000 1603173600 1603177200 1603180800
!R> dt_a[dt_b, on = .(time), value_b := i.value]
R> So here the actual Computing with dates and times is tricky. Doubly so in the presence of timezones. Most 'professional' work I have seen in many years settles on one TZ, often UTC or whereever the head office is. |
@jangorecki the data.frame's way result is same as data.table's...so this is not really Interesting problem here, when time is used as string and when as numeric, and nobody knows what is used when...for example during extraction of hour attribute the string is used:
|
I can't understand this issue, frankly speaking. As @eddelbuettel commented, library(data.table)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:data.table':
#>
#> hour, isoweek, mday, minute, month, quarter, second, wday, week,
#> yday, year
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
dt_a <- data.table(
time = seq.POSIXt(from = ymd_hms("2020-10-20 06:00:00", tz = "UTC"),
to = ymd_hms("2020-10-20 10:00:00", tz = "UTC"),
by = "1 hour"),
value = 1:5
)
dt_a
#> time value
#> 1: 2020-10-20 06:00:00 1
#> 2: 2020-10-20 07:00:00 2
#> 3: 2020-10-20 08:00:00 3
#> 4: 2020-10-20 09:00:00 4
#> 5: 2020-10-20 10:00:00 5
dt_b <- copy(dt_a)
dt_b[, time := lubridate::with_tz(time, tzone = "Asia/Shanghai")]
dt_b
#> time value
#> 1: 2020-10-20 14:00:00 1
#> 2: 2020-10-20 15:00:00 2
#> 3: 2020-10-20 16:00:00 3
#> 4: 2020-10-20 17:00:00 4
#> 5: 2020-10-20 18:00:00 5
dt_a[dt_b,
on = .(time),
value_b := i.value]
dt_a
#> time value value_b
#> 1: 2020-10-20 06:00:00 1 1
#> 2: 2020-10-20 07:00:00 2 2
#> 3: 2020-10-20 08:00:00 3 3
#> 4: 2020-10-20 09:00:00 4 4
#> 5: 2020-10-20 10:00:00 5 5 Created on 2020-10-24 by the reprex package (v0.3.0) |
I think we can close this issue. We should not expect posixct to merge to shifted posixct. They should be aligned before merging. |
Hello,
I caught myself at joining two data.tables based on timestamp, but with different time-zones (unfortunately).
The joined values were automatically shifted based on first data.table's time-zone...but without warning - or anything else.
Is it possible to throw warning if something like that is happening?
Example:
Result:
Thanks
The text was updated successfully, but these errors were encountered: