Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Fixed casting of utf8 <> Timestamp with and without timezone #376

Merged
merged 2 commits into from
Sep 5, 2021

Conversation

jorgecarleitao
Copy link
Owner

@jorgecarleitao jorgecarleitao commented Sep 3, 2021

This PR fixes multiple errors in cast timestamps with timezones to and from utf8. It also systematizes the rules.

This PR adds a new optional dependency, chrono-tz, that is used for calculations with timezones other than "+WX:YZ" and "UTC". Those timezones have offsets that vary in time (e.g. DST), which chrono-tz solves. When the feature is not active and the timezone is e.g. "London", the cast errors.

Utf8 in RFC3339 <-> Timestamp with timezone

These casts are lossless. Strings with different timezones are mapped accordingly to the target timezone via offsets (handled by chrono and chrono-tz).

Utf8 without timezone -> Timestamp with timezone

These casts result in nulls (on an element-by-element basis). There is no unique way of converting these. Users need to perform two casts to add a timezone to a string without a timezone: Utf8 -> Timestamp(_, None) -> Timestamp(_, Some(tz)), where the latter is O(1) and is where the user is explicit about the assumption that the time is in that timezone.

Utf8 -> Timestamp without timezone

These casts are null iff the string is not in RFC3339. Other formats should be handled by temporal_conversions::utf8_to_timestamp_ns and temporal_conversions::utf8_to_naive_timestamp_ns, whose one of the arguments is the format (e.g. "%Y-%m-%dT%H:%M:%S%.f%:z" for RFC3339).

Timestamp without timezone -> Utf8

It is printed in the ISO 8601 format without timezone information (e.g. 1996-12-19 16:39:57), which is the display offered by chrono::NaiveDateTime.

Displaying timestamps follows the same rules as timestamp -> utf8

Overall, this offers a more controlled environment over timezones.

@jorgecarleitao jorgecarleitao added the enhancement An improvement to an existing feature label Sep 3, 2021
@codecov
Copy link

codecov bot commented Sep 3, 2021

Codecov Report

Merging #376 (7663978) into main (3af5ffe) will decrease coverage by 0.03%.
The diff coverage is 81.53%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #376      +/-   ##
==========================================
- Coverage   81.18%   81.15%   -0.04%     
==========================================
  Files         330      330              
  Lines       21823    21915      +92     
==========================================
+ Hits        17717    17785      +68     
- Misses       4106     4130      +24     
Impacted Files Coverage Δ
src/compute/cast/primitive_to.rs 82.55% <53.57%> (-17.45%) ⬇️
src/temporal_conversions.rs 82.17% <82.97%> (+6.79%) ⬆️
src/compute/cast/mod.rs 88.67% <91.66%> (+<0.01%) ⬆️
src/compute/cast/utf8_to.rs 92.98% <100.00%> (+0.38%) ⬆️
tests/it/compute/cast.rs 99.30% <100.00%> (+0.07%) ⬆️
tests/it/temporal_conversions.rs 100.00% <100.00%> (ø)
src/io/json_integration/write.rs 6.25% <0.00%> (+6.25%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3af5ffe...7663978. Read the comment docs.

@jorgecarleitao jorgecarleitao changed the title Fixed casting of utf8 to Timestamp with and without timezone Fixed casting of utf8 <> Timestamp with and without timezone Sep 4, 2021
@jorgecarleitao jorgecarleitao force-pushed the fix_tz branch 6 times, most recently from 3a41d45 to 7d04efd Compare September 4, 2021 08:12
@jorgecarleitao jorgecarleitao marked this pull request as ready for review September 4, 2021 08:23
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement An improvement to an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant