IsAdjustedToUtc flag not used while reading DeltaTable #1598
Comments
Ah, just ran into this myself. I thought UTC timestamps were supported, so I did:
utc_tz = cs.datetime(time_zone="UTC")
non_utc_tz = cs.datetime() - utc_tz
df = df.with_columns(non_utc_tz.cast(pl.Datetime("us")), utc_tz.cast(pl.Datetime("us", "UTC")), cs.categorical().cast(pl.Utf8))
That adjusted my "ns" timestamp columns, and I was hoping to leave the UTC timestamps intact. Generally I save data with both a naive version (localized to whatever time zone the client is using, with the TZ info stripped) and a UTC version. My understanding is that at least UTC timestamps are supported, just not by the delta-rs library currently?
So, write support is there: the Parquet files are written properly with UTC timestamps. But the flag is not reused in the schema when reading. I can try to take a look at the issue soon, but I want to work on some other stuff first. Also, Polars temporarily always casts to a non-UTC datetime until there is a fix in delta-rs.
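To make the read-side problem concrete, here is a minimal standard-library sketch of what honoring the flag could look like. It is not delta-rs code: `timezone_from_flag` and `decode_micros` are hypothetical helpers, and the only assumption is that the stored value is microseconds since the Unix epoch, with the flag saying whether that value is a UTC instant.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timezone_from_flag(is_adjusted_to_utc: bool) -> Optional[str]:
    """Hypothetical helper: pick the time zone to put in the read schema.

    Flag set   -> values are UTC instants, so the schema should say "UTC".
    Flag unset -> values are wall-clock times, so the schema has no time zone.
    """
    return "UTC" if is_adjusted_to_utc else None


def decode_micros(value_us: int, is_adjusted_to_utc: bool) -> datetime:
    """Hypothetical helper: decode a stored microsecond value using the flag."""
    instant = EPOCH_UTC + timedelta(microseconds=value_us)
    if is_adjusted_to_utc:
        return instant  # timezone-aware, UTC
    return instant.replace(tzinfo=None)  # naive, no time zone attached


stored = 1_700_000_000_000_000  # illustrative value, not taken from the issue
print(timezone_from_flag(True), decode_micros(stored, True).isoformat())
print(timezone_from_flag(False), decode_micros(stored, False).isoformat())
```

The bug described here corresponds to always taking the second branch on read, even when the file says the values are UTC-adjusted.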
# Description
- This addresses all our timestamp inconsistencies, where we were reading Primitive::timestamp as a datetime without UTC, and now we can properly write datetimes with no timezone as columns to Primitive::timestampNtz.
- Addresses a small bug where the checkConstraints feature was not set in writerFeatures when you are on table writer version 7.
- Bumps the default protocol to 3,7.
- Makes the pyarrow writer and reader more flexible so we can write/read a 3,7 table as long as it has the supported features there.
- Properly parses timestamps with UTC into pyarrow timestamps with UTC.
- Adds config key translation to table feature inside the Create operation.

# Related Issue(s)
- closes #1598
- closes #1019
- closes #1777
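The split between the two primitive types in the description can be sketched on the write side as follows. This is an illustration under assumptions, not the PR's implementation: `to_delta_timestamp` is a hypothetical helper, and the strings `"timestamp"` and `"timestamp_ntz"` are used here as the type names for the UTC-adjusted and no-timezone variants.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def to_delta_timestamp(value: datetime) -> Tuple[str, datetime]:
    """Hypothetical helper: choose a primitive type for a Python datetime.

    Aware datetimes are normalized to UTC and stored as "timestamp";
    naive datetimes are kept as-is and stored as "timestamp_ntz".
    """
    if value.tzinfo is not None and value.utcoffset() is not None:
        return "timestamp", value.astimezone(timezone.utc)
    return "timestamp_ntz", value


# An aware value at UTC+02:00 and a naive value with the same wall-clock time.
aware = datetime(2024, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
naive = datetime(2024, 1, 1, 12, 0)

print(to_delta_timestamp(aware))
print(to_delta_timestamp(naive))
```

Keeping the two cases as separate types means the reader never has to guess whether a stored value was a UTC instant or a local wall-clock time.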
Environment
Delta-rs version: 0.10.1
Binding: Python
Environment:
Bug
What happened:
When you write a PyArrow table with UTC timezone datetimes to a Delta table, the timezone information gets removed in the final table schema. See example below:
What you expected to happen:
Maintain UTC information by looking at the IsAdjustedToUTC flag in the parquet file and passing this back into the schema while reading to arrow. The information is there because when you read the partition directly with Polars, the timezone information is read as UTC:
How to reproduce it:
More details: