
Option to keep default date columns as TIMESTAMP_NTZ in Oracle to Snowflake migration #1492

Closed
dat-a-man opened this issue Jun 19, 2024 · 2 comments · Fixed by #1669

@dat-a-man
Collaborator

Feature description

Community user: When migrating data from Oracle to Snowflake, I would like the ability to default date columns to TIMESTAMP_NTZ instead of TIMESTAMP_TZ.

Are you a dlt user?

None

Use case

When migrating data from Oracle to Snowflake, the default behavior sets date columns to TIMESTAMP_TZ. In many cases, it is more desirable to use TIMESTAMP_NTZ to avoid timezone-related complications and align with existing data standards in Snowflake.

Proposed solution

  • Provide an option to set the default timestamp type to TIMESTAMP_NTZ during the migration process.
  • Alternatively, allow users to specify the desired timestamp type in the configuration.
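The configuration-driven option could look something like this minimal sketch. The function and constant names (`resolve_snowflake_timestamp_type`, `DEFAULT_TIMESTAMP_TYPE`) are hypothetical, not dlt's actual API: a per-column timezone hint overrides a configurable destination-wide default.

```python
# Hypothetical sketch of config-driven timestamp type resolution.
# A per-column timezone hint (True/False/None) overrides a
# configurable destination-wide default type.
from typing import Optional

DEFAULT_TIMESTAMP_TYPE = "TIMESTAMP_NTZ"  # e.g. read from config.toml

def resolve_snowflake_timestamp_type(
    timezone_hint: Optional[bool],
    default_type: str = DEFAULT_TIMESTAMP_TYPE,
) -> str:
    """Pick the Snowflake timestamp type for one column."""
    if timezone_hint is True:
        return "TIMESTAMP_TZ"
    if timezone_hint is False:
        return "TIMESTAMP_NTZ"
    # no explicit hint on the column: fall back to the configured default
    return default_type
```

With `DEFAULT_TIMESTAMP_TYPE` set to `"TIMESTAMP_NTZ"`, unhinted date columns would land as TIMESTAMP_NTZ, which is the behavior this issue asks for.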

Related issues

No response

@rudolfix rudolfix moved this from Todo to Planned in dlt core library Jul 8, 2024
@donotpush donotpush self-assigned this Jul 17, 2024
@rudolfix rudolfix moved this from Planned to In Progress in dlt core library Jul 17, 2024
@rudolfix
Collaborator

dlt does not support timestamps without timezones, and to be able to pass such timestamps from Oracle to Snowflake we need to extend the core library first.

  1. We'd need a new hint in TColumnType that says whether data_type contains a timezone or not (or defines the time zone as a string, defaulting to "UTC"). See the precision and scale hints already there; the idea is the same.
  2. Each of our destinations has a type mapper, which needs to be able to interpret this new hint.
  3. pyarrow.py contains functions that convert a dlt schema into an arrow schema and vice versa; those functions need to be upgraded.
  4. We'll need plenty of tests: loading non-tz-aware datetimes via json and parquet into all possible destinations.
  5. Let's make it work for duckdb + postgres + snowflake first.

Note: I didn't research how tz-aware timestamps are stored in our destinations. I think the timezone is just a metadata flag on the column and the values are stored as regular floats...
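Steps 1 and 2 above could be sketched roughly as follows. This is illustrative only, not dlt's real code: a hypothetical `timezone` hint on a `TColumnType`-like TypedDict (mirroring the existing precision/scale hints), plus a Snowflake-style type mapper that interprets it.

```python
# Sketch of steps 1-2: a hypothetical `timezone` hint on a
# TColumnType-like TypedDict, and a destination type mapper that
# interprets it. Names are illustrative, not dlt's actual code.
from typing import Optional, TypedDict

class TColumnType(TypedDict, total=False):
    data_type: str
    precision: Optional[int]   # existing hint
    scale: Optional[int]       # existing hint
    timezone: Optional[bool]   # new hint; absent/None = destination default

def snowflake_timestamp_type(column: TColumnType) -> str:
    # Snowflake currently gets TIMESTAMP_TZ by default; the new hint
    # lets a column opt out of timezone handling.
    if column.get("timezone") is False:
        return "TIMESTAMP_NTZ"
    return "TIMESTAMP_TZ"
```

For step 3, the same hint would have to round-trip through the arrow schema conversion, e.g. mapping a naive arrow timestamp (no `tz`) to `timezone: False` and back.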

@donotpush
Collaborator

donotpush commented Jul 31, 2024

@dat-a-man I am investigating a solution to change the destination timestamp types.

I have two questions about your case.

(1) Source (Oracle): what are you using for timestamps? TIMESTAMP (timezone not included) or TIMESTAMP WITH TIME ZONE (timezone included)?

(2) Destination (Snowflake): which timezone are you seeing in your replicated data? Is it always +00:00?

Thanks!
