Write_delta doesn't cast datetime columns to us precision #10154

Closed
2 tasks done
ion-elgreco opened this issue Jul 28, 2023 · 3 comments · Fixed by #10165
Labels
bug Something isn't working python Related to Python Polars

Comments

@ion-elgreco
Contributor

ion-elgreco commented Jul 28, 2023

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = pl.from_repr("""┌─────────────────┬───────────────────┬───────┬────────────────────────────────┐
│ source_actor_id ┆ source_channel_id ┆ ident ┆ timestamp                      │
│ ---             ┆ ---               ┆ ---   ┆ ---                            │
│ u32             ┆ i64               ┆ str   ┆ datetime[ns]                   │
╞═════════════════╪═══════════════════╪═══════╪════════════════════════════════╡
│ 123456780       ┆ 9876543210        ┆ a:b:c ┆ 2023-03-25 10:56:59.663053     │
│ 803065983       ┆ 2055938745        ┆ x:y:z ┆ 2023-03-25 12:38:18.050545     │
└─────────────────┴───────────────────┴───────┴────────────────────────────────┘""")

df.write_delta('test', mode='append')
Exception: Schema error: Invalid data type for Delta Lake: Timestamp(Nanosecond, None)

Issue description

The Delta primitive type timestamp only supports microsecond ("us") precision, so writing a datetime[ns] column fails with the schema error above.

Expected behavior

Polars should cast all datetime columns to 'us' precision so that they are compatible with the Delta primitive types.

Installed versions

--------Version info---------
Polars:              0.18.9
Index type:          UInt32
Platform:            Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python:              3.11.4 (main, Jul 16 2023, 15:13:18) [GCC 11.3.0]

----Optional dependencies----
adbc_driver_sqlite:  <not installed>
cloudpickle:         <not installed>
connectorx:          <not installed>
deltalake:           0.10.1
fsspec:              2023.6.0
matplotlib:          <not installed>
numpy:               1.25.1
pandas:              2.0.3
pyarrow:             12.0.1
pydantic:            <not installed>
sqlalchemy:          <not installed>
xlsx2csv:            <not installed>
xlsxwriter:          <not installed>
@ion-elgreco ion-elgreco added bug Something isn't working python Related to Python Polars labels Jul 28, 2023
@ion-elgreco
Contributor Author

Actually, this issue is already tracked in delta-rs; I'll try to fix it there.

delta-io/delta-rs#1467

@ion-elgreco
Contributor Author

@stinodego Reopening it actually, I have given it some thought. Polars should not submit a schema override that contains dtypes that are incompatible with the delta primitive types.

@ion-elgreco ion-elgreco reopened this Jul 29, 2023
@ion-elgreco
Contributor Author

@stinodego
image
Almost done with a working solution; I just need to extend it to Structs as well.
