wr.athena.to_iceberg - Insert query has mismatched column types #2678
Labels: bug (Something isn't working)
Comments
Hi @Mroq93, thanks for opening this - looking into it.
Hello @kukushking, do you have any news or updates regarding the bug that we discussed earlier?
Just need to change this line:
GalVishi added a commit to GalVishi/aws-sdk-pandas that referenced this issue (Mar 10, 2024)
LeonLuttenberger pushed a commit that referenced this issue (Mar 11, 2024)
Hi @jaidisido @GalvFionic,
Describe the bug
I am trying to save several DataFrames to an Iceberg table using wr.athena.to_iceberg.
The first few incremental writes succeed, but after some iterations I get this error:
TYPE_MISMATCH: Insert query has mismatched column types: Table: [varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, timestamp(6), varchar, varchar, varchar, varchar, varchar, varchar, varchar], Query: [varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, varchar, timestamp(3), varchar, varchar, varchar, varchar, varchar, varchar]. If a data manifest file was generated at 's3://bucket-temp/athena/results/c0c18807-2773-4afc-b95c-580034d960ed-manifest.csv', you may need to manually clean the data from locations specified in the manifest. Athena will not delete data in your account.
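I see 2 differences between the schemas: the timestamp precision (timestamp(6) in the table, timestamp(3) in the query) and the number of columns.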
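The differences can also be read off the error message mechanically instead of by eye. Below is a minimal sketch using only the Python standard library. The helper names (`split_types`, `parse_type_mismatch`, `describe_mismatch`) are hypothetical and not part of awswrangler, and the sketch assumes the message keeps the `Table: [...], Query: [...]` layout shown above.

```python
import re


def split_types(type_list):
    """Split a comma-separated type list, ignoring commas inside parentheses
    (so a type such as decimal(10, 2) stays in one piece)."""
    parts, current, depth = [], [], 0
    for ch in type_list:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_type_mismatch(message):
    """Return (table_types, query_types) extracted from the error text."""
    match = re.search(r"Table: \[(.*?)\], Query: \[(.*?)\]", message)
    if match is None:
        raise ValueError("message does not contain 'Table: [...], Query: [...]'")
    return split_types(match.group(1)), split_types(match.group(2))


def describe_mismatch(message):
    """List the column-count difference and every positional type difference."""
    table_types, query_types = parse_type_mismatch(message)
    differences = []
    if len(table_types) != len(query_types):
        differences.append(
            f"column count: table has {len(table_types)}, "
            f"query has {len(query_types)}"
        )
    # zip stops at the shorter list, so only positions present in both are compared.
    pairs = zip(table_types, query_types)
    for position, (table_type, query_type) in enumerate(pairs, start=1):
        if table_type != query_type:
            differences.append(
                f"position {position}: table {table_type}, query {query_type}"
            )
    return differences


if __name__ == "__main__":
    # Shortened, made-up message in the same layout as the real error.
    example = (
        "TYPE_MISMATCH: Insert query has mismatched column types: "
        "Table: [varchar, timestamp(6), varchar], Query: [varchar, timestamp(3)]."
    )
    for line in describe_mismatch(example):
        print(line)
```

The comparison is purely positional. If a column is missing from the middle of the query rather than the end, every later position shifts and may be reported as different too.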
Before saving, I cast the DataFrame timestamp columns to the same format to make sure that every timestamp is aligned.
When I print this timestamp column, its format is the same for every DataFrame.
In the dtype argument of wr.athena.to_iceberg, I specify the type of the timestamp column as timestamp; I cannot specify a precision because it is not supported.
I am not sure whether the different number of columns should be an issue. I guess it was resolved here:
#2616
PS. Does the order of columns matter?
How to Reproduce
Expected behavior
No issues with timestamp precision
No issues when saving DataFrames with different schemas (missing or additional columns)
Your project
No response
Screenshots
No response
OS
AWS
Python version
3.9
AWS SDK for pandas version
3.5.2
Additional context
No response