-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-17388][SQL] Support for inferring type date/timestamp/decimal for partition column #14947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Some tests might be failed due to #14919. |
|
Test build #64893 has finished for PR 14947 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, decimal is not being inferred before.
|
Hi @davies , it seems you made some changes related with this before. Could you please take a look? |
|
Test build #65219 has finished for PR 14947 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think I should check this requirement. I will update the description soon too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked and I added some more end-to-end tests.
|
Test build #65234 has finished for PR 14947 at commit
|
|
ping @davies |
|
gentle ping @davies |
|
@davies I can just remove the decimal change here if you are uncertain of this. |
|
ping @davies .. |
|
LGTM |
|
Thanks @davies. (I forgot to push the commits.) |
e9dea77 to
bd5a63d
Compare
|
Test build #67131 has finished for PR 14947 at commit
|
|
Merging this into master, thanks! |
… for partition column ## What changes were proposed in this pull request? Currently, Spark only supports to infer `IntegerType`, `LongType`, `DoubleType` and `StringType`. `DecimalType` is being tried but it seems it never infers type as `DecimalType` as `DoubleType` is being tried first. Also, it seems `DateType` and `TimestampType` could be inferred. As far as I know, it is pretty common to use both for a partition column. This PR fixes the incorrect `DecimalType` try and also adds the support for both `DateType` and `TimestampType` for inferring partition column type. ## How was this patch tested? Unit tests in `ParquetPartitionDiscoverySuite`. Author: hyukjinkwon <gurwls223@gmail.com> Closes apache#14947 from HyukjinKwon/SPARK-17388.
… for partition column ## What changes were proposed in this pull request? Currently, Spark only supports to infer `IntegerType`, `LongType`, `DoubleType` and `StringType`. `DecimalType` is being tried but it seems it never infers type as `DecimalType` as `DoubleType` is being tried first. Also, it seems `DateType` and `TimestampType` could be inferred. As far as I know, it is pretty common to use both for a partition column. This PR fixes the incorrect `DecimalType` try and also adds the support for both `DateType` and `TimestampType` for inferring partition column type. ## How was this patch tested? Unit tests in `ParquetPartitionDiscoverySuite`. Author: hyukjinkwon <gurwls223@gmail.com> Closes apache#14947 from HyukjinKwon/SPARK-17388.
What changes were proposed in this pull request?
Currently, Spark only supports to infer
IntegerType,LongType,DoubleTypeandStringType.DecimalTypeis being tried but it seems it never infers type asDecimalTypeasDoubleTypeis being tried first. Also, it seemsDateTypeandTimestampTypecould be inferred.As far as I know, it is pretty common to use both for a partition column.
This PR fixes the incorrect
DecimalTypetry and also adds the support for bothDateTypeandTimestampTypefor inferring partition column type.How was this patch tested?
Unit tests in
ParquetPartitionDiscoverySuite.