@HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Sep 3, 2016

What changes were proposed in this pull request?

Currently, Spark only supports inferring IntegerType, LongType, DoubleType and StringType for partition columns.

DecimalType is attempted, but it seems it is never actually inferred because DoubleType is tried first. Also, it seems DateType and TimestampType could be inferred.

As far as I know, it is pretty common to use both date and timestamp values in a partition column.

This PR fixes the incorrect DecimalType attempt and also adds support for inferring both DateType and TimestampType as partition column types.
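The inference order described above can be illustrated with a minimal sketch. It is written in plain Python rather than Spark's Scala and is not the code from this patch. The function names, the exact fallback order, the rule that decimal is only chosen for whole numbers too large for a 64-bit integer, and the accepted date and timestamp formats are all assumptions made for illustration.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1
MAX_DECIMAL_PRECISION = 38  # Spark's DecimalType supports at most 38 digits


def _try_integral(raw):
    # 32-bit range first, then 64-bit range.
    n = int(raw)
    if INT_MIN <= n <= INT_MAX:
        return ("IntegerType", n)
    if LONG_MIN <= n <= LONG_MAX:
        return ("LongType", n)
    raise ValueError("does not fit in 64 bits")


def _try_decimal(raw):
    # Assumption of this sketch: only whole numbers become decimals,
    # so that values such as "1.5" still fall through to double.
    d = Decimal(raw)
    _, digits, exponent = d.as_tuple()
    if not d.is_finite() or exponent < 0:
        raise ValueError("not a whole finite number")
    if len(digits) + exponent > MAX_DECIMAL_PRECISION:
        raise ValueError("precision too large")
    return ("DecimalType", d)


def _try_double(raw):
    return ("DoubleType", float(raw))


def _try_date(raw):
    # Assumed format: yyyy-mm-dd
    return ("DateType", datetime.strptime(raw, "%Y-%m-%d").date())


def _try_timestamp(raw):
    # Assumed format: yyyy-mm-dd hh:mm:ss
    return ("TimestampType", datetime.strptime(raw, "%Y-%m-%d %H:%M:%S"))


# Decimal is tried before double so that it is reachable at all.
PARSERS = (_try_integral, _try_decimal, _try_double, _try_date, _try_timestamp)


def infer_partition_value(raw):
    """Return (type_name, value) from the first parser that accepts raw."""
    for parse in PARSERS:
        try:
            return parse(raw)
        except (ValueError, InvalidOperation):
            continue
    return ("StringType", raw)  # final fallback
```

With this sketch, a directory name such as `date=2016-09-03` would yield a date value for the `date` column instead of a string, while anything no parser accepts stays a string.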

How was this patch tested?

Unit tests in ParquetPartitionDiscoverySuite.

@HyukjinKwon
Member Author

Some tests might fail due to #14919.

@SparkQA

SparkQA commented Sep 3, 2016

Test build #64893 has finished for PR 14947 at commit b7aa3a3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member Author

@HyukjinKwon HyukjinKwon Sep 6, 2016


IIUC, decimal was not being inferred before.
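A small sketch of why a decimal attempt placed after a double attempt never wins: any whole-number string that a decimal parser accepts is also accepted by the double parser, so the fallback stops there. This is plain Python for illustration, not the Spark code under review, and `first_success`, `OLD_ORDER` and `NEW_ORDER` are hypothetical names.

```python
from decimal import Decimal, InvalidOperation


def first_success(raw, parsers):
    """Return the name of the first parser that accepts raw."""
    for name, parse in parsers:
        try:
            parse(raw)
            return name
        except (ValueError, InvalidOperation):
            continue
    return "StringType"


# Hypothetical orderings: double before decimal, and decimal before double.
OLD_ORDER = (("DoubleType", float), ("DecimalType", Decimal))
NEW_ORDER = (("DecimalType", Decimal), ("DoubleType", float))

raw = "100000000000000000001"  # too large for a 64-bit integer

old_result = first_success(raw, OLD_ORDER)  # double shadows decimal
new_result = first_success(raw, NEW_ORDER)  # decimal is reachable

# Parsing as a double also silently drops the trailing digit.
lost_precision = int(float(raw)) != int(raw)
```

The last line shows the practical cost of the shadowing: the value survives as a double, but not exactly.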

@HyukjinKwon
Member Author

Hi @davies, it seems you made some changes related to this before. Could you please take a look?

@SparkQA

SparkQA commented Sep 11, 2016

Test build #65219 has finished for PR 14947 at commit cda9d7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon changed the title from "[SPARK-17388][SQL] Support for inferring type date/timestamp/decimal for partition column" to "[WIP][SPARK-17388][SQL] Support for inferring type date/timestamp/decimal for partition column" Sep 11, 2016
Member Author


Ah, I think I should check this requirement. I will update the description soon too.

Member Author


Checked, and I added some more end-to-end tests.

@HyukjinKwon HyukjinKwon changed the title from "[WIP][SPARK-17388][SQL] Support for inferring type date/timestamp/decimal for partition column" to "[SPARK-17388][SQL] Support for inferring type date/timestamp/decimal for partition column" Sep 12, 2016
@SparkQA

SparkQA commented Sep 12, 2016

Test build #65234 has finished for PR 14947 at commit e9dea77.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

ping @davies

@HyukjinKwon
Member Author

gentle ping @davies

@HyukjinKwon
Member Author

@davies I can just remove the decimal change here if you are uncertain about it.

@HyukjinKwon
Member Author

ping @davies ..

@davies
Contributor

davies commented Oct 16, 2016

LGTM

@HyukjinKwon
Member Author

HyukjinKwon commented Oct 18, 2016

Thanks @davies. (I forgot to push the commits.)

@SparkQA

SparkQA commented Oct 18, 2016

Test build #67131 has finished for PR 14947 at commit bd5a63d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor

davies commented Oct 18, 2016

Merging this into master, thanks!

@asfgit asfgit closed this in 3768653 Oct 18, 2016
robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016
… for partition column

## What changes were proposed in this pull request?

Currently, Spark only supports inferring `IntegerType`, `LongType`, `DoubleType` and `StringType` for partition columns.

`DecimalType` is attempted, but it seems it is never actually inferred because `DoubleType` is tried first. Also, it seems `DateType` and `TimestampType` could be inferred.

As far as I know, it is pretty common to use both date and timestamp values in a partition column.

This PR fixes the incorrect `DecimalType` attempt and also adds support for inferring both `DateType` and `TimestampType` as partition column types.

## How was this patch tested?

Unit tests in `ParquetPartitionDiscoverySuite`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#14947 from HyukjinKwon/SPARK-17388.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
@HyukjinKwon HyukjinKwon deleted the SPARK-17388 branch January 2, 2018 03:44