@LuciferYang
Contributor

@LuciferYang LuciferYang commented Mar 8, 2021

What changes were proposed in this pull request?

OriginalType and DecimalMetadata have been marked as @Deprecated in newer Parquet code.

Apache Parquet suggests replacing OriginalType with LogicalTypeAnnotation and DecimalMetadata with DecimalLogicalTypeAnnotation, so the main change in this PR is to clean up these deprecated usages in the Parquet-related code.
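
For readers unfamiliar with the replacement API, here is a minimal sketch of the kind of migration described above. It is not taken from this PR's diff: the object name, the helper `decimalPrecisionAndScale`, and the field name `price` are hypothetical, and it assumes a parquet-column release that ships `LogicalTypeAnnotation` is on the classpath.

```scala
import org.apache.parquet.schema.{LogicalTypeAnnotation, PrimitiveType, Types}
import org.apache.parquet.schema.LogicalTypeAnnotation.DecimalLogicalTypeAnnotation
import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.INT32
import org.apache.parquet.schema.Type.Repetition.REQUIRED

// Hypothetical illustration object, not part of Spark or Parquet.
object LogicalTypeMigrationSketch {

  // Reads (precision, scale) from the logical type annotation rather than
  // from the deprecated DecimalMetadata. Returns None when the field is not
  // annotated as a decimal (including when it has no annotation at all).
  def decimalPrecisionAndScale(field: PrimitiveType): Option[(Int, Int)] =
    field.getLogicalTypeAnnotation match {
      case d: DecimalLogicalTypeAnnotation => Some((d.getPrecision, d.getScale))
      case _ => None
    }

  // Builds an INT32-backed decimal field using the annotation-based builder
  // call instead of passing an OriginalType.
  def sampleDecimalField(): PrimitiveType =
    Types.primitive(INT32, REQUIRED)
      .as(LogicalTypeAnnotation.decimalType(2, 9)) // scale = 2, precision = 9
      .named("price")

  def main(args: Array[String]): Unit =
    println(decimalPrecisionAndScale(sampleDecimalField()))
}
```

Note that `LogicalTypeAnnotation.decimalType` takes the scale first and the precision second, which is easy to get backwards when porting code that used `DecimalMetadata`.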

Why are the changes needed?

Clean up deprecated API usage.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the Jenkins or GitHub Actions checks.

@LuciferYang LuciferYang changed the title from "[SPARK-34661][SQL] Replaces OriginalType with LogicalTypeAnnotation in VectorizedColumnReader" to "[WIP][SPARK-34661][SQL] Replaces OriginalType with LogicalTypeAnnotation in VectorizedColumnReader" Mar 8, 2021
@LuciferYang LuciferYang marked this pull request as draft March 8, 2021 08:43
@github-actions github-actions bot added the SQL label Mar 8, 2021
@SparkQA

SparkQA commented Mar 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40444/

@SparkQA

SparkQA commented Mar 8, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40444/

@SparkQA

SparkQA commented Mar 8, 2021

Test build #135861 has finished for PR 31776 at commit 34c8d2e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@LuciferYang LuciferYang marked this pull request as ready for review March 9, 2021 02:32
@LuciferYang LuciferYang changed the title from "[WIP][SPARK-34661][SQL] Replaces OriginalType with LogicalTypeAnnotation in VectorizedColumnReader" to "[SPARK-34661][SQL] Replaces OriginalType with LogicalTypeAnnotation in VectorizedColumnReader" Mar 9, 2021
@LuciferYang
Contributor Author

cc @HyukjinKwon @dongjoon-hyun

@LuciferYang LuciferYang changed the title from "[SPARK-34661][SQL] Replaces OriginalType with LogicalTypeAnnotation in VectorizedColumnReader" to "[WIP][SPARK-34661][SQL] Replaces OriginalType with LogicalTypeAnnotation in VectorizedColumnReader" Mar 15, 2021
@LuciferYang
Contributor Author

LuciferYang commented Mar 15, 2021

I will try to clean up all deprecated OriginalType usages in the Parquet-related code in one PR, so I am changing this PR to a draft first.

@LuciferYang LuciferYang marked this pull request as draft March 15, 2021 03:20
@LuciferYang LuciferYang changed the title from "[WIP][SPARK-34661][SQL] Replaces OriginalType with LogicalTypeAnnotation in VectorizedColumnReader" to "[WIP][SPARK-34661][SQL] Clean up deprecated OriginalType usage in Parquet related code" Mar 15, 2021
@SparkQA

SparkQA commented Mar 15, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40649/

@LuciferYang LuciferYang changed the title from "[WIP][SPARK-34661][SQL] Clean up deprecated OriginalType usage in Parquet related code" to "[WIP][SPARK-34661][SQL] Clean up deprecated api usage in Parquet related code" Mar 16, 2021
@LuciferYang LuciferYang changed the title from "[WIP][SPARK-34661][SQL] Clean up deprecated api usage in Parquet related code" to "[SPARK-34661][SQL] Clean up deprecated api usage in Parquet related code" Mar 16, 2021
@SparkQA

SparkQA commented Mar 31, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41327/

@SparkQA

SparkQA commented Mar 31, 2021

Test build #136745 has finished for PR 31776 at commit 9a7ec8c.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@sunchao sunchao left a comment

This LGTM (non-binding). Thanks @LuciferYang.

@LuciferYang
Contributor Author

Thanks ~ @sunchao

@LuciferYang
Contributor Author

Gentle ping, @wangyum @HyukjinKwon @dongjoon-hyun @maropu

@SparkQA

SparkQA commented Apr 26, 2021

Test build #137966 has finished for PR 31776 at commit 9a7ec8c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 28, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42559/

@SparkQA

SparkQA commented Apr 28, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42559/

@SparkQA

SparkQA commented Apr 28, 2021

Test build #138040 has finished for PR 31776 at commit 51f75b7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42694/

@SparkQA

SparkQA commented May 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42694/

@SparkQA

SparkQA commented May 5, 2021

Test build #138173 has finished for PR 31776 at commit f91b670.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ResolveHigherOrderFunctions(catalogManager: CatalogManager)
  • case class ApplyFunctionExpression(
  • case class V2Aggregator[BUF <: java.io.Serializable, OUT](
  • trait ExtractValue extends Expression
  • implicit class FunctionIdentifierHelper(ident: FunctionIdentifier)
  • case class AddJarsCommand(paths: Seq[String]) extends LeafRunnableCommand
  • case class AddFilesCommand(paths: Seq[String]) extends LeafRunnableCommand
  • case class AddArchivesCommand(paths: Seq[String]) extends LeafRunnableCommand

@HyukjinKwon
Member

I guess it's fine.

private val ParquetDateType = ParquetSchemaType(DATE, INT32, 0, null)
private val ParquetTimestampMicrosType = ParquetSchemaType(TIMESTAMP_MICROS, INT64, 0, null)
private val ParquetTimestampMillisType = ParquetSchemaType(TIMESTAMP_MILLIS, INT64, 0, null)
length: Int)
Member

cc @wangyum FYI

@SparkQA

SparkQA commented May 10, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42845/

@SparkQA

SparkQA commented May 10, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42845/

@SparkQA

SparkQA commented May 10, 2021

Test build #138328 has finished for PR 31776 at commit 2bc0391.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented May 12, 2021

Unless @wangyum has comments, I can merge to master

@srowen
Member

srowen commented May 16, 2021

Merged to master

@srowen srowen closed this in 7ca0a09 May 16, 2021
@LuciferYang
Contributor Author

thx ~ @srowen @HyukjinKwon @sunchao

dongjoon-hyun pushed a commit that referenced this pull request May 17, 2021
### What changes were proposed in this pull request?

This fixes the compilation error caused by the logical conflict between #31776 and #29642.

### Why are the changes needed?

To recover compilation.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Closes #32568 from wangyum/HOT-FIX.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@LuciferYang LuciferYang deleted the cleanup-parquet-dep-api branch June 6, 2022 03:48