[SPARK-40245][SQL] Fix FileScan equality check when partition or data filter columns are not read #37693

peter-toth · 2022-08-27T17:13:06Z

What changes were proposed in this pull request?

Unfortunately, the fix in #31848 was not correct in all cases. When a partition or data filter contains a column that is not in readSchema(), the filter normalization in FileScan.equals() doesn't work.
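To illustrate the failure mode, here is a minimal, Spark-free sketch. It assumes that normalization rewrites an attribute's expression id to its position in the scan's output and leaves the attribute untouched when it is not found there. `Attr`, `EqFilter` and `normalize` are hypothetical stand-ins, not Spark classes, and the second comparison only shows one possible way to make the filters compare equal, not necessarily what this patch does.

```scala
// Hypothetical, simplified stand-ins for Catalyst concepts; not Spark classes.
object NormalizationSketch {
  // An attribute reference: a column name plus a per-plan-instance expression id.
  final case class Attr(name: String, exprId: Long)

  // A filter of the form `attr = value`.
  final case class EqFilter(attr: Attr, value: Int)

  // Rewrite the filter's attribute id to its position in `output`.
  // Assumption: an attribute that is missing from `output` is left as-is.
  def normalize(filter: EqFilter, output: Seq[Attr]): EqFilter = {
    val ordinal = output.indexWhere(_.exprId == filter.attr.exprId)
    if (ordinal == -1) filter
    else filter.copy(attr = filter.attr.copy(exprId = ordinal.toLong))
  }

  def main(args: Array[String]): Unit = {
    // Two scans of the same file; each one gets fresh expression ids.
    val (id1, p1) = (Attr("id", 10L), Attr("p", 11L))
    val (id2, p2) = (Attr("id", 20L), Attr("p", 21L))
    val f1 = EqFilter(p1, 1)
    val f2 = EqFilter(p2, 1)

    // Only `id` is read, so `p` is not found and keeps its original id.
    println(normalize(f1, Seq(id1)) == normalize(f2, Seq(id2))) // false

    // Normalizing against a list that also contains `p` makes them equal.
    println(normalize(f1, Seq(id1, p1)) == normalize(f2, Seq(id2, p2))) // true
  }
}
```

In the first comparison the two filters are semantically identical but still differ after normalization, which is the kind of mismatch that would make two otherwise equal scans compare as unequal.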

Why are the changes needed?

To fix FileScan.equals() so that equivalent scans compare equal again, which resolves the reuse issues.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added new UT.

peter-toth · 2022-08-29T06:41:08Z

cc @cloud-fan

cloud-fan · 2022-08-29T07:16:04Z

sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala

+          spark.read.parquet(path.toString).createOrReplaceTempView("t")
+          val df = sql(
+            """
+              |SELECT t1.id, t2.id, t3.id


Did we return wrong results before this fix?

No, only exchange reuse didn't happen.

cloud-fan · 2022-08-29T07:27:25Z

LGTM, please fix conflicts

peter-toth · 2022-08-29T10:38:00Z

> LGTM, please fix conflicts

Ok, merged latest master into this PR.

cloud-fan · 2022-08-29T15:53:38Z

thanks, merging to master!

peter-toth · 2022-08-29T15:55:27Z

Thanks for the review!

github-actions bot added the SQL label Aug 27, 2022

[SPARK-40245][SQL] Fix FileScan equality check when partition or data…

9c2a67c

… filter columns are not read

peter-toth force-pushed the SPARK-40245-fix-filescan-equals branch from 1f8aad4 to 9c2a67c Compare August 27, 2022 17:13

cloud-fan reviewed Aug 29, 2022

Merge commit '527ddece8fdbe703dcd239401c97ddb2c6122182' into SPARK-40…

e575547

…245-fix-filescan-equals

cloud-fan closed this in 0e8d779 Aug 29, 2022
