@peter-toth
Contributor

What changes were proposed in this pull request?

Unfortunately the fix in #31848 was not correct in all cases. When the partition or data filter contains a column that is not in readSchema(), the filter normalization in FileScan.equals() doesn't work.
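
To illustrate the failure mode, here is a minimal, Spark-free sketch. It is a toy model under stated assumptions, not the actual FileScan code: `Attr`, `GreaterThan`, `normalize`, and the column names `id` and `p` are all hypothetical. It assumes normalization replaces a per-query expression ID with the attribute's ordinal in a given schema and leaves attributes that are missing from that schema untouched.

```scala
// Toy model only: these classes are hypothetical stand-ins, not Spark's.
final case class Attr(name: String, exprId: Long)
final case class GreaterThan(attr: Attr, value: Int)

object FilterNormalizationSketch {
  // Replace the attribute's per-query exprId with its ordinal in `schema`.
  // An attribute that is missing from `schema` is returned unchanged,
  // so its original exprId leaks into the comparison.
  def normalize(filter: GreaterThan, schema: Seq[String]): GreaterThan = {
    val ordinal = schema.indexOf(filter.attr.name)
    if (ordinal >= 0) filter.copy(attr = filter.attr.copy(exprId = ordinal.toLong))
    else filter
  }

  def main(args: Array[String]): Unit = {
    val fullSchema = Seq("id", "p") // every column of the (hypothetical) table
    val readSchema = Seq("id")      // only the columns the query actually reads

    // The same logical filter `p > 1`, planned twice with different exprIds.
    val f1 = GreaterThan(Attr("p", exprId = 10L), 1)
    val f2 = GreaterThan(Attr("p", exprId = 20L), 1)

    // "p" is absent from readSchema, so the two filters still compare unequal.
    println(normalize(f1, readSchema) == normalize(f2, readSchema)) // false
    // Against a schema that contains "p", they normalize to the same value.
    println(normalize(f1, fullSchema) == normalize(f2, fullSchema)) // true
  }
}
```

In this model, two scans that differ only in expression IDs compare as unequal whenever a filter references a column outside the read schema, which is the kind of mismatch that would prevent reuse.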

Why are the changes needed?

To fix FileScan.equals() so that exchange reuse works when a partition or data filter references a column that is not in readSchema().

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added a new unit test.

@github-actions github-actions bot added the SQL label Aug 27, 2022
@peter-toth peter-toth force-pushed the SPARK-40245-fix-filescan-equals branch from 1f8aad4 to 9c2a67c Compare August 27, 2022 17:13
@peter-toth
Contributor Author

cc @cloud-fan

spark.read.parquet(path.toString).createOrReplaceTempView("t")
val df = sql(
"""
|SELECT t1.id, t2.id, t3.id
Contributor

Did we return wrong results before?

Contributor Author

@peter-toth peter-toth Aug 29, 2022

No, only exchange reuse didn't happen.

@cloud-fan
Contributor

LGTM, please fix conflicts

@peter-toth
Contributor Author

> LGTM, please fix conflicts

Ok, merged latest master into this PR.

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 0e8d779 Aug 29, 2022
@peter-toth
Contributor Author

Thanks for the review!
