Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spark][Backport 3.2] Fix the semantic of shouldRewriteToBeIcebergCompatible in REORG UPGRADE UNIFORM #3474

Merged

Conversation

xzhseh
Copy link
Contributor

@xzhseh xzhseh commented Aug 5, 2024

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

currently we utilize the helper function
shouldRewriteToBeIcebergCompatible to filter the portion of parquet files that need to be rewritten when running REORG UPGRADE UNIFORM based on the tags in the AddFile.

however, the DeltaUpgradeUniformOperation.icebergCompatVersion is accidentally shadowed, which will make
shouldRewriteToBeIcebergCompatible always return false if the AddFile.tags is not null - this is not the expected semantic of this function.

this PR introduces the fix for this problem and add unit tests to ensure the correctness.

How was this patch tested?

through unit tests in UniFormE2ESuite.scala.

Does this PR introduce any user-facing changes?

no.

…EORG UPGRADE UNIFORM (delta-io#3412)

## Description
currently we utilize the helper function
`shouldRewriteToBeIcebergCompatible` to filter the portion of parquet
files that need to be rewritten when running `REORG UPGRADE UNIFORM`
based on the tags in the `AddFile`.

however, the `DeltaUpgradeUniformOperation.icebergCompatVersion` is
accidentally shadowed, which will make
`shouldRewriteToBeIcebergCompatible` always return `false` if the
`AddFile.tags` is not `null` - this is not the expected semantic of this
function.

this PR introduces the fix for this problem and add unit tests to ensure
the correctness.

## How was this patch tested?
through unit tests in `UniFormE2ESuite.scala`.

## Does this PR introduce _any_ user-facing changes?
no.
@xzhseh xzhseh changed the title [Spark][Backport] Fix the semantic of shouldRewriteToBeIcebergCompatible in REORG UPGRADE UNIFORM [Spark][Backport 3.2] Fix the semantic of shouldRewriteToBeIcebergCompatible in REORG UPGRADE UNIFORM Aug 5, 2024
Copy link
Collaborator

@johanl-db johanl-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stamping the backport

@tdas
Copy link
Contributor

tdas commented Aug 5, 2024

these are known python test infra issue failures. since java tests passed, merging this.

@tdas tdas merged commit b1fdee8 into delta-io:branch-3.2 Aug 5, 2024
8 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants