upgrade to 3.4 and fix interface changes #178

Closed
anqini wants to merge 2 commits


@anqini anqini commented Dec 14, 2023

Note: Can we create a separate branch for this change to merge into? The changes are only compatible with scala-deequ versions >=2.0.4 and are not backward compatible. The code would become fragmented if we made it work universally, since both the new and the old interfaces would need to be invoked for the same class.

Issue #, if available:

#148

Forward compatibility issue for scala deequ version >=2.0.4.

Description of changes:

Update the invocation code to match the interface changes in the Scala Deequ release "deequ-2.0.6-spark-3.4".

The interface changes include:

  • Analyzer classes
    • Compliance
    • Histogram
    • MaxLength
    • MinLength
    • Mean
  • Check methods
    • hasMaxLength
    • hasMinLength
    • satisfy
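
To illustrate why supporting both interfaces in one code path gets awkward, here is a minimal sketch of version-gated argument building. The helper names (`parse_version`, `uses_new_interface`, `analyzer_args`) and the single extra trailing argument are assumptions made for illustration; they are not the actual Deequ or PyDeequ signatures. Only the 2.0.4 threshold comes from this PR.

```python
import re

# From this PR: the changed interfaces apply to scala-deequ >= 2.0.4.
NEW_INTERFACE_MIN = (2, 0, 4)


def parse_version(deequ_version: str) -> tuple:
    """Extract (major, minor, patch) from a string such as 'deequ-2.0.6-spark-3.4'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", deequ_version)
    if match is None:
        raise ValueError(f"unrecognized Deequ version: {deequ_version!r}")
    return tuple(int(part) for part in match.groups())


def uses_new_interface(deequ_version: str) -> bool:
    """True when the given Deequ version is at or above the 2.0.4 threshold."""
    return parse_version(deequ_version) >= NEW_INTERFACE_MIN


def analyzer_args(deequ_version: str, column: str, where=None, extra=None) -> list:
    """Build positional arguments for a hypothetical analyzer constructor.

    Hypothetical: the new interface takes one extra trailing argument.
    """
    args = [column, where]
    if uses_new_interface(deequ_version):
        args.append(extra)
    return args
```

Every analyzer and check method listed above would need a branch like the one in `analyzer_args`, which is the duplication the note at the top of this PR is trying to avoid.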

Logic change:

  • Prioritize StringColumnProfile over StandardColumnProfile
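
The ordering change can be pictured roughly as below. This is a minimal sketch, not the code in this PR: the `pick_profile_kind` helper is hypothetical, and it assumes the profile kind is chosen by testing class names in a fixed order, with the first match winning.

```python
# Hypothetical priority order: the string profile is tested before the
# standard one, so a profile matching both is reported as a string profile.
PROFILE_PRIORITY = ("StringColumnProfile", "StandardColumnProfile")


def pick_profile_kind(class_names):
    """Return the highest-priority profile kind among the given class names."""
    for kind in PROFILE_PRIORITY:
        if any(name.endswith(kind) for name in class_names):
            return kind
    return None  # no known profile kind matched
```

With this ordering, `pick_profile_kind` returns `"StringColumnProfile"` whenever that name is present, regardless of where it appears in the input.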

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@chenliu0831
Contributor

> Note: Can we create a separate branch for this change to merge into? The changes are only compatible with scala-deequ versions >=2.0.4 and are not backward compatible. The code would become fragmented if we made it work universally, since both the new and the old interfaces would need to be invoked for the same class.

Currently we don't want to maintain separate branches for different version combinations. I agree we shouldn't fragment the code to achieve the goal, and we are working on a long-term plan, ideally with changes in Deequ core. I'll share more details in early January.

Let's keep this PR open for more ideas.

@anqini
Author

anqini commented Dec 15, 2023 via email

@LeandroLTM

Hey @anqini @chenliu0831, is there a target date for merging this PR? It looks like Deequ is already adding support for Spark 3.5.

@chenliu0831
Contributor

Closing this PR for now - we do not plan to maintain multiple branches, and the changes most likely have to go into Deequ itself.

@chenliu0831 chenliu0831 closed this Feb 1, 2024
@theopilbeam

@chenliu0831 could you explain more about the backward-compatibility requirements for pydeequ? Is it strictly required that all future pydeequ versions continue to support all of the currently supported Spark and Deequ versions, or is there scope for dropping support?
