Profiling job fix #1222

Merged on Apr 25, 2024 (1 commit)

SofiaSazonova
Contributor

Feature or Bugfix

  • Bugfix

Detail

  • ColumnProfilerRunner must be imported from pydeequ.profiles
  • Workaround for the missing SPARK_VERSION environment variable, which PyDeequ expects to be set
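
A minimal sketch of what the two changes above might look like in a profiling script. This is not the code from the merged commit: the helper names `ensure_spark_version` and `profile_dataframe`, and the fallback value `"3.1"`, are assumptions chosen for illustration. The fallback should match the Spark version of the Glue runtime actually in use.

```python
import os

# Hypothetical fallback; set it to the Spark version of your Glue runtime.
ASSUMED_SPARK_VERSION = "3.1"


def ensure_spark_version(default=ASSUMED_SPARK_VERSION):
    """Set SPARK_VERSION only if it is absent and return the effective value."""
    return os.environ.setdefault("SPARK_VERSION", default)


def profile_dataframe(spark, df):
    """Run PyDeequ column profiling on a Spark DataFrame."""
    # The environment variable must exist before pydeequ is imported,
    # so the import is deferred until after the workaround has run.
    ensure_spark_version()
    from pydeequ.profiles import ColumnProfilerRunner

    return ColumnProfilerRunner(spark).onData(df).run()
```

Deferring the import keeps the ordering explicit: the variable is set first, then `ColumnProfilerRunner` is pulled in from `pydeequ.profiles`. An existing `SPARK_VERSION` value is never overwritten.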

Relates

Security

Please answer the questions below briefly where applicable, or write N/A. Based on the
OWASP Top 10.

  • Does this PR introduce or modify any input fields or queries - this includes
    fetching data from storage outside the application (e.g. a database, an S3 bucket)?
    • Is the input sanitized?
    • What precautions are you taking before deserializing the data you consume?
    • Is injection prevented by parametrizing queries?
    • Have you ensured no eval or similar functions are used?
  • Does this PR introduce any functionality or component that requires authorization?
    • How have you ensured it respects the existing AuthN/AuthZ mechanisms?
    • Are you logging failed auth attempts?
  • Are you using or adding any cryptographic features?
    • Do you use standard, proven implementations?
    • Are the used keys controlled by the customer? Where are they stored?
  • Are you introducing any new policies/roles/users?
    • Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Contributor

@dlpzx dlpzx left a comment

I confirm that the changes fix the issue! In a separate PR we could upgrade the Glue version we are using to version 4, wdyt?

@SofiaSazonova SofiaSazonova merged commit e15df15 into data-dot-all:main Apr 25, 2024
9 checks passed
@SofiaSazonova
Contributor Author

I checked this script with newer versions of Glue, Spark and PyDeequ. It works without adjustments, at least for my test tables.

@SofiaSazonova SofiaSazonova deleted the profiler-fix branch October 3, 2024 13:43