Profiling jobs failing with ImportError: cannot import name 'ColumnProfilerRunner' from 'awsglue.transforms' (/opt/amazon/lib/python3.6/site-packages/awsglue/transforms/__init__.py) #1216

Closed
dlpzx opened this issue Apr 25, 2024 · 0 comments

Comments

@dlpzx
Contributor

dlpzx commented Apr 25, 2024

Describe the bug

When a profiling job is run on a table from data.all, the Glue job fails. The AWS Console shows the error: `ImportError: cannot import name 'ColumnProfilerRunner' from 'awsglue.transforms' (/opt/amazon/lib/python3.6/site-packages/awsglue/transforms/__init__.py)`

How to Reproduce

  1. In data.all, run a profiling job on a table
  2. Check the Glue job run in the AWS Console for the error

Expected behavior

Profiling jobs run and succeed.

Your project

No response

Screenshots

No response

OS

n/a

Python version

n/a

AWS data.all version

2.3+additional PRs

Additional context

No response

@SofiaSazonova SofiaSazonova self-assigned this Apr 25, 2024
SofiaSazonova added a commit that referenced this issue Apr 25, 2024
### Feature or Bugfix
- Bugfix

### Detail
- `ColumnProfilerRunner` must be imported from `pydeequ.profiles`, not from `awsglue.transforms`
- Work around the missing `SPARK_VERSION` environment variable
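
The two points above can be sketched roughly as follows. This is a minimal illustration, not the code from the actual commit: the helper names `ensure_spark_version` and `load_column_profiler_runner` are hypothetical, and the default `"3.3"` is a placeholder that would need to match the Spark version of the Glue runtime in use. It assumes pydeequ reads `SPARK_VERSION` from the environment when it is imported, so the variable has to be set first.

```python
import os


def ensure_spark_version(default: str = "3.3") -> str:
    """Set SPARK_VERSION if the runtime did not, and return the value in effect.

    The default is a placeholder (assumption); use the Spark version
    of the Glue runtime the job actually runs on.
    """
    # setdefault keeps any value the runtime already exported.
    return os.environ.setdefault("SPARK_VERSION", default)


def load_column_profiler_runner():
    """Import ColumnProfilerRunner from pydeequ.profiles, not awsglue.transforms."""
    ensure_spark_version()  # must run before pydeequ is imported
    try:
        from pydeequ.profiles import ColumnProfilerRunner
    except ImportError:
        # pydeequ is not installed in this environment.
        return None
    return ColumnProfilerRunner
```

Doing the import inside a function keeps the ordering explicit: the environment variable is guaranteed to exist before pydeequ is loaded, and an existing value is never overwritten.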

### Relates
- #1216 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard, proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Co-authored-by: Sofia Sazonova <sazonova@amazon.co.uk>