Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Adding cookiecutter config and environment variable for datapipelines stacks #582

Merged
merged 8 commits into from
Jul 19, 2023

Conversation

dbalintx
Copy link
Contributor

Workaround for cookiecutter file writes to root filesystem, which occurs during creation of datapipelines.

Feature or Bugfix

  • Feature
  • Bugfix
  • Refactoring

Detail

Relates

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@dbalintx dbalintx requested review from dlpzx and noah-paige July 17, 2023 17:32
Copy link
Contributor

@noah-paige noah-paige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing the above changes...

Works for CDK Pipeline creation. Fails for CodePipeline Trunk-based and Gitflow pipelines (i.e. backend/dataall/cdkproxy/stacks/pipeline.py) still with read-only error on root file system.

Believe this is because:

  1. Need to pass env=env_vars for the ddk init and cp ... commands
  2. We are trying to write to ./PIPELINE_REPO/... which is still giving read-only file system errors - to resolve we may need to move the backend/blueprints/data_pipeline_blueprint/... files to under backend/dataall/... and fix the cwd passed to the subprocess.run(...) commands

@dlpzx
Copy link
Contributor

dlpzx commented Jul 18, 2023

Thanks @dbalintx for opening it and @noah-paige for testing!. Yes to point 1) when I tested I had to add the env_vars inside the initialize_repo function in pipeline.py. I am implementing and testing this at the moment. For 2) I am doing some re-structuring and cleanup

@dlpzx
Copy link
Contributor

dlpzx commented Jul 18, 2023

I just tested the Codepipelines trunk type with the latest changes. The CloudFormation stack gets deployed without issue and the content of the repository contains all the necessary files copied and generated from the code

@dlpzx dlpzx mentioned this pull request Jul 18, 2023
@noah-paige noah-paige self-requested a review July 18, 2023 13:07
Copy link
Contributor

@noah-paige noah-paige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the latest changes as well creating all 3 types of pipelines successfully.

The changes look good to me

@dlpzx
Copy link
Contributor

dlpzx commented Jul 18, 2023

Hi @noah-paige thanks for testing it! For @dbalintx and I the Quality gate is failing in the integration tests so we still need to do some changes in the PR

@noah-paige noah-paige self-requested a review July 18, 2023 13:25
'AWS_DEFAULT_REGION': env.region,
'CURRENT_AWS_ACCOUNT': env.AwsAccountId,
'envname': 'pytest',
}, True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to return None rather than True for _check_repository... otherwise it will assume CodeCommit Repo exists and not create one causing an assertion error

@noah-paige
Copy link
Contributor

Hi @noah-paige thanks for testing it! For @dbalintx and I the Quality gate is failing in the integration tests so we still need to do some changes in the PR

Left a comment that I think will resolve this... successfully ran integration tests locally and testing in CodePipeline now

@noah-paige
Copy link
Contributor

Testing with latest integration test fixes now

@noah-paige
Copy link
Contributor

noah-paige commented Jul 18, 2023

The latest changes pass the integration tests as expected!

ADD backend/cdkproxymain.py /cdkproxymain.py

RUN mkdir -p dataall/cdkproxy/assets/glueprofilingjob/jars \
&& mkdir -p blueprints/ml_data_pipeline/engine/glue/jars
RUN mkdir -p dataall/cdkproxy/assets/glueprofilingjob/jars
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have noticed an error in the profiling jobs unable to find this jar. I think it is related with this task but we will take it in a separate PR

@dlpzx dlpzx merged commit 64a60f7 into data-dot-all:v1m6m0 Jul 19, 2023
dlpzx added a commit that referenced this pull request Jul 19, 2023
### Feature or Bugfix
Release PR with the following list of features. Refer to each PR for the
details

### Detail
- #498 
- #482 
- #543
- #524 (which also solves #531)
- #532 
- #535 
- #497 
- #515
- #529 
- #562 
- #455 
- #572 
- #567 
- #573 
- #579 
- #578 
- #582 

### Breaking changes - release notes
- ⚠️ IMPORTANT: upgrade to a version >V1.5.0 before upgrading to V1.6 to
avoid deletion of resources in custom resource deletion
- ⚠️ IMPORTANT: requires an update of environments and then datasets
after upgrading. Either using cdk.json parameter
`enable_update_dataall_stacks_in_cicd_pipeline`, waiting for overnight
update stack task, or manually updating first environments and then
datasets
- CloudFront distribution replace for #529 
- Additional EC2 permissions in CDK Synth CodeBuild stage for #543 -->
this can be avoided by upgrading to v1.5.6 before upgrading to v1.6.0
- local development affected by more restrictive pivotRole trust policy


### Relates
V1.6.0 release

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: nikpodsh <124577300+nikpodsh@users.noreply.github.com>
Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com>
Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com>
Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com>
Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants