Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ETL-670] Adjust cleanup job to run on main directory of dev env #126

Merged
merged 3 commits into from
Jul 15, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/upload-and-deploy.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -262,7 +262,6 @@ jobs:
name: Cleanup non-main branch data before integration tests
runs-on: ubuntu-latest
needs: sceptre-deploy-develop
if: github.ref_name != 'main'
environment: develop
# These permissions are needed to interact with GitHub's OIDC Token endpoint.
permissions:
Expand All @@ -277,6 +276,7 @@ jobs:
python_version: ${{ env.PYTHON_VERSION }}

- name: Set namespace for non-default branch
if: github.ref_name != 'main'
run: echo "NAMESPACE=$GITHUB_REF_NAME" >> $GITHUB_ENV

- name: Clean input data bucket
Expand Down
8 changes: 4 additions & 4 deletions src/scripts/manage_artifacts/clean_for_integration_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,8 +42,6 @@ def delete_objects(bucket_prefix: str, bucket: str) -> None:
# Skip the owner.txt file so it does not need to be re-created
if object_key.endswith("owner.txt"):
continue
elif "main" in object_key:
raise ValueError("Cannot delete objects in the main directory")

s3_client.delete_object(Bucket=bucket, Key=object_key)

Expand All @@ -56,8 +54,10 @@ def main() -> None:
if not args.bucket_prefix or args.bucket_prefix[-1] != "/":
raise ValueError("Bucket prefix must be provided and end with a '/'")

if "main" in args.bucket_prefix:
Copy link
Contributor

@rxu17 rxu17 Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I forgot about us having to remove this if we want to clean up main. Hmm, should we add another safeguard?

if dev not in args.bucket:
   raise ValueError("Must be using buckets in the develop environment")

OR

dev_buckets = ["recover-dev-input-data", "recover-dev-intermediate-data", "recover-dev-processed-data", "recover-dev-cloudformation"]

if args.bucket not in dev_buckets:
   raise ValueError("Must be using buckets in the develop environment")

Overkill?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also be running this on the prod buckets too - But the staging directory of those buckets. I think controlling this in the workflow makes the most sense here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a safeguard for prod/main (we still want to be able to clean up prod/staging). I think this script should be agnostic to the object key, except for the aforementioned case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this guard condition look?

6e36626

Copy link
Contributor

@rxu17 rxu17 Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my brain is not functioning right now. Yes I second what Phil said.
Looks good to me!

raise ValueError("Cannot delete objects in the main directory")
if "main" in args.bucket_prefix and "recover-dev" not in args.bucket:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

raise ValueError(
"Cannot delete objects in the main directory of a non-dev bucket"
)

try:
delete_objects(bucket_prefix=args.bucket_prefix, bucket=args.bucket)
Expand Down
Loading