S3 backups #125

Merged: 15 commits, Aug 8, 2023

Conversation

maxinelasp (Contributor)

Change Summary

Per ticket #107, create the CDK system for automatically backing up all files in the S3 data bucket.

Overview

This system is designed to run across two accounts, although you can run both steps in the same account. Set the CONTEXT variable in the personal app template to select the deployment context; for the backup account it is set to "backup". If you are deploying to the backup account, you also have to switch your profile to the backup account credentials.

You also need to set the "CDK_S3_BACKUPS_SOURCE_ACCOUNT" environment variable to specify which account your source bucket lives in.
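As a hedged sketch, the two deploy steps above might look like this from the command line. The account ID and profile name are placeholders, and the `cdk deploy` invocations are shown commented out since they depend on your environment:

```shell
# Hypothetical walk-through of the two deploy contexts described above.
# The account ID and profile name are placeholders, not real values.

# Step 1: source (dev) account. Tell the backup stack which account the
# source bucket lives in, then deploy with the dev context.
export CDK_S3_BACKUPS_SOURCE_ACCOUNT=123456789012
# cdk deploy --context env=dev

# Step 2: backup account. Switch to backup-account credentials, then
# deploy with the backup context.
export AWS_PROFILE=backup
# cdk deploy --context env=backup
```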

There are two manual steps that I haven't yet figured out how to automate.

  1. When you deploy the SdsDataManager stack in the dev or source account, you need to manually input the role name that is created into backup_bucket_stack.py. The name is generated with different random characters each time, and I haven't yet figured out the template to replicate it, or how to pass the ARN across deploy steps.
  2. You need to manually create the replication rule in the source bucket. This takes about 2 minutes, and I think there is some way to do it in CDK, but I don't want to delay getting this PR out ahead of SIT-2.
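Manual step 2 could eventually move into CDK: the console rule boils down to a small CloudFormation ReplicationConfiguration fragment. Here is a minimal sketch of that fragment as plain data; all ARNs are placeholders, and in CDK it would presumably be attached to the source bucket through the CfnBucket escape hatch, which I haven't tested here:

```python
# Sketch of the replication rule the manual console step creates, expressed
# as the CloudFormation "ReplicationConfiguration" fragment. ARNs below are
# placeholders. Note: versioning must be enabled on both buckets.
replication_configuration = {
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",  # placeholder
    "Rules": [
        {
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},  # empty filter: replicate every object in the bucket
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {
                # placeholder ARN for the bucket in the backup account
                "Bucket": "arn:aws:s3:::sds-data-backup",
            },
        }
    ],
}
```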

If you'd like to test cross-account replication, you can use my data bucket in the backup account (sds-data-mh-backup). Or, you can deploy both pieces to the same account and they should work.

There aren't any dependencies between the new backup bucket stack and the sds-data-manager stack, so they can be deployed separately. However, this does mean I make some manual assumptions. I would be open to passing some information back and forth to reduce those assumptions if people have ideas.

New Dependencies

None

New Files

  • backup_bucket_stack.py
    • Creates the backup bucket and sets up permissions using the new role in sds_data_manager.py

Updated Files

  • sds_data_manager.py
    • Created a new role for replication permissions
  • stackbuilder.py
    • added new util function to create backup bucket stacks
  • app_template_dev.py
    • created new flow for accessing backup bucket stack

Testing

I skipped testing for now to get this up for review. I would expect the tests to just check the permissions on each role. I think checking the actual backups themselves is a job for integration testing - is that necessary at this point? What do people think?
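To make the role-permission check concrete, here is a minimal sketch of the kind of assertion such a unit test could make, assuming the role's policy statements can be extracted from the synthesized template as plain dicts. The helper names are hypothetical; the action names are the standard S3 replication set from the AWS docs:

```python
# Hedged sketch of a unit check for the replication role's permissions.
# Assumes policy statements are available as plain dicts, e.g. pulled out of
# a synthesized CloudFormation template. Helper names are hypothetical.
SOURCE_ACTIONS = {
    "s3:GetReplicationConfiguration",
    "s3:ListBucket",
    "s3:GetObjectVersionForReplication",
}
DESTINATION_ACTIONS = {"s3:ReplicateObject", "s3:ReplicateDelete"}


def granted_actions(policy_statements):
    """Collect every action allowed across a list of IAM statements."""
    actions = set()
    for stmt in policy_statements:
        if stmt.get("Effect") == "Allow":
            acts = stmt.get("Action", [])
            actions.update([acts] if isinstance(acts, str) else acts)
    return actions


def check_replication_role(policy_statements):
    """Raise if the role is missing any standard replication action."""
    missing = (SOURCE_ACTIONS | DESTINATION_ACTIONS) - granted_actions(
        policy_statements
    )
    if missing:
        raise AssertionError(f"replication role is missing actions: {missing}")
```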

@maxinelasp (Contributor, Author)

Tests failed because they check the number of roles and I didn't update that count. I will fix it now.

@maxinelasp (Contributor, Author)

Here are the potential solutions I see for not knowing the Role name:

  1. Just figuring out a way to get the Role arn - this is obviously best, but may not be possible. I'll continue looking into this.
  2. Having the backup stack depend on the sds-data-manager stack - this isn't ideal because they otherwise aren't really connected. The backup stack doesn't need to be deployed more than once and lives in an entirely different account, so I don't want to add more dependencies than needed. I'm also not sure it would work to deploy to two different accounts and keep the stacks connected - I think that's generally something CDK discourages. Also, since the stacks already depend on each other in the other direction (sds-data-manager needs the backup bucket name), it seems like it would cause problems.
  3. Passing via "Outputs" - CDK has a method for outputting information which is then accessible to other stacks. These outputs can be written to a local file and then read from there. This would require that the sds-data-stack was deployed before the backup-buckets stack on the same machine, which is kind of annoying, but also fine. This is probably my preferred method, but I could see it introducing issues later with automatic deployments (though the backup account probably won't be deployed as often, and maybe we can manually move that output file around).
  4. Writing to the context - I could write the role arn to cdk.json, but that would probably cause problems with deployments from git if it ever gets committed by accident or something.
  5. Guessing at the name of the role - this is how I get the bucket name, but that name is easier to guess. There is some template somewhere in CloudFormation which determines the role name, but I'm not sure I could consistently get it right. Also, guessing at the name kind of sucks.
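For what it's worth, the consuming side of option 3 could be as small as this. This is only a sketch: the outputs file name, stack name, and output key are hypothetical, and the file would come from running `cdk deploy --outputs-file <path>` on the sds-data-manager stack first:

```python
import json


def read_replication_role_arn(outputs_path, stack_name, output_key):
    """Read one CfnOutput value from a `cdk deploy --outputs-file` JSON dump.

    The stack name and output key are whatever the sds-data-manager stack
    declares; the names used by callers of this sketch are placeholders.
    """
    with open(outputs_path) as f:
        outputs = json.load(f)
    return outputs[stack_name][output_key]
```

The matching producer side would be a CfnOutput declared in the sds-data-manager stack; CDK writes every stack output to the file named by `--outputs-file` at deploy time.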

Thoughts?


For replication to work, you also need to deploy SdsDataManager and create
the source bucket and replication role. Then, you need to manually update
the role_arn variable with the replication role created.
Contributor:
I saw a question in your PR comment about how to add a dependency. In CDK, there is a way to create a resource only after the resources it depends on are created. For example, something like

backup_bucket.node.add_dependency(sds_data_manager)
backup_bucket.node.add_dependency(source_bucket)
backup_bucket.node.add_dependency(replication_role)

This makes sure everything the backup_bucket depends on is created before the backup_bucket itself. I hope this helps.

Contributor Author:

I can look into this, but I don't think it will work, because the sds_data_manager is created in a different account. When deploying the backup bucket, it won't be able to see the sds_data_manager in the dev account.

Contributor:

I think there is a way to add this dependency at a very high level, like in app.py, but we can look at it post SIT-2.

super().__init__(scope, construct_id, env=env, **kwargs)

# FOR NOW: Deploy other stack, update this name with the created role.
role_arn = (
Contributor:

To get role_arn, I think you can use from_role_name() from this link: https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_iam/Role.html#aws_cdk.aws_iam.Role.from_role_name.

That returns an IRole, and you can then read role_arn from it.

E.g.

iam_role_by_look_up = iam.Role.from_role_name(self, "roleLookup", role_name=role_name)
role_arn = iam_role_by_look_up.role_arn

I think that's the syntax. I haven't tested it.

Contributor Author:

Unfortunately, that still requires the role name, which is the piece of information I don't know.

Contributor:

Ah, I see. Because it lives in a different account?

Contributor Author:

Yes, the role is in a different account (dev) while this IAM role is in the backup account.

That method actually also requires the role to be in the same account, so even if I did know the name it wouldn't work.

)

# This is the S3 bucket used by upload_api_lambda
backup_bucket = s3.Bucket(
Contributor:

Is this creating the bucket again, or creating the same bucket in the backup account or region? If the second, should we add something to its name to indicate that it's a backup bucket?

Contributor Author:

It is creating a new bucket in the backup account. This does result in the name being sds-data-[initial]-backup, due to the way the sds-id is created. Is that sufficient, or would you like me to change the name of the bucket to something else? Maybe sds-backup-[initial]-backup?

Contributor:

I see. We could add a comment saying that this part creates the backup bucket in a different account, to minimize confusion, because the two look exactly the same at first glance. :)

Or add a "backup" suffix. You know how sds_id is set to dev or prod - will sds_id be set to backup for the backup account? If that's the case, then we are good.

A question: if we keep the exact same bucket naming convention for both the source account and the backup account, won't it give a "bucket name is not unique" error?

Contributor:

Ignore my comments. You already do this!

Contributor Author:

Yes, the sds_id is set to backup for the backup account, so all the deployed resources have "backup" in the name! Does that all make sense?

Contributor:

Yes, it does now!

@tech3371 (Contributor) left a comment:

It looks good to me! Let me know if you need anything else.

if not s3_source_account:
raise KeyError(
"No source account is set for the backup deploy."
"Please define the CDK_SOURCE_ACCOUNT environment variable."
Contributor:

should this be CDK_S3_BACKUPS_SOURCE_ACCOUNT instead?

Contributor Author:

Yes! Thanks for the catch; I changed the name.
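With the rename applied, a standalone sketch of the corrected check looks like this. The helper wrapper is hypothetical, added here only for illustration; the real check lives inline in the app template:

```python
import os


def get_backup_source_account():
    # Hypothetical wrapper around the check quoted above, using the renamed
    # CDK_S3_BACKUPS_SOURCE_ACCOUNT environment variable.
    s3_source_account = os.environ.get("CDK_S3_BACKUPS_SOURCE_ACCOUNT")
    if not s3_source_account:
        raise KeyError(
            "No source account is set for the backup deploy. "
            "Please define the CDK_S3_BACKUPS_SOURCE_ACCOUNT environment variable."
        )
    return s3_source_account
```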

@@ -9,6 +9,9 @@
This app is designed to be the dev and production deployment app.
It defaults to a dev deployment via a default `env` value in cdk.json.
To deploy to prod, specify `--context env=prod`.
To deploy to the backup account (only deploys required backup stacks),
specify `--context env=backup`.

Contributor:

It's not related to your PR, but I wanted to mention that at some point we need to update this app.py to match app_template_dev.py. Also, I don't know when we use app.py vs app_template_dev.py. I will create a ticket for this task.

Contributor Author:

I wasn't clear on which parts of app.py needed to be updated to match app_template_dev.py, so I left it as is. I'm happy to copy over app_template_dev.py and remove all the initials-specific stuff if that's preferable!

Contributor:

Me too. I created a ticket for this: #127

maxinelasp and others added 4 commits August 8, 2023 15:24
* added snapshot code, upgraded cdk, upgraded opensearch

* added snapshot unit tests

* fixed opensearch stack unit test for new OS version

* updated doc strings

* spelled out opensearch (os) snapshot variables

* fixed unit test

* added documentation for manual opensearch permissions setup

* fixed ruff issues, removed aws4auth from requirements.txt

* fixed formatting issues

* fixed lambda reqs issue

* fixed snapshot bucket name and unit test
@bryan-harter (Member)

I tried it out and it worked great! I uploaded three files and saw them replicate into the other account. I deleted the files in the primary account and saw that they stayed put in the backup account.

My only comment is that there are a lot of manual steps. It's probably unavoidable, but it would be nice if they were somehow integrated into the code.

@maxinelasp merged commit 43ce7ac into IMAP-Science-Operations-Center:dev on Aug 8, 2023