SSE-S3 for imported datasets + rename imported dataset resources #572

dlpzx · 2023-07-13T09:47:59Z

Feature or Bugfix

Feature
Bugfix

Detail

SSE-S3 for imported datasets

With #515 we added the possibility to introduce the KMS key that encrypts the imported bucket when importing a dataset. This is a required field that cannot be empty. In this PR we allow customers to leave the field empty for the case in which no KMS key is used. We assume that if no key is defined, then the bucket is encrypted using SSE-S3 encryption.

changes in frontend views
changes in definition of RDS column values
added line in migration script to backfill undefined keys with the undefined value for previously imported datasets.

Rename imported dataset resources

In #535 the permissions given to the pivot role were restricted to more specific resources. Most policies where updated to access resource-prefix AWS resources only. For imported datasets some of the data.all created resources in the dataset stack (Glue crawler ....) do not follow this naming pattern. In this PR we update the name of those resources to include the prefix. This way the pivot role will be able to execute actions on them.

Added naming convention pattern
Enforce pattern in dataset import
Added migration script to update the names of existing imported datasets

⚠️ After the migration script is executed, dataset stacks need to be updated for the AWS resources to be updated ⚠️

Testing - in-progress

I tested locally, but now I am also testing in a real deployment:

CICD pipeline completes successfully
KMS RDS info updated as expected
KMS RDS info of new created and imported datasets as expected
CloudFormation stacks of updated stacks of imported and created datasets successful (resources updated)
CloudFormation stacks of new stacks of imported and created datasets successful (resources prefixed)
Crawlers can be run for all dataset cases

Relates

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

…KMS keys

noah-paige

looks good

### Feature or Bugfix Release PR with the following list of features. Refer to each PR for the details ### Detail - #498 - #482 - #543 - #524 (which also solves #531) - #532 - #535 - #497 - #515 - #529 - #562 - #455 - #572 - #567 - #573 - #579 - #578 - #582 ### Breaking changes - release notes - ⚠️ IMPORTANT: upgrade to a version >V1.5.0 before upgrading to V1.6 to avoid deletion of resources in custom resource deletion - ⚠️ IMPORTANT: requires an update of environments and then datasets after upgrading. Either using cdk.json parameter `enable_update_dataall_stacks_in_cicd_pipeline`, waiting for overnight update stack task, or manually updating first environments and then datasets - CloudFront distribution replace for #529 - Additional EC2 permissions in CDK Synth CodeBuild stage for #543 --> this can be avoided by upgrading to v1.5.6 before upgrading to v1.6.0 - local development affected by more restrictive pivotRole trust policy ### Relates V1.6.0 release By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Co-authored-by: Gezim Musliaj <102723839+gmuslia@users.noreply.github.com> Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com> Co-authored-by: nikpodsh <124577300+nikpodsh@users.noreply.github.com> Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com> Co-authored-by: Nikita Podshivalov <nikpodsh@amazon.com> Co-authored-by: dbalintx <132444646+dbalintx@users.noreply.github.com> Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>

dlpzx added 6 commits July 12, 2023 10:14

Add physicalId condition for v1.6.0 backwards compatibility

f24de29

Force dataset_role as execution role in Dataset crawlers

85e41f7

Force dataset_role as execution role in Dataset crawlers

e455aa6

Allow SSE-S3 encryption option in imported datasets

38e423a

Modify naming convention for dataset resources in imported dataset stac

2a49182

Migration script for renaming imported dataset resources + undefined …

ffac2e8

…KMS keys

dlpzx mentioned this pull request Jul 13, 2023

V1.6.0 features #565

Merged

dlpzx requested a review from noah-paige July 13, 2023 09:53

dlpzx added 4 commits July 13, 2023 11:59

Added check for max length KMS alias

90e9afa

Fix migration script

9a536b4

Force update of Glue resources of dataset

d761b37

Fix migration script - downgrade

f6b52cb

dlpzx marked this pull request as ready for review July 13, 2023 14:51

noah-paige approved these changes Jul 13, 2023

View reviewed changes

dlpzx merged commit 5545a07 into v1m6m0 Jul 14, 2023

dlpzx deleted the feature/v1m6m0-validation-changes branch July 14, 2023 09:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SSE-S3 for imported datasets + rename imported dataset resources #572

SSE-S3 for imported datasets + rename imported dataset resources #572

dlpzx commented Jul 13, 2023 •

edited

Loading

noah-paige left a comment

SSE-S3 for imported datasets + rename imported dataset resources #572

SSE-S3 for imported datasets + rename imported dataset resources #572

Conversation

dlpzx commented Jul 13, 2023 • edited Loading

Feature or Bugfix

Detail

SSE-S3 for imported datasets

Rename imported dataset resources

Testing - in-progress

Relates

noah-paige left a comment

Choose a reason for hiding this comment

dlpzx commented Jul 13, 2023 •

edited

Loading