Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(dynamodb): replicas not created on table replacement #13300

Merged
merged 9 commits into from
Mar 5, 2021

Conversation

jogold
Copy link
Contributor

@jogold jogold commented Feb 26, 2021

Process Update events resulting from table replacements.

Include the table name in the physical resource id to receive a Delete
event when the table is replaced. This allows to clean "old" replicas.

Use a managed policy instead of an inline policy for the custom resource.
An update of the description property of a managed policy requires a
replacement. If we use the table name in the description it forces a
managed policy replacement when the table name changes. This way we preserve
permissions to delete old replicas in case of a table replacement: a new
managed policy with permissions for the new table is created during the
update phase and the old managed policy with permissions for the old table is
removed only during the update clean up phase.

The logical ID of the SourceTableAttachedPolicy needs to be updated because
CF doesn't allow to change a resource type.

Closes #12332


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@gitpod-io
Copy link

gitpod-io bot commented Feb 26, 2021

@github-actions github-actions bot added the @aws-cdk/aws-dynamodb Related to Amazon DynamoDB label Feb 26, 2021
@jogold
Copy link
Contributor Author

jogold commented Feb 26, 2021

Successfully tested the following:

  • Deploy integ.global.js with version on master
  • Redeploy with version in this branch: existing replicas are preserved, inline policies are replaced by managed policies
  • Update partition key from id to new_id and deploy: a new table is created, new managed policies are created, new replicas are created, old replicas are deleted, old managed policies are deleted, old table is deleted
  • Delete stack: successful

@skinny85
Copy link
Contributor

skinny85 commented Mar 2, 2021

Sorry @jogold , I was on vacation. I'll look at the PR this week.

Thanks for the contribution!

Copy link
Contributor

@skinny85 skinny85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments/question, but in general looks great as always @jogold !

Did you go through an entire cycle of 1) creating a global Table with the old code, 2) updating the Table to this new code, to see if nothing breaks, 3) forcing a replacement, and 4) verifying the replacement is handled correctly? If not, could you please? Thanks!

console.log('Update table: %j', data);

if (event.RequestType === 'Create' || event.RequestType === 'Update') {
return { PhysicalResourceId: `${event.ResourceProperties.TableName}-${event.ResourceProperties.Region}` };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually 2 functional changes, correct?

  1. Previously, we returned PhysicalResourceId also for deletes, now we don't.
  2. We used to return event.ResourceProperties.Region as the physical ID, now we return `${event.ResourceProperties.TableName}-${event.ResourceProperties.Region}`.

So, is 1. intentional, or accidental? And what about 2. - what will be the behavior for existing customers of global DB Tables?

Copy link
Contributor Author

@jogold jogold Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For existing users of global DB tables if they update to the new the code the only thing that will happen is the replacement of the inline policies with managed policies. This is because nothing changes on the properties of the custom resource so it won't receive any event (Create/Update/Delete).

Now if later a user decides to remove one or more replica regions, the custom resource will receive a Delete event. If we don't change how the physical resource id is returned for a Delete event then CF will receive a new physical resource id (table-region instead of region). Changing the physical resource id in a Delete event is not allowed by CF and the Delete will fail (from a CF perspective and this also means that it will not wait for isComplete). So yes 1. is intentional. (this explanation is also valid for a table replacement when the old replica gets deleted).

Returning an empty object on delete ensures that the custom resource provider framework uses the physical resource id from the event (should have been done in the first place).

EDIT: it's the framework, not CF, that doesn't allow to change the physical resource id during a Delete event

throw new Error(`DELETE: cannot change the physical resource ID from "${cfnRequest.PhysicalResourceId}" to "${onEventResult.PhysicalResourceId}" during deletion`);

// a replacement. Use the table name in the description to force a managed
// policy replacement when the table name changes. This way we preserve permissions
// to delete old replicas in case of a table replacement.
description: `DynamoDB replication managed policy for table ${sourceTable.tableName}`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an awesome trick! I'll have to remember this one to use in other Custom Resources.

I'm just a little worried about the delete order - since this depends on the Table, won't it be deleted first, before the replica?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine because it depends on the role used by onEventHandler and isCompleteHandler, see also #8224 for more context on this SourceTableAttachedPolicy

See also my test #13300 (comment).

@jogold
Copy link
Contributor Author

jogold commented Mar 4, 2021

Did you go through an entire cycle of 1) creating a global Table with the old code, 2) updating the Table to this new code, to see if nothing breaks, 3) forcing a replacement, and 4) verifying the replacement is handled correctly? If not, could you please? Thanks!

Yes, see #13300 (comment)

@mergify mergify bot dismissed skinny85’s stale review March 4, 2021 08:44

Pull request has been modified.

Copy link
Contributor

@skinny85 skinny85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @jogold!

@mergify
Copy link
Contributor

mergify bot commented Mar 5, 2021

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject89A8053A-LhjRyN9kxr8o
  • Commit ID: e61f9a7
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify
Copy link
Contributor

mergify bot commented Mar 5, 2021

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@mergify mergify bot merged commit c7c424f into aws:master Mar 5, 2021
cornerwings pushed a commit to cornerwings/aws-cdk that referenced this pull request Mar 8, 2021
Process `Update` events resulting from table replacements.

Include the table name in the physical resource id to receive a `Delete`
event when the table is replaced. This allows to clean "old" replicas.

Use a managed policy instead of an inline policy for the custom resource.
An update of the description property of a managed policy requires a
replacement. If we use the table name in the description it forces a
managed policy replacement when the table name changes. This way we preserve
permissions to delete old replicas in case of a table replacement: a new
managed policy with permissions for the new table is created during the
update phase and the old managed policy with permissions for the old table is
removed only during the update clean up phase.

The logical ID of the `SourceTableAttachedPolicy` needs to be updated because
CF doesn't allow to change a resource type.

Closes aws#12332


----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-dynamodb Related to Amazon DynamoDB
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(aws-dynamodb): No replica created when new table created via Update event
3 participants