(custom-resources): AwsCustomResource leaks assumed role to other custom resources #15425
Comments
I've been trialling @nicolai-shape's suggestion above for roughly a week now, with great success. 👍 While the fix itself seems pretty straightforward, I guess the complication is how to test it. I've prepared some failing tests to describe the scenario, which then start passing once the above fix gets applied: theipster#1. Does this look useful / worthy of submitting a formal PR?
@theipster, to me it looks very useful, and this bug is a showstopper for me as well as a security issue. So it would be great if you could submit a formal PR.
Yeah, it's not ideal, but I ended up working around this by using multiple nested stacks just to house the lambda/AwsCustomResource, because the Lambda is stack-scoped and thus its execution context stays consistent for role assumption. YMMV of course if you use more discrete principal role names in your target roles.
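A minimal sketch of that nested-stack approach, assuming CDK v1-style imports; the construct names, SDK call, and role ARN prop here are illustrative rather than actual code from my project:

```ts
import * as cdk from '@aws-cdk/core';
import * as cr from '@aws-cdk/custom-resources';

// Each nested stack gets its own singleton provider Lambda, so credentials
// cached for one assumedRoleArn cannot leak into calls made by custom
// resources living in a different (nested) stack.
export class CrossAccountDnsStack extends cdk.NestedStack {
  constructor(scope: cdk.Construct, id: string, props: { assumedRoleArn: string }) {
    super(scope, id);

    new cr.AwsCustomResource(this, 'DnsRecord', {
      onUpdate: {
        service: 'Route53',
        action: 'changeResourceRecordSets',
        parameters: {
          // HostedZoneId, ChangeBatch, ... (omitted in this sketch)
        },
        physicalResourceId: cr.PhysicalResourceId.of('DnsRecord'),
        assumedRoleArn: props.assumedRoleArn, // role in the account that owns the zone
      },
      policy: cr.AwsCustomResourcePolicy.fromSdkCalls({
        resources: cr.AwsCustomResourcePolicy.ANY_RESOURCE,
      }),
    });
  }
}
```

Because the provider Lambda is created once per stack, each nested stack gets its own function and therefore its own isolated execution context.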
I've discovered a related issue when you have one custom resource with multiple SDK calls (onCreate, onUpdate, onDelete). In my case, the custom resource creates a DNS record in a hosted zone in another account, using a cross-account role.
In the logs I can see this is because the role is already being used by the Lambda, which then tries to assume itself.
Workaround: I have been able to work around this by granting my role permission to assume itself.
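A rough sketch of that grant, assuming the target role is also managed in CDK (if it lives in another account, the equivalent trust and identity policy changes have to be made there; the role variable here is hypothetical):

```ts
import * as iam from '@aws-cdk/aws-iam';

// Hypothetical handle to the cross-account role used in the SDK calls.
declare const crossAccountRole: iam.Role;

// Identity policy: allow the role to call sts:AssumeRole on its own ARN.
crossAccountRole.addToPolicy(new iam.PolicyStatement({
  actions: ['sts:AssumeRole'],
  resources: [crossAccountRole.roleArn],
}));

// Trust policy: the role must also trust itself as a principal, otherwise
// the second AssumeRole attempt from the already-assumed session is rejected.
crossAccountRole.assumeRolePolicy?.addStatements(new iam.PolicyStatement({
  actions: ['sts:AssumeRole'],
  principals: [new iam.ArnPrincipal(crossAccountRole.roleArn)],
}));
```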
Hey @mattvanstone. Just to confirm, is the target assumeRoleArn the same each time, or does it switch between different account IDs? I have a pattern which uses AWS Custom Resources in this fashion too, for manipulating remote/spoke account Route53ResolverRule associations via the SDK API. What I found, however, is that my failures were due to the singleton Lambda trying to reissue the API call with the role it had assumed and cached for a different account in its context. As my pattern issues hundreds of calls to the AWSCustomResource within seconds, the Lambda wouldn't claim a new STS role assumption for the provided target role ARN in the given spoke account each time and would seemingly reuse whatever it last used. Given this pattern performs these 100+ Lambda calls within a few seconds across a dozen or more accounts, it would ALWAYS hit this same error. Workaround? As per the AWS documentation, each AWSCustomResource is a singleton Lambda per stack in which it is generated; all other instantiations are simply invocation calls to that Lambda function during CloudFormation operations. LMK what you think.
@julienbonastre In my scenario the target assumeRoleArn is always the same ARN, so you would think it would work. But if I trigger an onCreate and then immediately update the stack and trigger an onUpdate or onDelete, the Lambda is executed again; the second time it already has the role from the onCreate, and when it is passed the assumedRoleArn parameter again it tries to assume it. Since the Lambda is already assuming it, this errors if the permissions on that role don't allow it to assume itself. Before I solved my issue I looked at the workaround you mentioned, but since I'm trying to run different SDK calls on the same resource it wouldn't work for me. Seems like two different issues, but with the same root cause.
Ultimately, the way I see the situation is quite simple: the Lambda was designed to be a singleton, therefore it should ideally be made stateless, or else there will always be a risk of prior side effects affecting the output, as seen in all of the scenarios mentioned above. Unfortunately, in its current form, the Lambda isn't stateless: state exists in the form of the credentials written to the global AWS SDK configuration, which survive across warm invocations. While the workaround using nested stacks might appear to work, it only limits how far that state can leak rather than removing it.
To summarise, making the Lambda stateless will solve the problem while keeping true to the original design.
…leaked to next execution (#15776) Fixes #15425 Credit to @nicolai-shape for proposing the fix itself. ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
This is still an issue.
The runtime handler of the `AwsCustomResource` does not correctly reset the credentials after executing when given an `assumedRoleArn` in any of the `AwsSdkCall` objects. This means that if you have two `AwsCustomResource` constructs in the same stack, and the first one that is deployed supplies an `assumedRoleArn`, then the second one will fail to deploy if it executes any commands that are not covered by the policy of the assumed role of the first custom resource. This obviously only happens if the execution context of the Lambda is reused, which is quite likely if you have a dependency between the custom resources so they're not executed concurrently.
Reproduction Steps
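The original reproduction snippet isn't shown here; below is a minimal sketch along the lines of the description above, with placeholder ARNs, parameter names, and SDK calls:

```ts
import * as cdk from '@aws-cdk/core';
import * as cr from '@aws-cdk/custom-resources';

export class ReproStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);

    // First custom resource: supplies an assumedRoleArn for its SDK call.
    const first = new cr.AwsCustomResource(this, 'FirstWithAssumedRole', {
      onUpdate: {
        service: 'SSM',
        action: 'getParameter',
        parameters: { Name: '/remote/parameter' },
        physicalResourceId: cr.PhysicalResourceId.of('FirstWithAssumedRole'),
        assumedRoleArn: 'arn:aws:iam::111111111111:role/RemoteReadRole', // placeholder
      },
      policy: cr.AwsCustomResourcePolicy.fromSdkCalls({
        resources: cr.AwsCustomResourcePolicy.ANY_RESOURCE,
      }),
    });

    // Second custom resource: no assumedRoleArn, so it should run with the
    // Lambda's own execution role, but it picks up the leaked credentials.
    const second = new cr.AwsCustomResource(this, 'SecondWithoutAssumedRole', {
      onUpdate: {
        service: 'SSM',
        action: 'getParameter',
        parameters: { Name: '/local/parameter' },
        physicalResourceId: cr.PhysicalResourceId.of('SecondWithoutAssumedRole'),
      },
      policy: cr.AwsCustomResourcePolicy.fromSdkCalls({
        resources: cr.AwsCustomResourcePolicy.ANY_RESOURCE,
      }),
    });

    // Force sequential execution so the warm Lambda context is reused.
    second.node.addDependency(first);
  }
}
```

The explicit dependency makes the two deployments sequential, which makes it very likely that the same warm Lambda environment handles both calls.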
What did you expect to happen?
Deployment of both resources should succeed.
What actually happened?
The deployment of the second custom resource fails due to insufficient permissions of the role that was specified for the first custom resource.
Environment
Other
The issue is located here: https://github.com/aws/aws-cdk/blob/master/packages/@aws-cdk/custom-resources/lib/aws-custom-resource/runtime/index.ts#L134. I propose to replace lines 134-143 with the following:
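(The exact replacement from the original report isn't reproduced here; the following is a sketch of the proposed approach, reusing the handler's existing `call` and `physicalResourceId` variables.)

```ts
// Sketch only: `call` and `physicalResourceId` are the variables already
// defined in the handler; the credentials are scoped to this one client
// instead of being assigned to the global AWS.config.
let credentials;
if (call.assumedRoleArn) {
  const timestamp = (new Date()).getTime();

  const params = {
    RoleArn: call.assumedRoleArn,
    RoleSessionName: `${timestamp}-${physicalResourceId}`.substring(0, 64),
  };

  credentials = new AWS.ChainableTemporaryCredentials({ params });
}

const awsService = new (AWS as any)[call.service]({
  apiVersion: call.apiVersion,
  credentials,
  region: call.region,
});
```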
Instead of modifying the global AWS SDK config, we only apply it to the temporary service client.
This is a 🐛 Bug Report.