Stack unable to delete ServiceLinkedRoles when upgraded to v1.4.3 #237

silkyroadsilk · 2023-08-15T10:32:13Z

Describe the bug
After updating from version 1.4.1 to 1.4.3, the pipeline failed because it could not delete the existing service-linked roles.

2023-08-14 10:38:42.275 | error | toolkit | Stack Deployments Failed: Error: The stack named AWSAccelerator-AccountsStack-123456789-us-east-1 failed to deploy: UPDATE_ROLLBACK_FAILED (The following resource(s) failed to update: [DenyOnSecurityOUsF05B383A, GuardDutyServiceLinkedRoleCreateServiceLinkedRoleResourceD5FE1FBD, DenyOnMigrated7312F37B, SecurityHubServiceLinkedRoleCreateServiceLinkedRoleResource4CC7EFAA, DenyOnProduction26D683DC, DenyOnSandboxD0F93382, DenyOnDevelopmentC81CE8A0]. ): Received response status [FAILED] from custom resource. Message returned: AccessDeniedException: Resource is not in the state functionActive

AWSAccelerator-AccountsStack-1234567891234-us-east-1 |  0/32 | 10:38:23 AM | UPDATE_FAILED        | Custom::CreateServiceLinkedRole | GuardDutyServiceLinkedRole/CreateServiceLinkedRoleResource/Default (GuardDutyServiceLinkedRoleCreateServiceLinkedRoleResourceD5FE1FBD) Received response status [FAILED] from custom resource. Message returned: AccessDeniedException: Resource is not in the state functionActive
    at Object.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/json.js:61:27)
    at Request.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/rest_json.js:61:8)
    at Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:686:14)
    at Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)
    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:688:12) (RequestId: 9ec79f9b-e8d9-49f3-a973-8f6d44b96d2c)
    new CustomResource (/codebuild/output/src2727/src/s3/00/source/node_modules/aws-cdk-lib/core/lib/custom-resource.js:1:823)
    \_ new ServiceLinkedRole (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/constructs/lib/aws-iam/service-linked-role.ts:87:22)
    \_ AccountsStack.createServiceLinkedRole (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/lib/stacks/accelerator-stack.ts:1210:9)
    \_ AccountsStack.createGuardDutyServiceLinkedRole (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/lib/stacks/accelerator-stack.ts:901:12)
    \_ new AccountsStack (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/lib/stacks/accounts-stack.ts:258:14)
    \_ main (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/bin/app.ts:543:29)
    \_ processTicksAndRejections (node:internal/process/task_queues:96:5)
    \_ async /codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/bin/app.ts:1017:5

AWSAccelerator-AccountsStack-1234567891234-us-east-1 |  0/32 | 10:38:23 AM | UPDATE_FAILED        | Custom::CreateServiceLinkedRole | SecurityHubServiceLinkedRole/CreateServiceLinkedRoleResource/Default (SecurityHubServiceLinkedRoleCreateServiceLinkedRoleResource4CC7EFAA) Received response status [FAILED] from custom resource. Message returned: AccessDeniedException: Resource is not in the state functionActive
    at Object.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/json.js:61:27)
    at Request.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/rest_json.js:61:8)
    at Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:686:14)
    at Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)
    at /var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10
    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)
    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:688:12) (RequestId: 4d9cd5cd-895c-433a-8444-823324098955)
    new CustomResource (/codebuild/output/src2727/src/s3/00/source/node_modules/aws-cdk-lib/core/lib/custom-resource.js:1:823)
    \_ new ServiceLinkedRole (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/constructs/lib/aws-iam/service-linked-role.ts:87:22)
    \_ AccountsStack.createServiceLinkedRole (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/lib/stacks/accelerator-stack.ts:1226:11)
    \_ AccountsStack.createSecurityHubServiceLinkedRole (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/lib/stacks/accelerator-stack.ts:957:12)
    \_ new AccountsStack (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/lib/stacks/accounts-stack.ts:261:14)
    \_ main (/codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/bin/app.ts:543:29)
    \_ processTicksAndRejections (node:internal/process/task_queues:96:5)
    \_ async /codebuild/output/src2727/src/s3/00/source/packages/@aws-accelerator/accelerator/bin/app.ts:1017:5

To Reproduce
I re-ran the AWSAccelerator-Pipeline after upgrading landing-zone-accelerator-on-aws to version 1.4.3. The pipeline was unable to delete the roles 'AWSServiceRoleForSecurityHub', 'AWSServiceRoleForAccessAnalyzer' and 'AWSServiceRoleForAmazonGuardDuty'; the reason given was AccessDeniedException.

Expected behavior
When the pipeline runs and the roles already exist, I expect it to delete the existing roles and replace them with new ones.

Additional context
I also tried to delete a role by hand in the AWS console and got the following error:
IAM Access Analyzer is enabled in one or more regions in your AWS organization. Ask your administrator to delete all analyzers in all regions for your organization before attempting to delete this role.
Having seen this message, I made sure that no Access Analyzers exist in any region and tried to delete the role again after some time. The same error persists even though there are no analyzers.

Here is an extract of the CloudWatch logs:

2023-08-15T09:47:14.528Z	5c61f759-c077-40db-90ca-772b14b6cdb6	INFO	[provider-framework] executing user function arn:aws:lambda:us-east-1:123456789123:function:AWSAccelerator-AccountsSt-AccessAnalyzerServiceLin-knBwOCWbfBtn with payload 
{
    "RequestType": "Update",
    "ServiceToken": "arn:aws:lambda:us-east-1:123456789123:function:AWSAccelerator-AccountsSt-AccessAnalyzerServiceLin-HGbb5TyW6yG6",
    "ResponseURL": "...",
    "StackId": "arn:aws:cloudformation:us-east-1:123456789123:stack/AWSAccelerator-AccountsStack-123456789123-us-east-1/f7e419b0-1fef-11ee-847f-1284cfa3114f",
    "RequestId": "e2c90ef4-4b21-4e5f-b43e-1266babd0e9a",
    "LogicalResourceId": "AccessAnalyzerServiceLinkedRoleCreateServiceLinkedRoleResource7C0C5637",
    "PhysicalResourceId": "1ac3cae3-239d-42bb-b4cc-7d68dae0f523",
    "ResourceType": "Custom::CreateServiceLinkedRole",
    "ResourceProperties": {
        "ServiceToken": "arn:aws:lambda:us-east-1:123456789123:function:AWSAccelerator-AccountsSt-AccessAnalyzerServiceLin-HGbb5TyW6yG6",
        "roleName": "AWSServiceRoleForAccessAnalyzer",
        "serviceName": "access-analyzer.amazonaws.com",
        "uuid": "9bf3a309-8b93-4ef4-b772-2d3120e2c7b8"
    },
    "OldResourceProperties": {
        "ServiceToken": "arn:aws:lambda:us-east-1:123456789123:function:AWSAccelerator-AccountsSt-AccessAnalyzerServiceLin-HGbb5TyW6yG6",
        "roleName": "AWSServiceRoleForAccessAnalyzer",
        "serviceName": "access-analyzer.amazonaws.com",
        "uuid": "3cec1420-e77d-4d38-bfa4-6cf13b2c2e01"
    }
}

2023-08-15T09:47:15.692Z	5c61f759-c077-40db-90ca-772b14b6cdb6	INFO	[provider-framework] submit response to cloudformation 
{
    "Status": "FAILED",
    "Reason": "AccessDeniedException: Resource is not in the state functionActive\n    at Object.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/json.js:61:27)\n    at Request.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/rest_json.js:61:8)\n    at Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:106:20)\n    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:78:10)\n    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:686:14)\n    at Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)\n    at AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)\n    at /var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10\n    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)\n    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:688:12)",
    "StackId": "arn:aws:cloudformation:us-east-1:123456789123:stack/AWSAccelerator-AccountsStack-123456789123-us-east-1/f7e419b0-1fef-11ee-847f-1284cfa3114f",
    "RequestId": "e2c90ef4-4b21-4e5f-b43e-1266babd0e9a",
    "PhysicalResourceId": "1ac3cae3-239d-42bb-b4cc-7d68dae0f523",
    "LogicalResourceId": "AccessAnalyzerServiceLinkedRoleCreateServiceLinkedRoleResource7C0C5637"
}


KashifSaadat · 2023-08-15T11:08:12Z

Judging from the CloudWatch logs, my guess is that this issue was introduced in v1.4.3 by the following commit: 5854321

Because the UUID in the resource properties changes every time, each run attempts to delete and recreate the ServiceLinkedRole, which won't work for the GuardDuty, AccessAnalyzer and SecurityHub SLRs when those features are enabled in the solution. There is no PR or comment related to the commit above. Can someone confirm whether this is the cause, and why the UUID change was introduced for SLRs? What is the recommended procedure to recover from this issue? We cannot progress with any changes to the solution.

Edit: We rolled back to v1.4.2 and the pipeline succeeded. So I suspect it was the change linked above that caused the problem.
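
Editor's illustration of the diagnosis above: a minimal sketch that compares the `ResourceProperties` and `OldResourceProperties` from the CloudWatch payload quoted earlier in this issue. The helper name `changedKeys` and the `Props` type are hypothetical and are not part of the accelerator code. The property values are copied from the logged payload.

```typescript
// Hypothetical helper (not part of the accelerator): lists the property keys
// whose values differ between the old and new custom resource properties.
type Props = Record<string, string>;

function changedKeys(oldProps: Props, newProps: Props): string[] {
  const allKeys = Object.keys(oldProps).concat(Object.keys(newProps));
  return allKeys
    .filter((key, index) => allKeys.indexOf(key) === index) // de-duplicate
    .filter((key) => oldProps[key] !== newProps[key]) // keep only differences
    .sort();
}

// Values copied from the "Update" request logged by the provider framework.
const serviceToken =
  "arn:aws:lambda:us-east-1:123456789123:function:AWSAccelerator-AccountsSt-AccessAnalyzerServiceLin-HGbb5TyW6yG6";

const oldResourceProperties: Props = {
  ServiceToken: serviceToken,
  roleName: "AWSServiceRoleForAccessAnalyzer",
  serviceName: "access-analyzer.amazonaws.com",
  uuid: "3cec1420-e77d-4d38-bfa4-6cf13b2c2e01",
};

const resourceProperties: Props = {
  ServiceToken: serviceToken,
  roleName: "AWSServiceRoleForAccessAnalyzer",
  serviceName: "access-analyzer.amazonaws.com",
  uuid: "9bf3a309-8b93-4ef4-b772-2d3120e2c7b8",
};

// Only "uuid" differs between the two property sets.
console.log(changedKeys(oldResourceProperties, resourceProperties));
```

In the logged request the role name and service name are unchanged, so the UUID is the only property that makes CloudFormation treat the resource as updated.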

atte-hemminki · 2023-08-16T16:48:37Z

We have experienced the same issue in our environment. The pipeline seems to complete the Accounts step successfully at random (it might take one retry of the Accounts step, sometimes up to four).

bo1984 · 2023-08-21T14:54:52Z

Thank you for bringing this to our attention, @silkyroadsilk. We're aware of this issue and expect to address it in our next release. As @atte-hemminki noted, a workaround is to retry the stage. As this is already being tracked, I will keep this issue open and update you once it has been addressed in a later release.

de-cx-cloud · 2024-04-24T21:02:39Z

I have encountered the same issue consistently across all releases of the pipeline, specifically with the CloudFormation stack in the us-east-1 region. The CloudFormation stack is unable to delete the AWSAccelerator ServiceLinkedRoles, so I have to manually destroy the stack multiple times until the roles are successfully deleted.

This issue is reproducible with the following specifications:
LZA Version: 1.6.2
Template: TSE-SE

❌ Deployment failed: Error: The stack named AWSAccelerator-AccountsStack-635719067474-us-east-1 failed to deploy: DELETE_FAILED (The following resource(s) failed to delete: [SecurityHubServiceLinkedRoleCreateServiceLinkedRoleResource4CC7EFAA]. ): Received response status [FAILED] from custom resource. Message returned: TimeoutError: {"state":"TIMEOUT","reason":"Waiter has timed out"}

alexhaycock · 2024-04-26T18:54:30Z

@de-cx-cloud I'm seeing the exact same issue as you: after multiple retries of that stage, it finally works. We have another LZA deployment that does not use the default prefix 'AWSAccelerator', and we have never really seen this error there.

spyoungtech · 2024-05-24T16:41:14Z

I have a similar issue with a custom resource lambda. It just randomly times out, according to CFN. The message returned ("waiter timed out") is obviously part of the framework code, not my lambda itself.

For example, the custom resource lambda does nothing on resource deletions (because it basically always 'retains' the underlying resource). So it's not clear to me why the lambda is timing out, even in cases where the lambda action is basically a no-op. Retrying several times resolves the issue, but it's really frustrating, especially during stack creation, where this error causes the entire stack creation to roll back.
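
Editor's illustration: the comment above describes a handler that does nothing on deletions. A minimal sketch of a handler of that shape follows, assuming a provider-framework-style event with a `RequestType` field as seen in the payload earlier in this issue. Everything else here (the `onEvent` name, the `performedWork` flag, the fallback ID) is hypothetical and is not the commenter's actual code.

```typescript
// Hypothetical event shape: only the fields this sketch reads.
interface CustomResourceEvent {
  RequestType: "Create" | "Update" | "Delete";
  PhysicalResourceId?: string;
}

// Hypothetical result shape; performedWork exists only to make the
// no-op path visible in this sketch.
interface HandlerResult {
  PhysicalResourceId: string;
  performedWork: boolean;
}

// On Delete the handler returns immediately and echoes the existing physical
// ID, so the underlying resource is retained and no API call is made.
function onEvent(event: CustomResourceEvent): HandlerResult {
  const physicalId = event.PhysicalResourceId ?? "retained-resource";
  if (event.RequestType === "Delete") {
    return { PhysicalResourceId: physicalId, performedWork: false };
  }
  // Create and Update would do the real work here (omitted in this sketch).
  return { PhysicalResourceId: physicalId, performedWork: true };
}

const deleteResult = onEvent({
  RequestType: "Delete",
  PhysicalResourceId: "example-id",
});
console.log(deleteResult.performedWork); // false
```

A handler like this returns at once on Delete, which fits the commenter's observation that the "Waiter has timed out" message comes from the framework code wrapped around the handler rather than from the handler itself.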

itmustbejj · 2024-08-09T01:09:53Z

It's been a year. Is there any movement on this? I waste so much time retrying the Accounts stage because of this error. I had to retry this 4 times before it would finally work today, which is a typical experience with this bug.

gustavo-guerra-compasso · 2024-08-16T16:59:27Z

I have the same problem.

richardkeit · 2024-08-16T19:21:30Z

@itmustbejj, @gustavo-guerra-compasso - what versions are you on?

Providing as much detail as possible can help prioritise. For example, a post above says that a deployment not using the default prefix doesn't see this issue.

gustavo-guerra-compasso · 2024-08-16T19:53:26Z

I'm using version 1.9.1 and have the same problem that @de-cx-cloud is having. The Accounts stage sometimes times out and I have to retry the stage.

mbevc1 · 2024-08-31T09:54:58Z

Similar here with
UPDATE_FAILED | Custom::CreateServiceLinkedRole | MacieServiceLinkedRole/CreateServiceLinkedRoleResource/Default

Using v1.9.2

adielLevyAllcloud · 2024-09-17T11:44:05Z

Same issue
Similar here with
UPDATE_FAILED | Custom::CreateServiceLinkedRole | MacieServiceLinkedRole/CreateServiceLinkedRoleResource/Default

Using v1.9.2

silkyroadsilk added the bug label Aug 15, 2023

silkyroadsilk mentioned this issue Aug 15, 2023

Stack unable to delete ServiceLinkedRoles when upgraded to v1.4.3 #236

Closed

aaronbrighton mentioned this issue Sep 19, 2023

custom_resources: Provider Lambda function is missing lambda:GetFunctionConfiguration aws/aws-cdk#26838

Open

jasoncaoawshc mentioned this issue Sep 19, 2023

fix(custom_resources): Provider Lambda function is missing lambda:GetFunctionConfiguration permission aws/aws-cdk#27204

Closed

jasoncao99 mentioned this issue Oct 12, 2023

fix(custom_resources): Provider Lambda function is missing lambda:GetFunctionConfiguration permission aws/aws-cdk#27524

Closed

mbevc1 mentioned this issue Sep 3, 2024

bug: Accounts pipeline often times out #556

Open
