Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(custom-resources): Document possibility of deletion failure when migrating from AwsCustomResource created prior to 1.92.0 #25668

Open
0xdevalias opened this issue May 22, 2023 · 4 comments
Labels
@aws-cdk/custom-resources Related to AWS CDK Custom Resources bug This issue is a bug. documentation This is a problem with documentation. p2

Comments

@0xdevalias
Copy link
Contributor

Describe the bug

I had a very old (AWS CDK 1.32.2) custom resource deployed, I updated to the latest 1.x version of AWS CDK (on my way to 2.x) as recommended in the upgrade guide. I removed the old AwsCustomResource from my stack, and replaced it with a construct UserPoolDomainTarget that uses a similar one under the hood.

When I was deploying my stack, I got the following error:

2023-05-22 18:58:44 UTC+1000 | DescribeCognitoUserPoolDomain9D8EB6B4 | DELETE_FAILED | Received response status [FAILED] from custom resource. Message returned: Unexpected token o in JSON at position 1 (RequestId: 3c59ea0a-3fdf-4aef-a0cc-a665da419f4d)

Looking at the logs that back the lambda function running this AwsCustomResource, the following is relevant:

2023-05-22T08:55:34.879Z	852f9a4a-1c59-494d-95dd-bbbf764c6eb4	INFO	{
    "RequestType": "Delete",
    "ServiceToken": "arn:aws:lambda:us-east-1:REDACTED:function:REDACTED-Auth-REDACTED",
    "ResponseURL": "...",
    "StackId": "arn:aws:cloudformation:us-east-1:REDACTED:stack/REDACTED-Auth/0798a8c0-9f1b-11ea-8c14-0e09773e6f3f",
    "RequestId": "c4462bea-a5f3-4ccd-81af-402f09e47c63",
    "LogicalResourceId": "DescribeCognitoUserPoolDomain9D8EB6B4",
    "PhysicalResourceId": "auth.dev.REDACTED",
    "ResourceType": "Custom::DescribeCognitoUserPoolDomain",
    "ResourceProperties": {
        "ServiceToken": "arn:aws:lambda:us-east-1:REDACTED:function:REMDR-Auth-REDACTED",
        "Create": {
            "physicalResourceId": {
                "id": "auth.dev.REDACTED"
            },
            "service": "CognitoIdentityServiceProvider",
            "action": "describeUserPoolDomain",
            "region": "us-east-1",
            "parameters": {
                "Domain": "auth.dev.REDACTED"
            }
        }
    }
}

2023-05-22T08:55:34.879Z	852f9a4a-1c59-494d-95dd-bbbf764c6eb4	INFO	AWS SDK VERSION: 2.1381.0

2023-05-22T08:55:34.893Z	852f9a4a-1c59-494d-95dd-bbbf764c6eb4	INFO	SyntaxError: Unexpected token o in JSON at position 1
    at JSON.parse (<anonymous>)
    at decodeCall (/var/task/index.js:240:17)
    at Runtime.handler (/var/task/index.js:125:43)
    at Runtime.handleOnceNonStreaming (/var/runtime/Runtime.js:74:25)

Note that the ResourceProperties.Create is NOT JSON.stringify'd in the above logs, as this was deployed by a very old version of AWS CDK.

Looking at the code for the lambda function:

The handler first tries to decode the event.ResourceProperties.* data, yet because decodeCall is expecting JSON.stringify'd data, it crashes:

        console.log(JSON.stringify({ ...event, ResponseURL: '...' }));
        console.log('AWS SDK VERSION: ' + AWS.VERSION);
        event.ResourceProperties.Create = decodeCall(event.ResourceProperties.Create);
        event.ResourceProperties.Update = decodeCall(event.ResourceProperties.Update);
        event.ResourceProperties.Delete = decodeCall(event.ResourceProperties.Delete);
function decodeCall(call) {
    if (!call) {
        return undefined;
    }
    return JSON.parse(call);
}

To fix this, decodeCall should be updated to first check if the data is an object/string, and only try and decode it if it's a string. It might also make sense to add some try/catch logic here so that end users get better logs about what went wrong and why (or even better, if it gracefully recovers from errors like this in future)

Expected Behavior

The lambda code backing the AwsCustomResource would correctly handle both legacy non-JSON.strigify'd code, as well as the modern JSON.stringify'd version without crashing.

Current Behavior

2023-05-22 18:58:44 UTC+1000 | DescribeCognitoUserPoolDomain9D8EB6B4 | DELETE_FAILED | Received response status [FAILED] from custom resource. Message returned: Unexpected token o in JSON at position 1 (RequestId: 3c59ea0a-3fdf-4aef-a0cc-a665da419f4d)

Reproduction Steps

See description in 'Describe the bug' above.

Possible Solution

See description in 'Describe the bug' above.

Additional Information/Context

See description in 'Describe the bug' above.

CDK CLI Version

1.201.0 (build 2ca8e92)

Framework Version

1.201.0

Node.js Version

v16.15.1

OS

macOS Ventura 13.3.1

Language

Typescript

Language Version

No response

Other information

See description in 'Describe the bug' above.

@0xdevalias 0xdevalias added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels May 22, 2023
@github-actions github-actions bot added the @aws-cdk/custom-resources Related to AWS CDK Custom Resources label May 22, 2023
@0xdevalias
Copy link
Contributor Author

0xdevalias commented May 22, 2023

Doing a little digging, I suspect this may have been introduced with this change introduced in 1.92.0 (2021-03-06):

@peterwoodworth
Copy link
Contributor

I'm not sure there's much we can do, this should have been hidden behind a feature flag.

Check out this thread, is the information here able to unblock you? If you can find a way to move along with deployment while ignoring the custom resource error, your stack should still be in good shape. You just might have to manually delete whatever was initially created with your custom resource

@peterwoodworth peterwoodworth added p2 response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels May 23, 2023
@0xdevalias
Copy link
Contributor Author

Check out #13486, is the information here able to unblock you?

Not sure how I missed that thread when searching!

If you can find a way to move along with deployment while ignoring the custom resource error, your stack should still be in good shape. You just might have to manually delete whatever was initially created with your custom resource

Yeah, I seemed to be able to get past it, I think since I didn't actually have anything in the 'on delete' of the custom resource it just sort of 'fixed itself up'; but wanted to raise it here for others who may run into the issue.

I'm not sure there's much we can do

It might be overkill given CDK 1.x is just about EOL. And I'm not sure if it would work around the problem as I suspect it still may trigger the existing custom resource when trying to deploy a new version of it; but my thought was basically that the 'proper fix' at the framework level would be to fix the code in the custom resource lambda handler so that it can handle both the old way and the new way (if it's already an object, don't try and deserialise it; otherwise if it's a string, call JSON.parse)

Either way, not really a blocker for me anymore, but for the sanity of future users upgrading in the path of 'latest 1.x version before 2.x' as recommended in the CDK upgrade guide, could be worth trying to add a fix, or at least add it to some known issues caveats/document somewhere around the upgrade guide.


Reference from other thread that may help others:

Unfortunately that was not possible. The CloudFormation stack was stuck for a while trying to delete the resource(s) and then just deleted the resource reference from the stack. The underlying resources where still there, in my case ACM certs and custom resource lambdas. I had to delete them manually. So, the worst thing is that you end up with a stack that deletes the custom resources from the stack, but the physical resources stay.

Originally posted by @dirknilius in #13486 (comment)

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label May 24, 2023
@peterwoodworth peterwoodworth added the documentation This is a problem with documentation. label May 24, 2023
@peterwoodworth
Copy link
Contributor

Sure, we can see about adding something about this to the devguide. Thanks!

@peterwoodworth peterwoodworth changed the title (custom-resources): Unexpected token o in JSON at position 1 (custom-resources): Document possibility of deletion failure when migrating from AwsCustomResource created prior to 1.92.0 May 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/custom-resources Related to AWS CDK Custom Resources bug This issue is a bug. documentation This is a problem with documentation. p2
Projects
None yet
Development

No branches or pull requests

2 participants