fix(eks): cluster-resource-handler fails to verify delete operations #26375
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
The recent upgrade to the underlying aws-sdk-js dependency in the custom resource handler caused deletion events to fail. This was due to the response structure differing between sdk versions, and the new response lacked the property the handler was previously targeting.
This change leverages the
httpResponseCode
to follow the same logic as before.Some additional changes that come with this fix:
hello-k8s
chart.There's a broader question I have about the use of external Helm charts in these snapshot tests which do not target specific versions--this causes the snapshot tests, which are supposed to be deterministic, to become indeterministic. Has the EKS team considered owning a few simple helm chart repos which provide us with stable versions (and thus deterministic tests)?
I spent nearly 5 days fighting with the snapshot tests and found that a few tests (eks-cluster-test, bottlerocket, and cluster-imported test) would never use new assets and instead would deploy using the old. If I synthesized the test using
npx cdk -a test/integ.testname.js
and deployed that it would use the new assets. The only way I got past this was to completely delete the existing snapshot directory. I couldn't find references to this behavior anywhere, but I wanted to call it out in case I deleted something that was not replaced with the snapshot update.Remaining work
The unit tests for the cluster and fargate handler were not updated. I took a stab at them but the interface mocks that we have are no longer accurate, and I couldn't get them to mimic this change.
Fixes #26325
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license