Note: I'll make a PR, it's a small change
Description:
Introduced by my last PR #3919
sam deploy gets stuck in the describe_stack_events loop and never catches any events, because the stack has already finished deploying.
This happens when CloudFormation throttles requests because many deploys are running in parallel.
Steps to reproduce:
Use a small stack like:
    SomeTopic:
      Type: AWS::SNS::Topic
      Properties:
        Subscription:
          - Protocol: email
            Endpoint: emailh@example.com

This will take like ~5 seconds to deploy. And if deployed alone, there is no problem.
But if you have 50+ stacks being deployed at the same time (in CI pipelines) with the env variable AWS_MAX_ATTEMPTS: 30 set to retry a lot, there will be a lot of calls to DescribeStackEvents and CloudFormation will throttle.
So just after execute_changeset, the function get_last_event_time might take a few seconds to return, and the time it returns is then too late.
Example timestamp output of wait_for_execute:
No parallel deploy - no throttle - 0s time diff
Thu, 21 Jun 2022 14:24:32 GMT 2022-06-21 14:24:32 - Waiting for stack create/update to complete
Thu, 21 Jun 2022 14:24:32 GMT CloudFormation events from stack operations (refresh every 0.5 seconds)
Many parallel deploys - throttle - 11s time diff
Mon, 20 Jun 2022 14:09:28 GMT 2022-06-20 14:09:28 - Waiting for stack create/update to complete
Mon, 20 Jun 2022 14:09:39 GMT CloudFormation events from stack operations (refresh every 0.5 seconds)
By then it is too late: the stack is already deployed and its events are missed. Because I removed describe_stack from the loop, the loop never terminates.
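To illustrate the race, here is a minimal sketch in plain Python (no AWS calls). It assumes the marker is the timestamp of the newest event at lookup time and that only events strictly newer than the marker are shown. The helper `events_newer_than`, the stack name `my-stack` and the event timestamps are made up for illustration; only the 14:09:28 start time and the ~5 second deploy come from the report above.

```python
from datetime import datetime, timezone


def events_newer_than(events, marker):
    """Return the events strictly newer than `marker`, oldest first."""
    fresh = [e for e in events if e["Timestamp"] > marker]
    return sorted(fresh, key=lambda e: e["Timestamp"])


def ts(second):
    # All timestamps fall in the same minute as the throttled run above.
    return datetime(2022, 6, 20, 14, 9, second, tzinfo=timezone.utc)


# Hypothetical events for the small SNS stack, finished in about 5 seconds.
stack_events = [
    {"Timestamp": ts(29), "LogicalResourceId": "SomeTopic",
     "ResourceStatus": "CREATE_IN_PROGRESS"},
    {"Timestamp": ts(32), "LogicalResourceId": "SomeTopic",
     "ResourceStatus": "CREATE_COMPLETE"},
    {"Timestamp": ts(33), "LogicalResourceId": "my-stack",
     "ResourceStatus": "CREATE_COMPLETE"},
]

# The changeset was executed at 14:09:28.
changeset_executed_at = ts(28)

# A throttled "last event time" lookup only answers at 14:09:39, when the
# newest event is already the final one.
throttled_marker = max(e["Timestamp"] for e in stack_events)

missed = events_newer_than(stack_events, throttled_marker)
seen = events_newer_than(stack_events, changeset_executed_at)

print(len(missed), len(seen))
```

With the late marker nothing is ever newer than it, so a loop that waits for a terminal event has nothing to wait for.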
Solution
Instead of using get_last_event_time, we should rely on the time at which the changeset was executed (execute_changeset), so that no event is skipped in case of throttling or network latency.
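A rough sketch of what this could look like, not the actual SAM CLI code. `wait_for_stack_events`, `fetch_events` and `max_polls` are hypothetical names; the event dicts only mimic the fields of a DescribeStackEvents response (Timestamp, LogicalResourceId, ResourceType, ResourceStatus). The sketch ignores clock skew between the local machine and CloudFormation, and the poll cap is an extra safety net I am assuming here, not part of the proposal.

```python
import time
from datetime import datetime, timedelta, timezone

# Stack statuses after which no further events are expected.
TERMINAL_STATUSES = {
    "CREATE_COMPLETE", "CREATE_FAILED",
    "UPDATE_COMPLETE", "UPDATE_FAILED",
    "ROLLBACK_COMPLETE", "ROLLBACK_FAILED",
    "UPDATE_ROLLBACK_COMPLETE", "UPDATE_ROLLBACK_FAILED",
}


def wait_for_stack_events(fetch_events, stack_name, since,
                          poll_seconds=0.5, max_polls=600):
    """Collect events newer than `since` until the stack itself is terminal.

    `fetch_events` is any callable returning a list of event dicts; in real
    use it would wrap the DescribeStackEvents call.
    """
    marker = since
    seen = []
    for _ in range(max_polls):
        fresh = sorted(
            (e for e in fetch_events() if e["Timestamp"] > marker),
            key=lambda e: e["Timestamp"],
        )
        for event in fresh:
            seen.append(event)
            marker = event["Timestamp"]
            if (event["ResourceType"] == "AWS::CloudFormation::Stack"
                    and event["LogicalResourceId"] == stack_name
                    and event["ResourceStatus"] in TERMINAL_STATUSES):
                return seen
        time.sleep(poll_seconds)
    # Bounded polling: a stale marker fails loudly instead of looping forever.
    raise TimeoutError(f"no terminal event for {stack_name}")


# In real use this would be datetime.now(timezone.utc), captured just
# before the changeset is executed.
executed_at = datetime(2022, 6, 20, 14, 9, 28, tzinfo=timezone.utc)


def fake_fetch():
    # Stand-in for a stack that finished before the first poll got through.
    return [
        {"Timestamp": executed_at + timedelta(seconds=1),
         "LogicalResourceId": "SomeTopic",
         "ResourceType": "AWS::SNS::Topic",
         "ResourceStatus": "CREATE_IN_PROGRESS"},
        {"Timestamp": executed_at + timedelta(seconds=4),
         "LogicalResourceId": "SomeTopic",
         "ResourceType": "AWS::SNS::Topic",
         "ResourceStatus": "CREATE_COMPLETE"},
        {"Timestamp": executed_at + timedelta(seconds=5),
         "LogicalResourceId": "my-stack",
         "ResourceType": "AWS::CloudFormation::Stack",
         "ResourceStatus": "CREATE_COMPLETE"},
    ]


result = wait_for_stack_events(fake_fetch, "my-stack",
                               since=executed_at, poll_seconds=0)
print([e["ResourceStatus"] for e in result])
```

Because the marker is fixed before anything can happen on the stack, a slow or throttled first poll still sees every event, including the terminal one.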
Additional environment details (Ex: Windows, Mac, Amazon Linux etc)
sam --version: v1.52.0