Commit c94d386

fix(toolkit): CLI tool fails on CloudFormation Throttling
The CDK (particularly `cdk deploy`) might crash after getting throttled by CloudFormation, once the default configured 6 retries have been exhausted.

This changes the retry configuration of the CloudFormation client (and only that one) to allow up to 10 retries with a back-off base of 1 second. This makes the maximum total back-off about 17 minutes, which I hope is plenty even for the 1 TPM calls. This should allow heavily parallel deployments in the same account and region to avoid getting killed by a throttle, but it will reduce the responsiveness of the progress UI.

Additionally, this configures a custom logger for the SDK, which logs the SDK calls to the console when running in debug mode, giving users more information for troubleshooting.

Fixes #5637
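The figure of about 17 minutes can be checked with a little arithmetic. The sketch below assumes the SDK's default delay before each retry is `random() * base * 2^retryCount`, with `retryCount` starting at 0, and takes the worst case where the random factor is 1. The helper names `maxDelayMs` and `maxTotalBackoffMs` are hypothetical and are not part of the SDK or the toolkit.

```typescript
// Upper bound on the delay before one retry, ASSUMING exponential back-off
// with jitter: delay = random() * base * 2^retryCount (retryCount from 0).
// The bound corresponds to random() approaching 1.
function maxDelayMs(retryCount: number, baseMs: number): number {
  return baseMs * Math.pow(2, retryCount);
}

// Worst-case total time spent waiting across all retries.
function maxTotalBackoffMs(maxRetries: number, baseMs: number): number {
  let total = 0;
  for (let retryCount = 0; retryCount < maxRetries; retryCount++) {
    total += maxDelayMs(retryCount, baseMs);
  }
  return total;
}

// CloudFormation policy from this commit: 10 retries, base of 1 second.
const cloudFormationWorstCaseMs = maxTotalBackoffMs(10, 1000); // 1023000 ms, about 17 minutes

// Policy kept for every other client: 6 retries, base of 300 ms.
const defaultWorstCaseMs = maxTotalBackoffMs(6, 300); // 18900 ms, about 19 seconds
```

Under that assumption the longest single wait is the last one (up to 512 seconds), and the expected total is roughly half the worst case because of the jitter.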
1 parent 6166a70 commit c94d386

1 file changed: +15 -10 lines changed

packages/aws-cdk/lib/api/aws-auth/sdk.ts
@@ -42,29 +42,34 @@ export class SDK implements ISDK {
   private readonly config: ConfigurationOptions;
 
   /**
-   * Default retry options for SDK clients
-   *
-   * Biggest bottleneck is CloudFormation, with a 1tps call rate. We want to be
-   * a little more tenacious than the defaults, and with a little more breathing
-   * room between calls (defaults are {retries=3, base=100}).
+   * Default retry options for SDK clients.
+   */
+  private readonly retryOptions = { maxRetries: 6, retryDelayOptions: { base: 300 } };
+
+  /**
+   * The more generous retry policy for CloudFormation, which has a 1 TPM limit on certain APIs,
+   * which are abundantly used for deployment tracking, ...
    *
-   * I've left this running in a tight loop for an hour and the throttle errors
-   * haven't escaped the retry mechanism.
+   * So we're allowing way more retries, but waiting a bit more.
    */
-  private readonly retryOptions = { maxRetries: 6, retryDelayOptions: { base: 300 }};
+  private readonly cloudFormationRetryOptions = { maxRetries: 10, retryDelayOptions: { base: 1_000 } };
 
   constructor(private readonly credentials: AWS.Credentials, region: string, httpOptions: ConfigurationOptions = {}) {
     this.config = {
       ...httpOptions,
       ...this.retryOptions,
       credentials,
       region,
+      logger: { log: (...messages) => messages.forEach(m => debug('%s', m)) },
     };
     this.currentRegion = region;
   }
 
   public cloudFormation(): AWS.CloudFormation {
-    return wrapServiceErrorHandling(new AWS.CloudFormation(this.config));
+    return wrapServiceErrorHandling(new AWS.CloudFormation({
+      ...this.config,
+      ...this.cloudFormationRetryOptions,
+    }));
   }
 
   public ec2(): AWS.EC2 {
@@ -212,4 +217,4 @@ function allChainedExceptionMessages(e: Error | undefined) {
     e = (e as any).originalError;
   }
   return ret.join(': ');
-}
+}
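
The change relies on object spread order: properties spread later override earlier ones, so the CloudFormation client gets the more generous retry policy while every other client keeps the shared one. The following is a minimal standalone sketch of that merge and of the logger forwarding. The interfaces, the `debug` stand-in, and the `us-east-1` region are assumptions made for illustration; the real code uses the SDK's `ConfigurationOptions` and the toolkit's own `debug` helper.

```typescript
// Hypothetical stand-in types; the real ones come from the AWS SDK.
interface RetryConfig {
  maxRetries: number;
  retryDelayOptions: { base: number };
}
interface ClientConfig extends RetryConfig {
  region: string;
  logger: { log: (...messages: unknown[]) => void };
}

// Stand-in for the toolkit's debug() helper: it records lines instead of
// printing them, so the forwarding can be inspected.
const debugLines: string[] = [];
function debug(format: string, ...args: unknown[]): void {
  debugLines.push(format.replace('%s', () => String(args[0])));
}

const retryOptions: RetryConfig = { maxRetries: 6, retryDelayOptions: { base: 300 } };
const cloudFormationRetryOptions: RetryConfig = { maxRetries: 10, retryDelayOptions: { base: 1000 } };

// Shared configuration, built once (placeholder region).
const sharedConfig: ClientConfig = {
  ...retryOptions,
  region: 'us-east-1',
  // Each message passed by the SDK becomes one debug line.
  logger: { log: (...messages) => messages.forEach(m => debug('%s', m)) },
};

// Later spreads win: the retry settings are replaced, everything else is kept.
// The spread is shallow, so retryDelayOptions is swapped as a whole object.
const cloudFormationConfig: ClientConfig = { ...sharedConfig, ...cloudFormationRetryOptions };

cloudFormationConfig.logger.log('first call', 'second call');
```

Because the override is applied on a fresh object inside `cloudFormation()`, the shared configuration is never mutated, which is what keeps the change scoped to that one client.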
