Fix jump role manager AWS Organizations API retries
## Why?

When ADF bootstraps multiple accounts in response to changes in the AWS
Organizations hierarchy, the jump-role-manager could run into rate limits of
the AWS Organizations API.

## What?

This change ensures that the Lambda function retries more often, using the
exponential back-off and jitter built into boto3 and configured in the
Step Function retry logic.
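
The back-off-with-jitter idea described above can be sketched generically in Python. This is a minimal illustration under stated assumptions, not boto3's internal retry code: `ThrottledError`, `backoff_delay`, and `call_with_retries` are hypothetical names, and the `base` and `cap` values are chosen only for the example.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a rate-limit error from a remote API."""


def backoff_delay(attempt, base=1.0, cap=20.0, rng=random.random):
    """Return a full-jitter delay in seconds for a zero-based retry attempt."""
    # The ceiling doubles with each attempt until it reaches the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick a random point between 0 and the ceiling.
    return rng() * ceiling


def call_with_retries(func, max_retries=15, sleep=time.sleep):
    """Call func, retrying on ThrottledError up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ThrottledError:
            # Give up once the retry budget is spent.
            if attempt == max_retries:
                raise
            sleep(backoff_delay(attempt))
```

Randomizing the delay spreads retries from concurrent invocations over time, which is why jitter helps when many accounts are bootstrapped at once.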
sbkok committed Oct 24, 2024
1 parent 74329ab commit 43d7d05
Showing 2 changed files with 12 additions and 1 deletion.
8 changes: 7 additions & 1 deletion src/lambda_codebase/jump_role_manager/main.py
@@ -32,6 +32,7 @@

from aws_xray_sdk.core import patch_all
import boto3
+from botocore.config import Config
from botocore.exceptions import ClientError

# ADF imports
@@ -79,8 +80,13 @@
/ CHARS_PER_ACCOUNT_ID,
)

+BOTO_ORG_CONFIG = Config(
+    retries={
+        "max_attempts": 15,
+    },
+)
IAM_CLIENT = boto3.client("iam")
-ORGANIZATIONS_CLIENT = boto3.client("organizations")
+ORGANIZATIONS_CLIENT = boto3.client("organizations", config=BOTO_ORG_CONFIG)
TAGGING_CLIENT = boto3.client("resourcegroupstaggingapi")
CODEPIPELINE_CLIENT = boto3.client("codepipeline")

5 changes: 5 additions & 0 deletions src/template.yml
@@ -920,6 +920,11 @@ Resources:
"TimeoutSeconds": 300,
"Retry": [
{
+"ErrorEquals": ["States.TaskFailed"],
+"IntervalSeconds": 3,
+"BackoffRate": 2,
+"MaxAttempts": 10
+}, {
"ErrorEquals": [
"Lambda.Unknown",
"Lambda.ServiceException",
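
To get a feel for the new `States.TaskFailed` retrier, the wait before each retry can be worked out with a few lines of Python. This is a rough sketch that assumes the retrier sets no delay cap or jitter (neither is visible in this hunk); `retry_intervals` is a hypothetical helper, not part of ADF.

```python
def retry_intervals(interval_seconds, backoff_rate, max_attempts):
    """Return the wait before each retry, without jitter or a delay cap."""
    # The first retry waits interval_seconds; each later wait is multiplied
    # by backoff_rate.
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# Values taken from the new "States.TaskFailed" retrier in src/template.yml.
waits = retry_intervals(interval_seconds=3, backoff_rate=2, max_attempts=10)
print(waits)       # [3, 6, 12, 24, 48, 96, 192, 384, 768, 1536]
print(sum(waits))  # 3069
```

Under these assumptions, the ten retries add up to 3069 seconds of waiting (about 51 minutes), not counting the run time of the Lambda invocations themselves.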