-
Notifications
You must be signed in to change notification settings - Fork 9.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kinesis concurrent CreateStream backoff #1339
Kinesis concurrent CreateStream backoff #1339
Conversation
Rebased. |
@radeksimko any updates on getting this merged? |
Bump. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Josh,
sorry for the delay in reviewing this and sorry for the inconvenience caused by this bug (missing retry logic).
In principle it looks good, I left you some comments related to conventions around retries.
We tend to leverage our retry helpers throughout most official provider codebases, esp. AWS.
This otherwise looks like a fairly small PR we can ship fairly quickly once those comments are addressed. 😄
btw. Thanks for taking the time to cobble the test. 👍
aws/resource_aws_kinesis_stream.go
Outdated
|
||
// How long to sleep if a limit-exceeded event happens | ||
const KINESIS_LIMIT_EXCEEDED_SLEEP = 10 * time.Second | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a nitpick, but these constants are only ever going to be used in this file in a single place, so we may as well just hard code those values and avoid the extra 5 LOC.
aws/resource_aws_kinesis_stream.go
Outdated
return err | ||
} | ||
attemptCount := 1 | ||
for attemptCount <= KINESIS_MAX_ATTEMPTS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of (re)building a complex retry logic, counting attempts and time elapsed I think we should use the helper from helper/resource
i.e.
err := resource.Retry(30 * time.Second, func() *resource.RetryableError {
_, err := conn.CreateStream(createOpts)
if isAWSErr(err, "LimitExceededException", "simultaneously be in CREATING or DELETING") {
return resource.RetryableError(err)
}
return resource.NonRetryableError(err)
})
so we could reduce ~20 LOC to (IMO) more readable ~7 lines.
aws/resource_aws_kinesis_stream.go
Outdated
// Non-AWS exception occurred, give up | ||
return fmt.Errorf("Error creating Kinesis stream: %s", err) | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we return
on errors above I don't think the else block is necessary in this context. It can be just below any retry logic, just assuming we'll only ever get to it if creation was successful -> less nesting => easier to read 🙂
Hey @radeksimko no worries, I know you are busy. Comments addressed, hopefully good to go. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
aws/resource_aws_kinesis_stream.go
Outdated
return err | ||
} | ||
return resource.NonRetryableError(err) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually do you mind adding an error check here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I'm happy to click the merge button 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@radeksimko done :)
See: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_CreateStream.html Similar to DynamoDB, only up to 5 kinesis streams are able to be in the create state simultaneously otherwise a LimitExceededException is thrown as can be seen below: ``` * aws_kinesis_stream.test_stream.6: [WARN] Error creating Kinesis Stream: "This request would have exceed the limit on the number of streams that can simultaneously be in CREATING or DELETING for the account XXXX. Limit: 5.", code: "LimitExceededException" * aws_kinesis_stream.test_stream[1]: 1 error(s) occurred ``` We create about 20 streams per environment (read: single Terraform run) and had been seeing these errors frequently. After speaking with AWS support, this is a hard limit (5) and they have no way of tuning per account.
I have spotted tests failing intermittently with the below error. Generally AWS API calls that get throttled, i.e. “Rate exceeded” throw a ThrottleException and are retried. It seems however that Kinesis has some discrepancies with other services in terms of what error is raised. Looks suspiciously related to: aws/aws-sdk-go#1376 ``` * aws_kinesis_stream.test_stream.13: Unable to create stream: LimitExceededException: Rate exceeded for stream terraform-kinesis-test-3218280527529094251-13 under account 673337093959. status code: 400, request id: cf693bc0-a1a7-3610-9d3c-0dec79efd2d3 * aws_kinesis_stream.test_stream[4]: 1 error(s) occurred: ```
@radeksimko Hopefully sorted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks! |
See: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_CreateStream.html
Similar to DynamoDB, only up to 5 kinesis streams are able to be in the
CREATING state simultaneously otherwise a LimitExceededException is
thrown as can be seen below:
We create about 20 streams per environment (read: single Terraform run)
and had been seeing these errors frequently, having to do several
terraform apply
to get a clean run. After speaking with AWS support,this is a hard limit (5) and they have no way of tuning per account.