Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Retry API calls that return a 500 error #2041

Merged
merged 1 commit into from
Jul 17, 2024

Conversation

hessjcg
Copy link
Collaborator

@hessjcg hessjcg commented Jul 15, 2024

This updates the retry logic for API calls made by the connector:

  • Attempts to get an authentication token from the Google auth API are always retried 5 times.
  • SQL Admin API requests are retried up to 5 times if the http response status code is a 5xx. Otherwise they are fatal on the first error.

This also implements the exponential backoff logic defined used in the go connector.
See: GoogleCloudPlatform/cloud-sql-go-connector#781

@hessjcg hessjcg requested a review from a team as a code owner July 15, 2024 18:54
@hessjcg hessjcg force-pushed the gh-2029-retry-api-error-500 branch from ebc9d0a to 5faf4ec Compare July 15, 2024 18:55
Copy link
Collaborator

@jackwotherspoon jackwotherspoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple nits, will let @enocom review for Java readability

should we make this a PR a fix:?

Comment on lines 41 to 44
* response with an error code in the 500 range.
*
* @param e the exception
* @return false if this is a http response with a 500 status code, otherwise true.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* response with an error code in the 500 range.
*
* @param e the exception
* @return false if this is a http response with a 500 status code, otherwise true.
* response with an error code in the 5xx range.
*
* @param e the exception
* @return false if this is a http response with a 5xx status code, otherwise true.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

*
* <p>The formula is: base * multi^(attempt + 1 + random)
*
* <p>With base = 200ms and multi = 1.1618, and random = [0.0, 1.0), the backoff values would fall
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: value of multi should be 1.618

Suggested change
* <p>With base = 200ms and multi = 1.1618, and random = [0.0, 1.0), the backoff values would fall
* <p>With base = 200ms and multi = 1.618, and random = [0.0, 1.0), the backoff values would fall

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

new ApiClientRetryingCallable<>(
() -> {
int attempt = counter.incrementAndGet();
if (attempt < 3) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason for choosing 3 instead of 5 here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to demonstrate that it returns after a successful attempt before exhausting the total number of retries.

I added another test that always fails to show that it stops at 5 attempts and then fails.

@enocom enocom changed the title chore: Retry API calls that return a 500 error. Fixes #2029. feat: Retry API calls that return a 500 error Jul 16, 2024
@enocom
Copy link
Member

enocom commented Jul 16, 2024

This should be a "feature" -- it will have a noticeable usability change.

// Only retry if the error is an HTTP response with a 5xx error code.
if (e instanceof HttpResponseException) {
HttpResponseException re = (HttpResponseException) e;
return re.getStatusCode() < 500 || re.getStatusCode() > 599;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would re.getStatusCode() > 599 apply? I think re.getStatusCode() < 500 will get the job done.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Collaborator Author

@hessjcg hessjcg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added a new test to ConnectorTest. This test the SqlAdmin client library actually responds correctly and the ApiClientRetryingCallable works as expected when the service intermittently responds with 502 errors.

Comment on lines 41 to 44
* response with an error code in the 500 range.
*
* @param e the exception
* @return false if this is a http response with a 500 status code, otherwise true.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines 41 to 44
* response with an error code in the 500 range.
*
* @param e the exception
* @return false if this is a http response with a 500 status code, otherwise true.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// Only retry if the error is an HTTP response with a 5xx error code.
if (e instanceof HttpResponseException) {
HttpResponseException re = (HttpResponseException) e;
return re.getStatusCode() < 500 || re.getStatusCode() > 599;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

*
* <p>The formula is: base * multi^(attempt + 1 + random)
*
* <p>With base = 200ms and multi = 1.1618, and random = [0.0, 1.0), the backoff values would fall
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

new ApiClientRetryingCallable<>(
() -> {
int attempt = counter.incrementAndGet();
if (attempt < 3) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to demonstrate that it returns after a successful attempt before exhausting the total number of retries.

I added another test that always fails to show that it stops at 5 attempts and then fails.

@hessjcg hessjcg force-pushed the gh-2029-retry-api-error-500 branch from 5faf4ec to 2e8c190 Compare July 17, 2024 16:50
@hessjcg hessjcg requested a review from jackwotherspoon July 17, 2024 16:50
@hessjcg hessjcg force-pushed the gh-2029-retry-api-error-500 branch from 2e8c190 to e1380fe Compare July 17, 2024 16:50
@hessjcg hessjcg enabled auto-merge (squash) July 17, 2024 16:52
@hessjcg hessjcg force-pushed the gh-2029-retry-api-error-500 branch from e1380fe to cf3d027 Compare July 17, 2024 18:09
@hessjcg hessjcg merged commit d76e892 into main Jul 17, 2024
17 checks passed
@hessjcg hessjcg deleted the gh-2029-retry-api-error-500 branch July 17, 2024 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants