Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix] Retry on too many auth requests #355

Merged
merged 11 commits into from
Oct 3, 2024
Merged

Conversation

rauchy
Copy link
Contributor

@rauchy rauchy commented Oct 2, 2024

Changes

This PR addresses an issue encountered during authentication, where the server returns a 429 status code when fetching OIDC endpoints from a well-known location, particularly during frequent authentication requests. The update introduces a retry mechanism within the auth flow to mitigate the impact of 429 errors.

Modified ApiClient object construction to allow unauthenticated calls by including authentication function injection, where within the OIDC workflow act as no-op actions. DatabricksConfig would now instantiate an ApiClient instead of a standard CommonsHttpClient in these cases.
errors during a complete authentication workflow, especially under high concurrent request conditions.

Tests

  • Added unit tests to verify the retry functionality upon encountering a 429 response.
  • Included a manual test to simulate real-world scenarios, ensuring the retry mechanism effectively handles 429

@rauchy rauchy requested a review from mgyucht October 2, 2024 09:20
@rauchy rauchy force-pushed the rauchy/retry-on-too-many-requests branch from 5c68f5d to 7409ea2 Compare October 2, 2024 10:16
Comment on lines 110 to 115
public ApiClient(
DatabricksConfig config,
Timer timer,
Function<Void, Map<String, String>> authenticateFunc,
Function<Void, String> getHostFunc,
Function<Void, String> getAuthTypeFunc) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private ApiClient(Builder builder) {...}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 102 to 107
this(
config,
timer,
v -> config.authenticate(),
v -> config.getHost(),
v -> config.getAuthType());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this(new Builder().withConfig(config.resolve()).withTimer(timer))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

public ApiClient build() {
return new ApiClient(config, timer, authenticateFunc, getHostFunc, getAuthTypeFunc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return new ApiClient(this);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this(
config,
timer,
v -> config.authenticate(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config::authenticate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static class Builder {
private DatabricksConfig config;
private Timer timer;
private Function<Void, Map<String, String>> authenticateFunc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use Supplier for these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppliers look good for replacing the use of Function here, but some functions (authenticate as an example) are overloaded, and Supplier and the :: syntax can't deal with those. So if we choose to use Supplier here, some functions will still be stored under a Function variable, which kinda makes it even more complicated :/

String userAgent = String.format("%s auth/%s", UserAgent.asString(), config.getAuthType());
String authType = getAuthTypeFunc.apply(null);
String userAgent = UserAgent.asString();
if (authType != "") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If getAuthTypeFunc is null, we can jst check if that func is null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@mgyucht mgyucht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Few comments, otherwise we're on the right track.

private RetryStrategyPicker retryStrategyPicker;
private boolean isDebugHeaders;

public Builder withDatabricksConfig(DatabricksConfig config) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this could even be private, since it is only used in the default constructor.

@@ -202,6 +267,8 @@ private <I> Request prepareBaseRequest(String method, String path, I in)
} else if (InputStream.class.isAssignableFrom(in.getClass())) {
InputStream body = (InputStream) in;
return new Request(method, path, body);
} else if (in instanceof String) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: leave a comment that this assumes that the request is already serialized.

rateLimit = 15;
}
private ApiClient(Builder builder) {
this.timer = builder.timer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: let's use sane defaults when those values are not provided in the builder. Ideally ApiClient.Builder().build() would make a more or less working ApiClient. Here, we can use SystemTimer() if not specified.

return new ObjectMapper().readValue(resp.getBody(), OpenIDConnectEndpoints.class);
ApiClient apiClient =
new ApiClient.Builder()
.withDatabricksConfig(this)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to specify the config directly here?

.withTimer(new SystemTimer())
.withAuthenticateFunc(v -> new HashMap<String, String>())
.withGetHostFunc(v -> getHost())
.withGetAuthTypeFunc(v -> "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This we should probably leave out? Otherwise it is not null according to your check.

new ApiClient.Builder()
.withDatabricksConfig(this)
.withTimer(new SystemTimer())
.withAuthenticateFunc(v -> new HashMap<String, String>())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you follow my suggestion from above, we can leave this out.

Comment on lines 72 to 73
.withGetAuthTypeFunc(v -> "")
.withGetHostFunc(v -> "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, these can be removed if you take my suggestions

this.getAuthTypeFunc = builder.getAuthTypeFunc;
this.httpClient = builder.httpClient;
this.accountId = builder.accountId;
this.retryStrategyPicker = builder.retryStrategyPicker;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, can we use the default request-based retry strategy picker if none is provided?

@@ -0,0 +1,73 @@
// These are auto-generated tests for Unified Authentication
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not autogenerated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import java.util.concurrent.*;
import org.junit.jupiter.api.Test;

public class DatabricksAuthLoadTest implements GitHubUtils, ConfigResolving {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave a comment here about this test.

Also small suggestion: let's move this into a benchmark package. It'll be a new convention within Java SDK tests to keep things a bit more organized for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rauchy rauchy force-pushed the rauchy/retry-on-too-many-requests branch from e36b927 to 68ae507 Compare October 3, 2024 12:49
@rauchy
Copy link
Contributor Author

rauchy commented Oct 3, 2024

Great comments @mgyucht, thanks for the thorough review.

The null checks make sense. I eventually went with a more "tell, don't ask" approach by providing no-ops in the private ApiClient constructor. This way we could avoid null paranoia.

@rauchy rauchy requested a review from mgyucht October 3, 2024 12:52
@rauchy rauchy force-pushed the rauchy/retry-on-too-many-requests branch from c6c811c to 68ae507 Compare October 3, 2024 13:01
Copy link
Contributor

@mgyucht mgyucht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Make sure to run make fmt to fix the format check.

I will take a look at how Maven is running tests to see why this test is being executed. In the meantime, we should be able to use the Disabled annotation (https://junit.org/junit5/docs/5.2.0/api/org/junit/jupiter/api/Disabled.html) to disable the test explicitly.

@rauchy rauchy force-pushed the rauchy/retry-on-too-many-requests branch 9 times, most recently from a50b790 to 1731133 Compare October 3, 2024 15:12
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
…sponses when fetching OIDC endpoints

Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
…he time

Signed-off-by: Omer Lachish <rauchy@users.noreply.github.com>
@rauchy rauchy force-pushed the rauchy/retry-on-too-many-requests branch from 1731133 to 86f3f8b Compare October 3, 2024 15:19
@rauchy rauchy added this pull request to the merge queue Oct 3, 2024
Merged via the queue into main with commit ed4992d Oct 3, 2024
10 checks passed
@rauchy rauchy deleted the rauchy/retry-on-too-many-requests branch October 3, 2024 15:30
rauchy added a commit that referenced this pull request Oct 3, 2024
### Bug Fixes

 * Retry on too many auth requests ([#355](#355)).
@rauchy rauchy mentioned this pull request Oct 3, 2024
rauchy added a commit that referenced this pull request Oct 3, 2024
 * Retry on too many auth requests ([#355](#355)).
github-merge-queue bot pushed a commit that referenced this pull request Oct 3, 2024
### Bug Fixes

* Retry on too many auth requests
([#355](#355)).

Co-authored-by: Omer Lachish <rauchy@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants