-
Notifications
You must be signed in to change notification settings - Fork 161
feat(csharp/test/Drivers/Databricks): Support token refresh to extend connection lifetime #3177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(csharp/test/Drivers/Databricks): Support token refresh to extend connection lifetime #3177
Conversation
…connection lifetime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! The change looks fine overall, but I had a lot of trouble following the logic in DatabricksConnection.CreateHttpHandler and I wonder whether there might be some small changes that would make it easier to understand.
Feel free to push back if you prefer to check in as-is.
| baseHandler = new OAuthDelegatingHandler(baseHandler, tokenProvider); | ||
| } | ||
|
|
||
| // Add token exchange handler if token renewal is enabled and the auth type is OAuth access token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mutually exclusive with the if at line 340, yes? The former is for client_credentials while this is for a user token? I'm wondering whether there might be some way of making this more obvious in the code, because _httpClient is being initialized in both but differently and it looked a bit tricky to reason about. Perhaps this second if could instead be an else if?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will make that change
| Properties.TryGetValue(DatabricksParameters.OAuthScope, out string? scope); | ||
|
|
||
| HttpClient OauthHttpClient = new HttpClient(HiveServer2TlsImpl.NewHttpClientHandler(TlsOptions, _proxyConfigurator)); | ||
| if (_httpClient == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this method be called correctly more than once, and if so, then under what circumstances? If it can't be, then there's no need for the null check here. It could be replaced e.g. by a Debug.Assert(_httpClient == null). This would make the method a little easier to reason about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it should only be called once per connection.
https://github.com/apache/arrow-adbc/blob/main/csharp/src/Drivers/Databricks/DatabricksDatabase.cs#L43
Specifically, it's called during OpenAsync, which is called right after initializing the DatabricksConnection
I added the change you suggested
| HttpClient OauthHttpClient = new HttpClient(HiveServer2TlsImpl.NewHttpClientHandler(TlsOptions, _proxyConfigurator)); | ||
| if (_httpClient == null) | ||
| { | ||
| _httpClient = new HttpClient(HiveServer2TlsImpl.NewHttpClientHandler(TlsOptions, _proxyConfigurator)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this was already creating a handler before, but should the credential-related handlers be adding the tracing-related functionality and/or the retry functionality? Right now, tracing is only added to the primary (data) handler and not to either of the credential handlers and retry is being added only to the token exchange path. The code is complex enough that it's not clear whether this is deliberate or accidental.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me, I added the tracing and retry functionality to both auth HTTP clients
|
Please give a chance for @jadewang-db to take a look before merging, thanks |
| /// <summary> | ||
| /// Utility class for decoding JWT tokens and extracting claims. | ||
| /// </summary> | ||
| internal static class JwtTokenDecoder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no builtin jwt token decoder in dotnet, what about System.IdentityModel.Tokens.Jwt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
System.IdentityModel.Tokens.Jwt is not built-in, we need to add it as an external dependency. I remember last time when we added K4os.Compression.LZ4 dependency it took a long time to get security approval for builds, so not sure if we should add another dependency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, defer to @CurtHagenlocher on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have hand-rolled this as well.
| { | ||
| // Add retry handler for 503 responses | ||
| baseHandler = new RetryHttpHandler(baseHandler, TemporarilyUnavailableRetryTimeout); | ||
| baseAuthHandler = new RetryHttpHandler(baseAuthHandler, TemporarilyUnavailableRetryTimeout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently retryhttphandler only handle 503, is it enough? should we retry on timed out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yes we should retry on other codes, 502, 504, 408. I will create a different retry delegating handler for Auth operations, since it seems different enough from the retry logic of thrift operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also we can have a follow up PR to make current retry handler richer, I don't see they are conflict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, these are the differences:
Auth retry HTTP handler:
- needs to retry on 502, 503, 504, 408
- needs to use exponential backoff for retries
Thrift retry HTTP handler:
- only retries on 503
- relies on the returned Retry-After header to determine how long to wait
I think we can unify in a future PR so that for both auth and Thrift requests:
- retries on 502, 503, 504, 408
- if Retry-After header is set, use that to determine how long to wait
- otherwise, use exponential backoff
…er status codes (#3186) Follow up of #3177 (comment) ## Proposed Changes - Added support for additional HTTP status codes: 408 (Request Timeout), 502 (Bad Gateway), and 504 (Gateway Timeout), in addition to the existing 503 (Service Unavailable) - Implemented exponential backoff with jitter when no Retry-After header is present ## Testing Unit tests `dotnet test --filter "FullyQualifiedName~RetryHttpHandlerTest"` ``` [xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v3.1.1+bf6400fd51 (64-bit .NET 8.0.7) [xUnit.net 00:00:00.06] Discovering: Apache.Arrow.Adbc.Tests.Drivers.Databricks [xUnit.net 00:00:00.15] Discovered: Apache.Arrow.Adbc.Tests.Drivers.Databricks [xUnit.net 00:00:00.16] Starting: Apache.Arrow.Adbc.Tests.Drivers.Databricks [xUnit.net 00:00:25.28] Finished: Apache.Arrow.Adbc.Tests.Drivers.Databricks Apache.Arrow.Adbc.Tests.Drivers.Databricks test net8.0 succeeded (26.2s) Test summary: total: 14, failed: 0, succeeded: 14, skipped: 0, duration: 26.2s Build succeeded in 30.3s ```
Motivation
In scenarios like PowerBI dataset refresh, if a query runs longer than the OAuth token's expiration time (typically 1 hour for AAD tokens), the connection fails. PowerBI only refreshes access tokens if they have less than 20 minutes of expiration time and never updates tokens after a connection is opened.
This PR implements token refresh functionality in the Databricks ADBC driver using the Databricks token exchange API. When an OAuth token is about to expire within a configurable time limit, the driver automatically exchanges it for a new token with a longer expiration time.
Key Components
Changes
adbc.databricks.token_renew_limitto control when token renewal happensTesting