Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduced timeouts for MSAL calls. #2562

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Conversation

machavan
Copy link
Contributor

@machavan machavan commented Dec 12, 2024

Description:

Calls to MSAL authentication library done from the Java driver code use future.get(), potentially leading to hangs in case the library does not receive timely response from the server. All these calls done as part of getFedAuthToken need to have timeouts imposed. The timeout period is calculated based on remaining time wrt login timeout, and passed to future.get(, ) calls with this fix.

Tests:

  • Existing tests pass
  • Added new tests

Copy link

codecov bot commented Dec 12, 2024

Codecov Report

Attention: Patch coverage is 16.27907% with 36 lines in your changes missing coverage. Please review.

Project coverage is 51.14%. Comparing base (6829848) to head (9051847).

Files with missing lines Patch % Lines
...microsoft/sqlserver/jdbc/SQLServerMSAL4JUtils.java 15.78% 31 Missing and 1 partial ⚠️
.../microsoft/sqlserver/jdbc/SQLServerConnection.java 25.00% 3 Missing ⚠️
...osoft/sqlserver/jdbc/SQLServerSecurityUtility.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2562      +/-   ##
============================================
+ Coverage     51.03%   51.14%   +0.10%     
- Complexity     3919     3927       +8     
============================================
  Files           147      147              
  Lines         33456    33478      +22     
  Branches       5604     5609       +5     
============================================
+ Hits          17074    17122      +48     
+ Misses        13971    13943      -28     
- Partials       2411     2413       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@lilgreenbird lilgreenbird added this to the 12.10.0 milestone Dec 16, 2024
@@ -6110,10 +6110,11 @@ private SqlAuthenticationToken getFedAuthToken(SqlFedAuthInfo fedAuthInfo) throw
}

while (true) {
int millisecondsRemaining = timerRemaining(timerExpire);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is 0 or negative does it make sense to try the calls below but not throw timed out exception here (or from timerRemaining call)? Or just let the first calls throw the exception?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's min value of 1ms, but the question is still valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, we are just letting the first call throw it

- Replaced lock with tryLock to avoid potential long waiting for other
  threads while one thread is taking long to complete.
try {
//Just try to acquire the lock and if can't then proceed to attempt to get the token
lockAcquired = lock.tryLock(Math.min(millisecondsRemaining, TOKEN_LOCK_WAIT_DURATION_MS), TimeUnit.MILLISECONDS);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are we using lockAcquired? We should go ahead only if it returns true(lock acquired), right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to attempt acquiring the lock with a timeout of 5 seconds. If the lock cannot be acquired within this time, proceed to attempt token acquisition regardless of whether the earlier thread(s) have released the lock or not

lockAcquired is used to check in the finally block if the lock should be released or not.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if it is ok to run 2 threads in the block why even attempt to check lock? It is either ok to run 2 or more threads or it is not. If ok (and the lack of checking the result of tryLock shows it is ok) then do not do any locking just run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose is to optimize the token acquisition process, the first caller succeeding does caching which is then leveraged by subsequent threads. However, if the first thread takes considerable time, then we want the others to also go and try after waiting for a while.

If we were to let say 30 threads try in parallel, they would all miss the cache and hit the AAD auth endpoints to get their tokens at the same time, stressing the auth endpoint.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood the intent, but would Semaphore be a more readable solution?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. The code has been updated along with detailed comment describing the intent.

- Added detailed comment for the usage of semaphore.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Under Peer Review
Development

Successfully merging this pull request may close these issues.

5 participants