token refresh offset #12136
/azp run python - identity - ci
Azure Pipelines successfully started running 1 pipeline(s).
  def get_cached_access_token(self, scopes, query=None):
      # type: (Sequence[str], Optional[dict]) -> Optional[AccessToken]
      tokens = self._cache.find(TokenCache.CredentialType.ACCESS_TOKEN, target=list(scopes), query=query)
      for token in tokens:
          expires_on = int(token["expires_on"])
-         if expires_on - 300 > int(time.time()):
+         if expires_on - 30 > int(time.time()):
Should this be
-         if expires_on - 30 > int(time.time()):
+         if expires_on - self._token_refresh_timeout > int(time.time()):
or is there some rationale for always using 30 seconds?
It is not _token_refresh_timeout.
We don't have a clear design for this value, but it must be less than _token_refresh_offset (which defaults to 120). Otherwise it will hide the auto-refresh feature.
The old value, 300, does not meet that requirement, so I updated it to 30.
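The constraint described above can be checked with a minimal sketch. The function names and the exact comparisons are hypothetical stand-ins, not the SDK's code; only the values 300, 30, and 120 come from this thread. The idea: if the cache's margin is at least as large as the refresh offset, the cache stops returning a token before that token ever enters its refresh window, so the credential never sees a token it could refresh proactively.

```python
TOKEN_REFRESH_OFFSET = 120  # default refresh offset mentioned in this thread


def cache_returns(expires_on, margin, now):
    """Hypothetical cache check: return the token only if it outlives the margin."""
    return expires_on - margin > now


def should_refresh(expires_on, now, offset=TOKEN_REFRESH_OFFSET):
    """Hypothetical refresh check: the token is inside its refresh window."""
    return expires_on - now < offset


def refresh_window_reachable(margin, offset=TOKEN_REFRESH_OFFSET):
    """True if some remaining lifetime is both served by the cache and refresh-eligible."""
    now = 1000  # arbitrary fixed clock value for the check
    return any(
        cache_returns(now + remaining, margin, now)
        and should_refresh(now + remaining, now, offset)
        for remaining in range(1, offset + margin + 1)
    )
```

Under these assumptions, a margin of 300 leaves no lifetime in which a cached token is both returned and eligible for refresh, while a margin of 30 (or no margin at all) does.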
I wonder whether we need an explicit margin here. The 1s margin in if expires_on > int(time.time()) seems okay to me. My reasoning:
- functionally, this line served to hardcode token_refresh_offset=300
  - if all cached tokens would expire within 300 seconds, this method would return None, prompting the caller to acquire a new token
- token_refresh_offset will now be observed by callers of this method
  - when a caller enters its refresh window, it should begin trying to acquire a new token
  - while trying to acquire a new token, the caller should return any valid token it has
One bad outcome that could follow is the caller using a token that expires in flight. That request will fail, but the caller's other option was to raise without sending the request at all, because it couldn't acquire a new token. It seems better to try the request, which could after all succeed.
What do you think?
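The flow proposed in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `get_token`, its parameters, and the namedtuple standing in for `AccessToken` are all hypothetical, and the sketch returns the freshly acquired token when the refresh succeeds.

```python
import time
from collections import namedtuple

# Stand-in for the SDK's AccessToken type (assumption for this sketch).
AccessToken = namedtuple("AccessToken", ["token", "expires_on"])


def get_token(cached, request_token, refresh_offset=120, now=None):
    """Return a usable token, preferring a fresh one inside the refresh window.

    cached:        AccessToken from the cache, or None
    request_token: hypothetical callable that acquires a new AccessToken
    """
    now = int(time.time()) if now is None else now

    # No valid cached token: a new one is required, so errors propagate.
    if cached is None or cached.expires_on <= now:
        return request_token()

    # Inside the refresh window: try to refresh, but keep the valid token
    # as a fallback if the attempt fails.
    if cached.expires_on - now < refresh_offset:
        try:
            return request_token()
        except Exception:  # broad on purpose: any failure falls back
            pass

    return cached
```

With this shape, a request made with a token that expires in flight may still fail, but a failed refresh never prevents the request from being attempted.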
The difference arises when we are still within the token_refresh_retry_timeout window.
Extreme case: the user gets a token from us that expires in 1s. We are still within the token_refresh_retry_timeout window, so it does not get refreshed.
Versus:
They get None from us, which forces a refresh.
But if the credential is waiting on the retry timeout, it won't try to get a new token, regardless of what it gets back from the cache. Returning None in that case only guarantees the current request will fail, no?
No. If there is no valid token (it returns None), we will try to get one whether or not we are in the retry timeout window.
The retry timeout only applies when there is a valid token but it is within the refresh offset window.
Ah, I overlooked this behavior. Credentials should observe the retry timeout when the cache is empty.
I'm not sure this is the behavior we want. If we have no access_token and the first attempt to get one failed, do we really want to hold all requests for 30 seconds before attempting to get one? I think we need to clarify this more.
My opinion is that if no token is available, we should try to get one every time the user calls our library, without a cool-down period.
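One way to picture the behavior argued for in this thread (always request when no valid token exists; apply the retry timeout only to proactive refreshes inside the refresh window) is the sketch below. The `RefreshGate` class, its method names, and the 30-second retry timeout default are assumptions made for illustration, not the SDK's API.

```python
class RefreshGate:
    """Hypothetical helper deciding whether to request a new token."""

    def __init__(self, refresh_offset=120, retry_timeout=30):
        self.refresh_offset = refresh_offset
        self.retry_timeout = retry_timeout
        self._last_attempt = None  # time of the most recent token request

    def record_attempt(self, now):
        self._last_attempt = now

    def should_request(self, expires_on, now):
        """expires_on is the cached token's expiry, or None if nothing is cached."""
        # No valid token: always try, with no cool-down.
        if expires_on is None or expires_on <= now:
            return True
        # Valid token outside the refresh window: nothing to do.
        if expires_on - now >= self.refresh_offset:
            return False
        # Valid token inside the refresh window: honor the retry timeout.
        if self._last_attempt is not None and now - self._last_attempt < self.retry_timeout:
            return False
        return True
```

The alternative raised earlier in the thread, observing the retry timeout even when the cache is empty, would amount to moving the retry timeout check ahead of the first branch.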
Credentials also need to expose the refresh offset in their public API, for the authentication policy.
Do you mean BearerTokenCredentialPolicy? The refresh offset is only configurable in the constructor. I don't see BearerTokenCredentialPolicy calling credential constructors.
Yes, recall that
Yes. And given that we implemented the cache functionality in get_token, I don't see a requirement to update the default value of _need_new_token.
What if I want to refresh tokens 6 minutes before they expire?
This is a good point. Please open a separate issue to track it. We cannot fix both in the same PR because that one needs core changes and, per our rule, core changes cannot be combined with other changes in the same PR.
  def get_cached_token(self, scopes):
      # type: (Iterable[str]) -> Optional[AccessToken]
      tokens = self._cache.find(TokenCache.CredentialType.ACCESS_TOKEN, target=list(scopes))
      for token in tokens:
          expires_on = int(token["expires_on"])
-         if expires_on - 300 > int(time.time()):
+         if expires_on - 30 > int(time.time()):
Should this be using the constant you've defined?
As I commented on another instance of this, I'm not certain we need any margin here. We want credentials to return cached tokens as necessary so long as they're valid; doesn't that imply having the cache return a token right up until its expiry?
Margin removed.
    try:
        self._redeem_refresh_token(scopes, **kwargs)
    except Exception:  # pylint: disable=broad-except
        pass
Should we be logging refreshes which fail here? Is this already done in _redeem_refresh_token?
This is a good question.
I am leaning towards not logging it because:
- if there is a valid token available, the user will continue to use that one, so there is no need to log the failure.
- if there is no valid token, the user cannot get one and we will log that event (already implemented).
    if not token:
        token = self._client.obtain_token_by_client_certificate(scopes, self._certificate, **kwargs)
    elif self._client.should_refresh(token):
        try:
            self._client.obtain_token_by_client_certificate(scopes, self._certificate, **kwargs)
        except Exception:  # pylint: disable=broad-except
            pass
This logic:
    if not token:
        # get new token
    elif should_refresh:
        try:
            # get new token
        except Exception:
            # swallow
seems to be present in most if not all the credentials. Perhaps it could be moved into a base or mixin, with each implementation providing a callback or an override for the "# get new token" functionality?
Agreed. But different credentials have different ways to refresh/redeem tokens. So I have not found a clean way to do it.
What do you think of something like this:
    import abc

    class CredentialBase(abc.ABC):
        def __init__(self, **kwargs):
            self._client = AadClient(...)

        def _get_token_impl(self, *scopes, **kwargs):
            if not scopes:
                raise ValueError('"get_token" requires at least one scope')

            token = self._client.get_cached_access_token(scopes)
            if not token:
                # nothing cached: a new token is required, so errors propagate
                token = self._request_token(*scopes, **kwargs)
            elif self._client.should_refresh(token):
                try:
                    # best-effort refresh; the cached token is still valid
                    self._request_token(*scopes, **kwargs)
                except Exception:  # pylint:disable=broad-except
                    pass
            return token

        @abc.abstractmethod
        def _request_token(self, *scopes, **kwargs):
            pass

    class Credential(CredentialBase):
        def get_token(self, *scopes, **kwargs):
            """relevant user-facing docstring"""
            return self._get_token_impl(*scopes, **kwargs)

        def _request_token(self, *scopes, **kwargs):
            """get a new token according to this credential's personal idiom"""
            ...
Do you mean we make a shared credential base?
I would like to move that into a separate issue/PR as a code refactoring.
Refactoring always has a lower priority than new features. Merging this code is an open-ended commitment to maintaining it as is, so it's worth investigating a better organization now. The one I sketched may have its own problems (e.g. multiple inheritance would require some care) but it seems workable. What do you think? Have you tried something similar already?
I think that when we refactor by adding a shared class for all credentials, we can go further than only this. But I don't want to rush it right before a release.