-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retries for Access Token Request #19
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,8 @@ | |
import datetime | ||
import hashlib | ||
import logging | ||
import time | ||
from requests.models import Response | ||
|
||
from django.conf import settings | ||
from django.contrib.auth.models import AbstractUser | ||
|
@@ -175,7 +177,7 @@ def oauth2_client_secret(self): | |
@property | ||
def user_api_url(self): | ||
return '{}/api/user/v1/'.format(self.lms_url_root.strip('/')) | ||
|
||
@property | ||
def access_token(self): | ||
""" Returns an access token for this site's service user. | ||
|
@@ -192,15 +194,31 @@ def access_token(self): | |
|
||
if not access_token: | ||
url = f'{self.oauth2_provider_url}/access_token' | ||
access_token, expiration_datetime = EdxRestApiClient.get_oauth_access_token( | ||
url, | ||
self.oauth2_client_id, | ||
self.oauth2_client_secret, | ||
token_type='jwt' | ||
) | ||
|
||
expires = (expiration_datetime - datetime.datetime.utcnow()).seconds | ||
cache.set(key, access_token, expires) | ||
retries = [15, 30, 45] # Retry delays in seconds | ||
attempt = 0 | ||
|
||
while attempt < len(retries) + 1: | ||
try: | ||
log.info(f'Feching access token for URL: {url}') | ||
access_token, expiration_datetime = EdxRestApiClient.get_oauth_access_token( | ||
url, | ||
self.oauth2_client_id, | ||
self.oauth2_client_secret, | ||
token_type='jwt' | ||
) | ||
expires = (expiration_datetime - datetime.datetime.utcnow()).seconds | ||
cache.set(key, access_token, expires) | ||
return access_token | ||
except Exception as e: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You should not be catching Exception here, for the following reasons:
The solution is to catch only the error that you are interested in (HTTPError, probably). |
||
log.info(f'Getting exception wile getting token {e}') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You should not be using log.info, but |
||
if isinstance(e.response, Response) and e.response.status_code == 403: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will crash if e does not have a |
||
attempt += 1 | ||
if attempt > len(retries): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the part that is the most problematic. We are facing a problem of rate limiting, and we are fixing it with retries? This will only increase the problem. If anything, we are guaranteed to face the problem again in the future, as the number of client increases. The "right" fix is to increase the rate limit value for that client. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. All suggestions are really good and we can address most of them. Thank you for pointing them out. Issue happens when jobs run in parallel and it reaches the rate limit. So we have decided to optimize the jobs; add token cache in jobs and improve the logic so the there should be less requests for token. Another suggestion was to create a universal token for worker which can be used for every site, @taimoor-ahmed-1 was looking into the feasibility of that. Once all jobs are optimized and shipped, we shouldn't be getting an issue |
||
raise e | ||
|
||
seconds = retries[attempt - 1] | ||
log.info(f'Retrying attempt no: {attempt} for access token request after seconds: {seconds} because of exceptoin {e}') | ||
time.sleep(seconds) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any |
||
|
||
return access_token | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kindly remove