Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(add-bearer-token-header) add authorization header for Bearer token #88

Merged
merged 1 commit into from
May 6, 2021

Conversation

plooploops
Copy link
Contributor

@plooploops plooploops commented Apr 24, 2021

Add retry logic for refreshing tokens for Gen3Auth.

New Features

Breaking Changes

Bug Fixes

  • Include retry logic for caching refreshed access tokens.

Improvements

  • Added notes for Python set up and manual testing in devTest.md
  • Additional unit tests for Gen3Auth / Gen3Submission

Dependency updates

Deployment changes

Copy link
Contributor

@Avantol13 Avantol13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello! Thanks for the contribution, excited to work with you on this.

There should have been a pull request template to fill out, do you know if that showed up when you initially created the PR? Either way, can you change your description and follow this template:

<!--
Make sure to follow the DEV guidelines (https://gen3.org/resources/developer/dev-introduction/) before asking for review.

- Describe what this pull request does.
- Add short descriptive bullet points for each section if relevant. Keep in mind that they will be parsed automatically to generate official release notes.
- Test manually.
- Maintain or increase the test coverage (if relevant).
- Update the documentation, or justify if not needed.

-->

### New Features


### Breaking Changes


### Bug Fixes


### Improvements


### Dependency updates


### Deployment changes
<!-- This section should only contain important things devops should know when updating service versions. -->

README.md Outdated
@@ -34,3 +34,59 @@ To get the latest released version of the SDK:
This SDK exposes a Command Line Interface (CLI). You can now import functions from `gen3` into your own Python scripts or you can use the command line interface:

`gen3 --help`


## Local Development
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a separate development page that covers most of this. https://github.com/uc-cdis/gen3sdk-python/blob/master/docs/howto/devTest.md if there are pieces missing, feel free to modify that page instead

Copy link
Contributor Author

@plooploops plooploops Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, moved these over.

@@ -198,8 +199,22 @@ def submit_record(self, program, project, json):

"""
api_url = "{}/api/v0/submission/{}/{}".format(self._endpoint, program, project)
output = requests.put(api_url, auth=self._auth_provider, json=json)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the auth provider should already handle the population of the authorization header behind the scenes, are you sure this wasn't working before you added this code? https://github.com/uc-cdis/gen3sdk-python/blob/feat/merge-refactor/gen3/auth.py#L214-L227 If there's some bug with this logic, we should find a way to correct the Auth class, not handle the headers here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In using the existing version, it looks like we get the following from using a submit_record call:

  File "/sheepdog/sheepdog/blueprint/routes/views/__init__.py", line 59, in get_programs
    auth.validate_request(aud={"openid"}, purpose=None)
  File "/usr/local/lib/python3.6/site-packages/authutils/token/validate.py", line 236, in validate_request
    raise JWTError("no authorization header provided")
authutils.errors.JWTError: [401] - Authentication Error: no authorization header provided

However, in trying to replicate the 401's, the error doesn't occur with Sheepdog every time.

In looking into the Auth class, it looks like there's a couple of spots that may be interesting:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made an update to include retry logic around writing to files for refresh_token.

Looks like the calls should include the header; this could also be the case of using an older version of the lib.

@@ -514,11 +517,12 @@ def index_object_manifest(
commons_url = commons_url.strip("/")
# if running locally, indexd is deployed by itself without a location relative
# to the commons
if "http://localhost" in commons_url:
service_location = ""
if "http://localhost" in commons_url and include_service_location:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you describe the behavior you're expecting here and/or why the change (was this not working with a local instance)? There's similar logic elsewhere that also might be useful reference: https://github.com/uc-cdis/gen3sdk-python/blob/feat/merge-refactor/gen3/metadata.py#L55-L62

Copy link
Contributor Author

@plooploops plooploops Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was breaking in the unit tests under test_manifests.py

For example, if include_service_location is set to True, the unit test would include the following extra /index in the test:

Detail 404 Client Error: no record found for url: http://localhost:8001/index/index/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, so I'd rather keep this similar to the link I supplied where the variable is not a boolean, but the actual string path of the service relative to the domain. These won't change often, but it's probably better to leave that option open instead of hard-coding /index without a way to change it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated logic.

tests/test_submission.py Outdated Show resolved Hide resolved
@plooploops plooploops force-pushed the fix/add-bearer-token-header branch 6 times, most recently from 56360cd to 98080f3 Compare April 28, 2021 22:55
@plooploops plooploops requested a review from Avantol13 April 28, 2021 23:58
You can set up a Python development environment with a virtual environment:

```bash
python3 -m venv py38
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while everything should work for python ^3.6, I'd prefer to generalize specific versions to just 3 where possible. So python3 -m venv py3 instead of 38

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated based on comments.

@@ -514,11 +517,12 @@ def index_object_manifest(
commons_url = commons_url.strip("/")
# if running locally, indexd is deployed by itself without a location relative
# to the commons
if "http://localhost" in commons_url:
service_location = ""
if "http://localhost" in commons_url and include_service_location:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, so I'd rather keep this similar to the link I supplied where the variable is not a boolean, but the actual string path of the service relative to the domain. These won't change often, but it's probably better to leave that option open instead of hard-coding /index without a way to change it

@@ -38,6 +38,85 @@ def test_token_cache():
assert cache_file == expected


def test_refresh_access_token(mock_gen3_auth):
with patch("gen3.auth.get_access_token_with_key") as mock_access_token:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not super consistent about this, but I'd prefer that each test have a docstring with a quick sentence about what it's testing. """Make sure that access token ends up in header when refresh is called""" or something like that. same for other tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Included in the added tests.

gen3/auth.py Outdated
@@ -267,18 +267,37 @@ def refresh_access_token(self):
cache_file = token_cache_file(
self._refresh_token and self._refresh_token["api_key"] or self._wts_idp
)

sleep_amount_seconds = 1
retry_total = 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we actually already have a backoff library with configurable settings I'd rather reuse than implement custom retry logic (example: https://github.com/uc-cdis/gen3sdk-python/blob/feat/merge-refactor/gen3/jobs.py#L110) using https://github.com/litl/backoff. You'll have to change the logic a bit because it retries on an Exception, so instead of _write_to_file being a boolean you could just raise exceptions and the backoff library/cfg will retry the whole function if you decorate it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Included using the backoff library.

@plooploops plooploops force-pushed the fix/add-bearer-token-header branch 4 times, most recently from 099130d to 8834b97 Compare April 30, 2021 02:01
@@ -520,6 +521,9 @@ def index_object_manifest(
if not commons_url.endswith(service_location):
commons_url += "/" + service_location

print(commons_url)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print(commons_url)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

@plooploops plooploops force-pushed the fix/add-bearer-token-header branch from 8834b97 to c5c4328 Compare May 4, 2021 00:17
```

There are various ways to select a subset of python unit-tests - see: https://stackoverflow.com/questions/36456920/is-there-a-way-to-specify-which-pytest-tests-to-run-from-a-file

### Manual Testing

You can also set up credentials to submit metadata to your data commons. This assumes that you can get API access by downloading your [credentials.json](https://gen3.org/resources/user/using-api/#credentials-to-send-api-requests).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metadata is sometimes a loaded term, I would prefer to explicitly say something like "submit data to the graph in your data commons".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, updated!

@plooploops plooploops force-pushed the fix/add-bearer-token-header branch from c5c4328 to dd55435 Compare May 4, 2021 20:54
@xritter1
Copy link
Contributor

xritter1 commented May 5, 2021

Hello @plooploops, could you take a look at the output of the Wool test before we can merge this PR? Thanks!


`Black output:
--- gen3/auth.py
+++ blackened
@@ -255,11 +255,11 @@
_response.request = newreq

     return _response

 def refresh_access_token(self):
  •    """ Get a new access token """
    
  •    """Get a new access token"""
       if self._use_wts:
           self._access_token = get_access_token_from_wts(
               self._wts_namespace, self._wts_idp
           )
       else:
    

@@ -296,11 +296,11 @@
logging.warning("failed to write token cache file: " + cache_file)
logging.warning(str(e))
raise e

 def get_access_token(self):
  •    """ Get the access token - auto refresh if within 5 minutes of expiration """
    
  •    """Get the access token - auto refresh if within 5 minutes of expiration"""
       if not self._access_token:
           cache_file = token_cache_file(
               self._refresh_token and self._refresh_token["api_key"] or self._wts_idp
           )
           if os.path.isfile(cache_file):
    

{'message': 'Resource not accessible by integration', 'documentation_url': 'https://docs.github.com/rest/reference/issues#create-an-issue-comment'}`

@plooploops plooploops force-pushed the fix/add-bearer-token-header branch from dd55435 to 375304e Compare May 5, 2021 21:44
@xritter1 xritter1 merged commit 9bbfa7b into uc-cdis:master May 6, 2021
@plooploops
Copy link
Contributor Author

Thanks @Avantol13 and @xritter1!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants