Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 421 - option to use Earthdata User Acceptance Testing (UAT) system #426

Merged
merged 63 commits into from
May 10, 2024

Conversation

danielfromearth
Copy link
Collaborator

@danielfromearth danielfromearth commented Jan 11, 2024

GitHub Issue: #421

Description

This change:

  • enables the login and use of different Earthdata Login (EDL) and Common Metadata Repository (CMR) systems — that is, either Production (PROD) or User Acceptance Testing (UAT).
  • removes the Auth.get_user_profile() method and use of a client_id.

Local test steps

Passed new UAT unit test

Overview of integration done

Ran the following (with an ID for a TEMPO mission collection in UAT substituted for <UAT Collection ID>):

import earthaccess
auth = earthaccess.login(system=earthaccess.UAT)
results = earthaccess.search_data(concept_id=<UAT Collection ID>)
files = earthaccess.download(results, "./local_folder")

...and this generated an appropriate response, and successfully download the granule files.

Checklist

  • Unit tests added/updated and passing.
  • Integration testing
  • CHANGELOG.md updated
  • Documentation updated (if needed).

📚 Documentation preview 📚: https://earthaccess--426.org.readthedocs.build/en/426/

@danielfromearth danielfromearth linked an issue Jan 11, 2024 that may be closed by this pull request
@danielfromearth danielfromearth marked this pull request as draft January 11, 2024 20:02
@danielfromearth danielfromearth added the enhancement New feature or request label Feb 16, 2024
@mfisher87 mfisher87 changed the title Feature/issue 421 option for other environments Feature/issue 421 option for other environments (UAT, SIT) Feb 20, 2024
@mfisher87
Copy link
Collaborator

Daniel and I discussed integration testing strategy here. Since UAT tests could be more flaky than PROD tests, we decided we will write only unit tests for this PR (plus one sample integration test) and address the problem of flaky tests in another effort. I'll link the new issue to this one.

@danielfromearth
Copy link
Collaborator Author

And with regard to unit tests, we discussed creating a new module which holds all UAT unit tests in the test suite, instead of duplicating+modifying the existing tests within the existing modules.

danielfromearth and others added 2 commits March 5, 2024 14:01
Co-authored-by: Chuck Daniels <chuck@developmentseed.org>
Co-authored-by: Daniel Kaufman <danielfromearth@users.noreply.github.com>
@mfisher87
Copy link
Collaborator

@danielfromearth @chuckwondo sorry for the delay pushing this up. I just forgot 😊

@chuckwondo
Copy link
Collaborator

@chuckwondo

Further, I think we may want to drop SIT, and offer only PROD and UAT.

That's fine with me. I am only away of SIT use-cases currently.

@danielfromearth, just to confirm my suspicion about needing a separate client ID per system/env, did you attempt to test a scenario against UAT where a client ID would be required? I would think that such a test would fail if you were to supply the PROD client ID we currently have hard-coded.

On this branch, I successfully ran earthaccess's login(), search_data(), and download() for a collection in UAT. It found the granules and downloaded the files. So, perhaps this client ID is associated with both "systems"?

I'm not sure that's a valid test of the client ID, as I believe your unit tests simply mock responses, but please correct me if I'm wrong (I haven't looked at all of your relevant unit tests to confirm this). If that's the case, then you may need to make use of vcrpy to make (and record) real requests/responses to see if tests requiring a client ID fail when making a real request to UAT with what I suspect is a PROD-only client ID.

@chuckwondo
Copy link
Collaborator

@chuckwondo

Further, I think we may want to drop SIT, and offer only PROD and UAT.

That's fine with me. I am only away of SIT use-cases currently.

@danielfromearth, just to confirm my suspicion about needing a separate client ID per system/env, did you attempt to test a scenario against UAT where a client ID would be required? I would think that such a test would fail if you were to supply the PROD client ID we currently have hard-coded.

On this branch, I successfully ran earthaccess's login(), search_data(), and download() for a collection in UAT. It found the granules and downloaded the files. So, perhaps this client ID is associated with both "systems"?

I'm not sure that's a valid test of the client ID, as I believe your unit tests simply mock responses, but please correct me if I'm wrong (I haven't looked at all of your relevant unit tests to confirm this). If that's the case, then you may need to make use of vcrpy to make (and record) real requests/responses to see if tests requiring a client ID fail when making a real request to UAT with what I suspect is a PROD-only client ID.

Upon further inspection, I'm inclined to believe that we can completely eliminate the client ID. What makes me believe this is that, on the main branch, I ran all integration tests, and I got 9 failures, which I believe are to be expected when running locally (I recall this from a previous topic of discussion during a "hack day"). I then removed all uses of client ID throughout the entire codebase, reran the integration tests, and got identical results.

There are many tests that download files, and all succeeded without the client ID (other than the ones that also failed with the client ID).

I believe the only case for requiring a client ID is for a web app that requires redirection after login. For EDL to know where to redirect the user, you must supply a client ID and the registered app with that client ID must be configured with a redirect URL. This is not the case for earthaccess, so as far as I can tell, we should be able to completely drop the use of a client ID.

However, since I'm not completely confident in my line of thinking here (even though test results seem to point in this direction), I have written email to support@earthdata.nasa.gov to get a definitive answer for us.

@danielfromearth
Copy link
Collaborator Author

@chuckwondo

Further, I think we may want to drop SIT, and offer only PROD and UAT.

That's fine with me. I am only away of SIT use-cases currently.

@danielfromearth, just to confirm my suspicion about needing a separate client ID per system/env, did you attempt to test a scenario against UAT where a client ID would be required? I would think that such a test would fail if you were to supply the PROD client ID we currently have hard-coded.

On this branch, I successfully ran earthaccess's login(), search_data(), and download() for a collection in UAT. It found the granules and downloaded the files. So, perhaps this client ID is associated with both "systems"?

I'm not sure that's a valid test of the client ID, as I believe your unit tests simply mock responses, but please correct me if I'm wrong (I haven't looked at all of your relevant unit tests to confirm this). If that's the case, then you may need to make use of vcrpy to make (and record) real requests/responses to see if tests requiring a client ID fail when making a real request to UAT with what I suspect is a PROD-only client ID.

Right, the unit tests mock responses. What I ran for this test was not a unit test; instead, I manually ran earthaccess in a Python kernel to login and download files — for my case, TEMPO data granules — from UAT. And then manually confirmed that the downloaded files were the expected files.

@danielfromearth
Copy link
Collaborator Author

@mfisher87, @chuckwondo, would you be able to help address the new readthedocs failures that have cropped up after merging main?

@chuckwondo
Copy link
Collaborator

@mfisher87, @chuckwondo, would you be able to help address the new readthedocs failures that have cropped up after merging main?

@danielfromearth, take a look at the details of the failure: https://readthedocs.org/projects/earthaccess/builds/24322214/

You'll see this in the error output:

WARNING -  griffe: earthaccess/auth.py:251: No type or annotation for parameter 'system'
WARNING -  griffe: earthaccess/auth.py:251: Parameter 'system' does not appear in the function signature

Given that, here's what I see in the code in your branch that would cause that warning:

def get_session(self, bearer_token: bool = True) -> requests.Session:
"""Returns a new request session instance.
Parameters:
bearer_token: whether to include bearer token
system: optional EARTHDATA environment to use

Can you see the issue? In your docstring, you document a system parameter to the get_session method, but there is no such parameter for that method. Should there be, or should you remove the entry from the docstring?

@danielfromearth
Copy link
Collaborator Author

Thanks @chuckwondo for identifying the change needed! I was overthinking it, and didn't realize the other messages were simply "INFO" messages and not "warnings" 🤦

@danielfromearth

This comment was marked as outdated.

@danielfromearth
Copy link
Collaborator Author

danielfromearth commented May 8, 2024

Do we want to remove SIT?

@chuckwondo
Copy link
Collaborator

Do we want to remove SIT?

I suggest we do.

Upon further inspection, I'm inclined to believe that we can completely eliminate the client ID. What makes me believe this is that, on the main branch, I ran all integration tests, and I got 9 failures, which I believe are to be expected when running locally (I recall this from a previous topic of discussion during a "hack day"). I then removed all uses of client ID throughout the entire codebase, reran the integration tests, and got identical results.

There are many tests that download files, and all succeeded without the client ID (other than the ones that also failed with the client ID).

I believe the only case for requiring a client ID is for a web app that requires redirection after login. For EDL to know where to redirect the user, you must supply a client ID and the registered app with that client ID must be configured with a redirect URL. This is not the case for earthaccess, so as far as I can tell, we should be able to completely drop the use of a client ID.

However, since I'm not completely confident in my line of thinking here (even though test results seem to point in this direction), I have written email to support@earthdata.nasa.gov to get a definitive answer for us.

Further, I think we should wait for me to get a reply from support@earthdata.nasa.gov regarding the client ID. Ideally, my hunch is correct, and we can simply remove all uses of the client ID. I did get an automated reply, so hopefully I'll get an answer this week. Sorry to hold this up further, but I'm not comfortable with leaving the questions surrounding the client ID unanswered.

@mfisher87
Copy link
Collaborator

mfisher87 commented May 10, 2024

@chuckwondo could you also start a discussion on Slack about the client ID? I think Luis likely knows more but has limited time to catch up on this PR. I am fairly sure you're right that we don't need it. I vaguely remember discussions of using it for metrics purposes, but I think I'm confusing that with the user agent string.

Nevermind I see you've looped him in to the related issue :)

Copy link
Collaborator

@chuckwondo chuckwondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic! I have only a handful of very minor grammatical suggestions (mostly around consistent "system" wording), and a couple of very minor code tweaks.

CHANGELOG.md Outdated Show resolved Hide resolved
docs/howto/authenticate.md Outdated Show resolved Hide resolved
earthaccess/api.py Outdated Show resolved Hide resolved
earthaccess/api.py Outdated Show resolved Hide resolved
earthaccess/auth.py Outdated Show resolved Hide resolved
earthaccess/search.py Outdated Show resolved Hide resolved
earthaccess/search.py Outdated Show resolved Hide resolved
earthaccess/search.py Outdated Show resolved Hide resolved
earthaccess/search.py Outdated Show resolved Hide resolved
@danielfromearth danielfromearth self-assigned this May 10, 2024
@danielfromearth danielfromearth changed the title Issue 421 - option to use testing environments (UAT, SIT) Issue 421 - option to use Earthdata User Acceptance Testing (UAT) system May 10, 2024
Copy link
Collaborator

@chuckwondo chuckwondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done @danielfromearth!

@chuckwondo chuckwondo merged commit f79a005 into main May 10, 2024
17 checks passed
@chuckwondo chuckwondo deleted the feature/issue-421-option-for-other-environments branch May 10, 2024 16:53
@danielfromearth
Copy link
Collaborator Author

Thank you @chuckwondo for your close look at all of the changes in this PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Option to use UAT environment
4 participants