
Cumulative fixes and updates #97

Merged
merged 11 commits into from
Apr 11, 2023

Conversation

@valpamp (Contributor) commented Apr 3, 2023

Drawing on old issues and PRs, and doing some new research, I:

  • Removed decommissioned datasets
  • Explicitly added Landsat 9
  • Pulled the implementation of partial downloads
  • Fixed download ids

Both the command line and the Python bindings work for all the datasets listed in the README. I also tested the partial download feature implemented by @alexVyth, which works like a charm and is very handy. The most important updates are detailed in commit f31685c

Some references to decommissioned Collection 1 products may still be present in the code (and ideally should be cleaned up), but they should not hinder the functionality of the module.
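The download-id fix described above (trying several candidate dataset ids until one succeeds) can be sketched roughly as follows. All names here are hypothetical illustrations, not the actual landsatxplore code:

```python
# Illustrative sketch of the "try each known dataset id until one works"
# strategy; function and id names are made up for this example.

def download_with_fallback(scene_id, candidate_ids, try_download):
    """Try each candidate dataset id in turn; return the first success."""
    last_error = None
    for count, dataset_id in enumerate(candidate_ids, start=1):
        try:
            return try_download(scene_id, dataset_id)
        except RuntimeError as err:
            last_error = err
            if count < len(candidate_ids):
                print(f"Download failed with dataset id {count} of "
                      f"{len(candidate_ids)}. Re-trying with the next one.")
    raise RuntimeError(f"All dataset ids failed for {scene_id}") from last_error

def fake_download(scene_id, dataset_id):
    # Simulated downloader: only the second candidate id succeeds.
    if dataset_id != "id-b":
        raise RuntimeError("bad dataset id")
    return f"{scene_id}.tar"

result = download_with_fallback(
    "LC80160532013230LGN01", ["id-a", "id-b"], fake_download
)
```

The per-attempt message mirrors the warning discussed later in the review thread; a quieter variant would defer any message until every candidate id has failed.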

Alexandros Vythoulkas and others added 11 commits March 20, 2021 15:28
Add download resume/skip on partly completed files
Avoid guessing decommissioned Landsat 5 Collection 1 dataset when parsing Landsat 5 IDs by forcing collection = "c2" in if statement
Removed unsupported products (Landsat 5 Collection 1 and Sentinel 2A) and removed deprecated __ncforminfo from __get_tokens and login functions.
…ound multiple dataset ids

This update is related to upstream issues yannforget#45, yannforget#92 and possibly others.
First of all, I removed all Collection 1 datasets and the Sentinel-2a dataset, which were decommissioned in the final weeks of 2022.
Then I stumbled into the same issues pointed out by @jacquesmoati and @HanDuwol, which were related to the dataset ids being hardcoded into earthexplorer.py rather than extracted from the response to some of the queries. Unfortunately, in order to get the correct dataset ids for each scene, it is not sufficient to have Earth Explorer credentials: it is also necessary to have access to the API (which can be requested here: https://ers.cr.usgs.gov/profile/access). This is because the download-option function needed to obtain the dataset id, as shown by @HanDuwol in yannforget#92 (comment), is only available to approved users.
Since it would be pointless to restrict this Python library to approved users, I implemented a workaround similar to the one proposed by @fkroeber in yannforget#45 (comment), but in a slightly different way that I believe is more compact and elegant.
Re-added the demo gif, since the landsatxplore.exe is still working, and removed references to decommissioned datasets.
Prepared README for pull request
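The first commit above adds resume/skip behaviour for partly completed files. A minimal sketch of that decision logic, assuming a server that reports the file size and accepts HTTP Range requests (names are illustrative, not the actual implementation):

```python
# Illustrative sketch of resume/skip handling for partly downloaded files.
import os

def resume_plan(path, remote_size):
    """Decide how to proceed for a possibly partial local file.

    Returns ("skip", None) if the file is already complete,
    ("resume", headers) to request only the missing tail via HTTP Range,
    or ("start", {}) for a fresh download.
    """
    local_size = os.path.getsize(path) if os.path.exists(path) else 0
    if local_size == remote_size:
        return "skip", None
    if 0 < local_size < remote_size:
        # Ask the server for the missing bytes only.
        return "resume", {"Range": f"bytes={local_size}-"}
    return "start", {}
```

The returned headers would then be passed to the HTTP client, and the response body appended to the existing file instead of overwriting it.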
@neteler (Collaborator) commented Apr 5, 2023

Thanks @valpamp, this is really great. I just patched my pip-installed landsatxplore version and can finally download Landsat data again!

For now I have only done shallow testing, but this looks very promising (using the i.landsat.download addon in GRASS GIS):

i.landsat.download settings=$HOME/.usgs   dataset=landsat_ot_c2_l2 clouds=20   start='2012-11-01' end='2013-12-31'   sort=acquisition_date,cloud_cover output=~/tmp/costarica_landsat8
2 scenes found.
Downloading scene <LC80160532013230LGN01> ...
Download failed with dataset id 1 of 2. Re-trying with the next one.
100%|██████....███████████| 1.11G/1.11G [01:03<00:00, 18.8MB/s]
Downloading scene <LC80160532013358LGN01> ...
Download failed with dataset id 1 of 2. Re-trying with the next one.
100%|██████....███████████| 969M/969M [00:55<00:00, 18.4MB/s]

ls -la ~/tmp/landsat8/
total 2159380
drwxr-xr-x   2 mneteler mneteler        110 Apr  5 22:58 .
drwxrwxr-x. 51 mneteler mneteler      16384 Apr  5 22:54 ..
-rw-r--r--   1 mneteler mneteler 1195051008 Apr  5 22:57 LC08_L1TP_016053_20130818_20200912_02_T1.tar
-rw-r--r--   1 mneteler mneteler 1016128512 Apr  5 22:58 LC08_L1TP_016053_20131224_20200912_02_T1.tar

Great work!

@valpamp (Contributor Author) commented Apr 6, 2023

Thanks a lot @neteler, it's good to see someone besides me and my colleagues putting this to good use. Unfortunately, it looks like this repo is abandoned, so I don't think these fixes will ever be merged into the official pip package.

@neteler (Collaborator) commented Apr 6, 2023

We'll try to convince @yannforget to merge this PR :)

Comment on lines +200 to +201
if id_count+1 < id_num:
print('Download failed with dataset id {:d} of {:d}. Re-trying with the next one.'.format(id_count+1, id_num))
@neteler (Collaborator) commented:

Only one thing: this warning is somewhat confusing (to me):


In my test (#97 (comment)) it shows up even though all scenes seem to be found and downloaded.

It seems to appear with every download, not sure why... Maybe an index issue?

@valpamp (Contributor Author) commented:

This is related to the API policy, which restricts access to the download-option function to users who have explicitly requested API access. The details are in commit f31685c. If this function were accessible without special permissions, one could always call it and obtain the correct dataset id there. Unfortunately this is not the case, so to avoid forcing all users to request API access, the compromise was to hard-code all known dataset ids into the library and try them one by one until one works.

I understand that printing those statements may be confusing; it would probably be better to silently try all dataset ids and, only if every one of them fails, raise an error or warning telling the user that the dataset id list needs to be updated by the maintainer.
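The quieter behaviour proposed here could be sketched as follows (hypothetical names, not the library's actual code): try every known dataset id without per-attempt messages, and raise a single error only if all of them fail.

```python
# Illustrative sketch: try all candidate dataset ids silently, and only
# surface an error when every one of them has failed.

def download_trying_all_ids(scene_id, candidate_ids, try_download):
    errors = []
    for dataset_id in candidate_ids:
        try:
            return try_download(scene_id, dataset_id)
        except RuntimeError as err:
            errors.append((dataset_id, err))
    raise RuntimeError(
        f"No known dataset id worked for {scene_id}; the hard-coded id list "
        f"may need updating by the maintainer ({len(errors)} ids tried)."
    )
```

Compared with the merged version, a successful download produces no output at all, and the user only sees a message in the genuinely actionable case.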

@yannforget yannforget merged commit fbdebff into yannforget:master Apr 11, 2023
@yannforget (Owner) commented:

Hi everyone and thank you very much for this pull request! I tried it and everything seems to work on my side.

I don't have the time to maintain the library anymore - as you noticed, keeping up with the changes on Earth Explorer is a lot of work. I pushed the updated version to PyPI though (v0.15).

I also added you as collaborators so that you can release and merge PRs yourself if needed - please don't hesitate to refuse the invite if you are not interested.

I updated the GitHub Actions as they weren't functional anymore (and temporarily disabled the tests in the CI). Please note that the PyPI credentials are stored as secrets in the repo and that the deploy action is automatically triggered with each GitHub release (you can also just edit the version number in pyproject.toml and trigger the action manually). If at some point you would like to transfer ownership on PyPI, don't hesitate to contact me 😃

Sorry again for the delay!

@valpamp (Contributor Author) commented Apr 13, 2023

Hi @yannforget, thank you so much for this. I would like to help maintain the repo, but as an aerospace engineer I have zero training in procedures such as proper software testing or managing PyPI packages. Could you be so kind as to point out some resources that could get me started?
