data-subscriber fails when dataset has no real start date and end date #40

Closed
ifenty opened this issue Jan 27, 2022 · 3 comments
ifenty commented Jan 27, 2022

An ECCO dataset of ancillary files has no natural 'start date' or 'end date'. Users shouldn't be required to specify them in order to download.

$ podaac-data-subscriber -c ECCO_L4_ANCILLARY_DATA_V4R4  -d anc
WARN: No .update in the data directory. (Is this the first run?)
Downloaded: 0 files

Files Failed to download:0

CMR token successfully deleted

Oh, it also fails when I specify a start date and end date that span the entire ECCO period:

$ podaac-data-subscriber -c ECCO_L4_ANCILLARY_DATA_V4R4  -d anc -sd 1990-01-01T00:00:00Z -ed 2021-01-01T01:01:01Z
NOTE: .update found in the data directory. (The last run was at 2022-01-27T20:07:48Z.)
Downloaded: 0 files

Files Failed to download:0

CMR token successfully deleted

@mike-gangl
Contributor

This actually looks like a different bug: you seem to have an existing ".update" file in your ./anc directory, which the subscriber uses to record when it last retrieved data. If you delete the .update file in that ./anc directory and re-run with the start/end dates, do you get results? I think this goes back to using the same download directory for multiple runs (a totally valid use case that we didn't develop for).

I'd also need to investigate whether the suffix of those ancillary files is in the default suffix list for --extensions:

-e EXTENSIONS, --extensions EXTENSIONS
                       The extensions of products to download. Default is [.nc, .h5, .zip]
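
For readers unfamiliar with how such an option behaves, here is a minimal argparse sketch of an `-e/--extensions` flag with the default list quoted above. It is an illustration only, not the tool's actual source: the program name `subscriber-sketch` and the helper `resolve_extensions` are hypothetical, and the choice of `action="append"` is an assumption based on the additive `-e '.nc' -e '.h5'` form described in a commit message further down this page.

```python
import argparse

# Default list as quoted in the help text above.
DEFAULT_EXTENSIONS = [".nc", ".h5", ".zip"]


def build_parser() -> argparse.ArgumentParser:
    # "subscriber-sketch" is a hypothetical program name for this example.
    parser = argparse.ArgumentParser(prog="subscriber-sketch")
    parser.add_argument(
        "-e", "--extensions",
        dest="extensions",
        action="append",  # each -e occurrence adds one value to the list
        default=None,     # defaults are applied later, so user values never mix with them
        help="The extensions of products to download.",
    )
    return parser


def resolve_extensions(argv):
    """Return the user's extensions, or the default list if none were given."""
    args = build_parser().parse_args(argv)
    return args.extensions if args.extensions else list(DEFAULT_EXTENSIONS)
```

Under these assumptions, `resolve_extensions([])` yields the default list, while `resolve_extensions(["-e", ".tar.gz"])` yields only `[".tar.gz"]`.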

@ifenty
Author

ifenty commented Jan 27, 2022

Unfortunately not. Even after deleting the anc directory and re-running, I get the same results.

$ rm -fr anc*
$ podaac-data-subscriber -c ECCO_L4_ANCILLARY_DATA_V4R4  -d anc -sd 1990-01-01T00:00:00Z -ed 2021-01-01T01:01:01Z
NOTE: Making new data directory at anc(This is the first run.)
Downloaded: 0 files

Files Failed to download:0

CMR token successfully deleted

Here is the result with an entirely new directory, ancfoo:

$ podaac-data-subscriber -c ECCO_L4_ANCILLARY_DATA_V4R4  -d ancfoo -sd 1990-01-01T00:00:00Z -ed 2021-01-01T01:01:01Z
NOTE: Making new data directory at ancfoo(This is the first run.)
Downloaded: 0 files

Files Failed to download:0

CMR token successfully deleted

@mike-gangl
Contributor

Thanks. I took a quick look, and the files in that collection are of type 'tar.gz':

Granule Listing:
https://cmr.earthdata.nasa.gov/search/granules.json?echo_collection_id=C2096684707-POCLOUD

"href": "s3://podaac-ops-cumulus-protected/ECCO_L4_ANCILLARY_DATA_V4R4/ancillary_data_output_insitu_ECCO_V4r4.tar.gz"

We'll add 'tar.gz' as a default type for the --extensions parameter.
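
The diagnosis above can be illustrated with a small sketch: if granule links are filtered by file suffix, a `.tar.gz` href matches none of the defaults `.nc`, `.h5`, `.zip`, so nothing is selected for download. The function names below are hypothetical and the suffix-matching logic is an assumption about how such a filter might work, not the subscriber's real code.

```python
# Default list as quoted from the --extensions help text earlier in this thread.
DEFAULT_EXTENSIONS = (".nc", ".h5", ".zip")


def matches_extension(href, extensions=DEFAULT_EXTENSIONS):
    """True if the link ends with one of the allowed suffixes (case-insensitive)."""
    return href.lower().endswith(tuple(extensions))


def select_downloads(hrefs, extensions=DEFAULT_EXTENSIONS):
    """Keep only the links whose suffix is in the allowed list."""
    return [h for h in hrefs if matches_extension(h, extensions)]


# The href quoted in the granule listing above.
ANCILLARY_HREF = (
    "s3://podaac-ops-cumulus-protected/ECCO_L4_ANCILLARY_DATA_V4R4/"
    "ancillary_data_output_insitu_ECCO_V4r4.tar.gz"
)
```

With the defaults, `select_downloads([ANCILLARY_HREF])` returns an empty list, which is consistent with the "Downloaded: 0 files" output in the reports. Passing `DEFAULT_EXTENSIONS + (".tar.gz",)` as the second argument selects the file.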

mike-gangl added a commit that referenced this issue Feb 2, 2022
closes #39
closes #40
@mike-gangl mike-gangl added this to the 1.8.0 milestone Feb 2, 2022
@mike-gangl mike-gangl self-assigned this Feb 2, 2022
mike-gangl added a commit that referenced this issue Feb 2, 2022
* added pypi release badge to README

* began separating out functionality for data downloader

* 1.7.2 merge into develop (#43)

* Don't consume all arguments after --extensions

This behavior is now more like other utilities where specifying the flag
multiple times extends the value of the argument.  For example,
    -e '.nc .h5 .zip'
becomes
    -e '.nc' -e '.h5' -e '.zip'

This is less fragile for the user and possibly less confusing how the
argument should be formatted on the command line.

* Add ability to execute arbitrary commands on each downloaded file

I did it this way so each file could be compressed without hard-coding
the compression algorithm. But I could see this being used to run a
pre-processing script on each downloaded file.

* updated README and tests for additive -e examples

* force 'action'

* merged code for extensions, process call, and updated documentation

* fix for #28

* updated CHANGELOG

* Issue 33 (#35)

* closes #33
added 'files to download' to non-verbose output

* updated changelog

Co-authored-by: Joe Sapp <joe.sapp@noaa.gov>
Co-authored-by: mgangl <mike.gangl@gmail.com>

* updated changelog to reflect accurate versions released

* version bump for master delivery

Co-authored-by: Joe Sapp <joe.sapp@noaa.gov>
Co-authored-by: mgangl <mike.gangl@gmail.com>

* fixed tests

* removed erroneous printouts for pytest fixes

* fixed some bugs
closes #39
closes #40

* Update README.md

* updates for flake8

* fixed issue with time_offset variables, added manual tests

* flake8 updates for access, subscriber

* added downloader tests, flake8 updates for download client

Co-authored-by: Joe Sapp <joe.sapp@noaa.gov>
Co-authored-by: mgangl <mike.gangl@gmail.com>
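
The commit message above mentions the ability to execute an arbitrary command on each downloaded file, for example to compress it. A minimal sketch of that idea follows. The helper name `run_on_file` is hypothetical, and appending the file path as the last argument is an assumption for illustration, not a description of the tool's real behavior.

```python
import shlex
import subprocess
import sys


def run_on_file(command, path):
    """Run a user-supplied command with the downloaded file path appended.

    The command string is split shell-style but executed without a shell,
    so the file path is always passed as a single argument.
    Returns the command's exit code.
    """
    argv = shlex.split(command) + [str(path)]
    return subprocess.run(argv, check=False).returncode
```

For instance, `run_on_file("gzip", "anc/some_file.nc")` would compress one file if `gzip` is installed. A caller could log non-zero return codes instead of aborting the whole download run.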
mike-gangl added a commit that referenced this issue Feb 28, 2022
* Bulk download capability (#45)

* updated tests

* fixed path issue for built python command line tooling

* Ignore error if directory already exists (#47)

* Ignore error if directory already exists

Fixes #46

* issues/46: Ignore errors if destination directory already exists.

Co-authored-by: Frank Greguska <Francis.Greguska@jpl.nasa.gov>

* Issues/44 (#48)

* Added functionality to name .update file to .update__<COLLECTION> so that the same directory can be re-used.
Added tests to fix this functionality
closes #44

* flake8 fixes

* added updates to .update file

* Issues/41 (#50)

* added cycle based downloads

* updated changelog

* 180 docs (#51)

* added changelog info for 1.8.0, created initial markdown files

* added subscriber/downloader links

* added documentation

* Added downloader/subscriber specific docs.
fixed some issues with imports

* Update README.md

Co-authored-by: Joe Sapp <joe.sapp@noaa.gov>
Co-authored-by: mgangl <mike.gangl@gmail.com>
Co-authored-by: Frank Greguska <89428916+frankinspace@users.noreply.github.com>
Co-authored-by: Frank Greguska <Francis.Greguska@jpl.nasa.gov>
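
The commit message above describes naming the state file `.update__<COLLECTION>` so that one download directory can be shared by several collections, and ignoring the error when the destination directory already exists. A minimal sketch of both ideas follows. The function names are hypothetical, and the timestamp format is assumed from the "last run was at 2022-01-27T20:07:48Z" log line earlier in the thread.

```python
from datetime import datetime, timezone
from pathlib import Path


def update_file_path(data_dir, collection):
    """Per-collection state file, e.g. anc/.update__ECCO_L4_ANCILLARY_DATA_V4R4."""
    return Path(data_dir) / f".update__{collection}"


def write_update(data_dir, collection, when=None):
    """Record the time of this run for one collection and return the file path."""
    when = when or datetime.now(timezone.utc)
    path = update_file_path(data_dir, collection)
    # exist_ok=True: re-using an existing directory is not an error.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(when.strftime("%Y-%m-%dT%H:%M:%SZ"))
    return path


def read_update(data_dir, collection):
    """Return the recorded timestamp string, or None on a first run."""
    path = update_file_path(data_dir, collection)
    return path.read_text().strip() if path.exists() else None
```

Because the collection name is part of the file name, a previous run for one collection in the same directory no longer affects the "last run" time seen by another.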