Resume #17
Comments
Some more comments: What I’d really like is something like what ncftp has, which checks to see if I’ve already got a copy of each file in my download directory and only actually downloads if I don’t. (Or alternatively, I’m okay with overwriting identical existing files – it’s just electrons to me; you’re the ones paying for the big database and transfer cost!)
Related, we want to update the data subscriber to not re-download granules if the UMM-G revision date changes but the data content itself does not change. We can achieve this by looking at the checksum in CMR UMM-G. Keep in mind the checksum can be either md5 or sha512. https://podaac.jpl.nasa.gov/Tutorial-Discovering-Data-File-Checksums-for-Cloud-based-Data
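To illustrate the checksum comparison suggested above, here is a minimal sketch of computing a local file's checksum for either algorithm. The function name `file_checksum` and the accepted algorithm labels are assumptions made for this example; they are not taken from the subscriber's code or from the CMR schema.

```python
import hashlib

# Assumed spellings of the algorithm label; adjust to whatever the metadata actually uses.
_HASHLIB_NAMES = {"md5": "md5", "sha512": "sha512", "sha-512": "sha512"}


def file_checksum(path, algorithm, chunk_size=1024 * 1024):
    """Return the hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.new(_HASHLIB_NAMES[algorithm.strip().lower()])
    with open(path, "rb") as handle:
        # Read in chunks so large granules are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting digest can then be compared, case-insensitively, with the checksum value reported in the granule metadata.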
Rather than |
We also want a |
Prevents re-downloading files (e.g. in case the previous run failed because of other file failures). If the subscriber sees that a file already exists, it will also calculate the file checksum and see if it matches the checksum in CMR. If the checksum doesn't match, it will re-download. There is now a --force/-f option that will cause the subscriber to re-download even if the file exists and is up to date. Issue #17
* Change print statements to log statements
* Fix flake errors
* Add retry logic for 500 and 401 errors from CMR
* Subscriber check if file exists before downloading

  Prevents re-downloading files (e.g. in case the previous run failed because of other file failures). If the subscriber sees that a file already exists, it will also calculate the file checksum and see if it matches the checksum in CMR. If the checksum doesn't match, it will re-download. There is now a --force/-f option that will cause the subscriber to re-download even if the file exists and is up to date. Issue #17
* Issues/15 (#65)
* updated get_search to include verbose option, not entire 'args' option
* added search after functionality to podaac access; removed scroll from initial parameters
* updated changelog
* closes #15
* Update python-app.yml

  added netrc creation for future use of regression tests.
* Add checks for pre-existing files to downloader (#67)
* Check if file exists before download - downloader
* Update documentation

  Co-authored-by: Wilbert Veit <wilbert.e.veit@jpl.nasa.gov>
* Programmatic Regression Testing (#66)
* added programmatic regression testing. currently relies on a valid .netrc file, refactoring might be needed to manually add a user/password to the CMR/TEA downloads
* Update python-app.yml
* updated regression tests, readied 1.9.0 version
* added -f option test to downloader regression
* Update python-app.yml

Co-authored-by: Joe Sapp <joe.sapp@noaa.gov>
Co-authored-by: mgangl <mike.gangl@gmail.com>
Co-authored-by: Frank Greguska <Francis.Greguska@jpl.nasa.gov>
Co-authored-by: Wilbert Veit <wilbert.e.veit@jpl.nasa.gov>
Co-authored-by: Wilbert Veit <wilbertveit@rocketmail.com>
'resume' now works by comparing the checksums of existing files with the search results, so no re-download occurs.
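For readers following along, the decision described in this thread (skip a file that already exists with a matching checksum, unless a force option is given) could look roughly like the sketch below. This is not the subscriber's actual code: `needs_download` and its parameters are hypothetical, and the choice to re-download when no checksum is available is an assumption.

```python
import hashlib
import os


def needs_download(path, expected_checksum, algorithm="md5", force=False):
    """Return True if the file at `path` should be (re-)downloaded (hypothetical helper)."""
    if force:
        # Mirrors the idea of a force option: always re-download.
        return True
    if not os.path.isfile(path):
        return True
    if not expected_checksum:
        # Assumption: without a checksum to verify against, download again.
        return True
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    # Compare case-insensitively; a mismatch means the local copy is stale or corrupt.
    return digest.hexdigest().lower() != expected_checksum.strip().lower()
```

Here `algorithm` must be a name that `hashlib.new` accepts, such as `"md5"` or `"sha512"`.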
https://podaac.jpl.nasa.gov/forum/viewtopic.php?f=6&t=1418
Subscriber should be able to 'resume' during a download failure. Currently, if any of the downloads fail during a subscriber run, the subscriber "exits" without updating its last run, and the next time it runs, it will attempt to download all files from the previous, "failed" run, even if only one out of N files actually failed.
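As a rough illustration of the behaviour requested here, the sketch below keeps going when one file fails and reports which files succeeded and which did not, so that a later run only needs to retry the failures. The names `download_all` and `fetch` are hypothetical; `fetch` stands in for whatever function actually downloads a single URL.

```python
def download_all(urls, fetch):
    """Try every URL; return (succeeded, failed) instead of stopping at the first error."""
    succeeded, failed = [], []
    for url in urls:
        try:
            fetch(url)  # hypothetical callable that downloads one file
        except Exception as exc:
            # Record the failure and continue with the remaining files.
            failed.append((url, exc))
        else:
            succeeded.append(url)
    return succeeded, failed
```

A caller could then decide whether to update the last-run marker based on whether `failed` is empty.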