Refactor away from the .update file #174
Comments
It's been a while since I worked on this, but I wanted to confirm: is this change only for the "downloader" tool, or is it for the
Reworked the ticket to something I think is more workable for subscriber specifically. Lmk your thoughts.
Issue
The current solution for maintaining the latest timestamp within a directory is via the .update hidden file. While this works, the solution is not portable or self-evident to users.
Solution
Refactor data-subscriber to use file metadata within the directory to determine the next start datetime to fetch from. This removes the need to maintain a .update file, which may disappear if the user copies the granules from one directory to another without noticing the hidden file. Potential issues arise if the user is using the directory for other work and adds files after subscriber runs, or if the user is subscribing to multiple granules into the same directory.
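A minimal sketch of the file-metadata idea, assuming the newest modification time among the files in the directory stands in for the last run. The function name `latest_file_time` is hypothetical and is not part of data-subscriber; using mtime specifically is also an assumption, since the ticket only says "file metadata".

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def latest_file_time(data_dir: str) -> Optional[datetime]:
    """Return the newest modification time (UTC) among regular,
    non-hidden files in data_dir, or None if there are none.

    Hypothetical helper: the result would be used as the next start
    datetime instead of the timestamp stored in the .update file.
    """
    mtimes = [
        p.stat().st_mtime
        for p in Path(data_dir).iterdir()
        # Skip subdirectories and hidden files such as .update itself.
        if p.is_file() and not p.name.startswith(".")
    ]
    if not mtimes:
        return None
    return datetime.fromtimestamp(max(mtimes), tz=timezone.utc)
```

Note that any unrelated file written to the directory after a run would move this timestamp forward, which is the first issue described above.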
An alternative solution may be to process granules in descending order of timestamp: any granule not already in the directory is downloaded, and once the subscriber hits a granule that does exist (implying that was the last stop point), it ends its execution. This skips the need to look at file metadata, which may change without the user knowing and may be inconsistent across filesystems. It would also enable support for subscribing to multiple datasets within the same directory.
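The descending-order alternative could be sketched roughly as follows. Everything here is an assumption for illustration: `download_until_known`, the `(timestamp, filename)` tuples, and the `fetch` callback are hypothetical and do not reflect data-subscriber's actual code.

```python
from datetime import datetime
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def download_until_known(
    granules: Iterable[Tuple[datetime, str]],
    data_dir: str,
    fetch: Callable[[str, Path], None],
) -> List[str]:
    """Download granules newest-first and stop at the first one
    that already exists in data_dir.

    granules: (timestamp, filename) pairs, in any order (hypothetical shape).
    fetch:    callback that writes the named granule to the given path.
    Returns the filenames that were downloaded, newest first.
    """
    target = Path(data_dir)
    downloaded: List[str] = []
    for _, name in sorted(granules, key=lambda g: g[0], reverse=True):
        dest = target / name
        if dest.exists():
            # An existing granule marks the previous stop point.
            break
        fetch(name, dest)
        downloaded.append(name)
    return downloaded
```

Because the stop condition is a filename match rather than a directory-wide timestamp, running this once per dataset against the same directory would keep each dataset's stop point separate.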