Enable a "generic" download server #371

Open
dgarciabriseno opened this issue Apr 16, 2024 · 0 comments

To ingest new data sources, or to ingest local files for debugging, we have to create a dedicated Python file that defines where the source data comes from. It would be friendlier if these details could be passed as parameters to the downloader. That way, we could easily ingest local or remote data from test sources. We could also potentially replace the Python classes that define data sources with text configuration files.

For example, instead of needing to create a NewImageSource.py and update daemon.py to acknowledge it, I should be able to just do:

downloader.py -d generic -m localmove --path "my_test_jp2_folder"

downloader.py should then read the folder and apply the same logic as if it had read one of the server definitions. Server definitions must still be supported, since some of them (like iris) need to do some computation, but most just look at a list of known folders and get the jp2 files.
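As a rough sketch of the folder-reading step for a local path, assuming a hypothetical helper named `find_jp2_files` (it is not part of the existing downloader, and the real ingestion logic would run on its result):

```python
from pathlib import Path


def find_jp2_files(folder):
    """Return every .jp2 file under `folder`, sorted by path (hypothetical helper)."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    # rglob walks subfolders too, so nested date folders are picked up as well.
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".jp2"
    )
```

Calling `find_jp2_files("my_test_jp2_folder")` would give the list of files that the rest of the downloader could then treat the same way as files found through a server definition.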

Server definitions provide 4 features:

  1. URL to source data, whether local or remote
  2. List of folders to access, computed via a start and end date
  3. Format of the date within the jp2 file name
  4. Time to wait between checking for new images

I would say 1, 3, and 4 are fairly constant values that can live in a configuration file. Getting the list of folders to access (feature 2) typically just creates a list of days in the format "Year/month/day/" for each day in the given range. So a configuration file could provide a list of "known folder names" and the implementation could automatically prepend the "Year/month/day" to each one.
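To make the idea concrete, here is a minimal sketch of what such a configuration and the date expansion might look like. The names `GenericSourceConfig`, `date_folders`, and `folders_to_check`, the default values, and the example date format are all assumptions for illustration, not existing code:

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class GenericSourceConfig:
    """Hypothetical config mirroring the four features of a server definition."""
    uri: str                                            # 1. URL or local path to the data
    known_folders: list = field(default_factory=list)   # 2. names combined with each day
    date_format: str = "%Y_%m_%d__%H_%M_%S"             # 3. assumed example file-name format
    pause_seconds: int = 60                             # 4. assumed wait between checks


def date_folders(start, end):
    """Yield "Year/month/day/" for each day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day.strftime("%Y/%m/%d/")
        day += timedelta(days=1)


def folders_to_check(config, start, end):
    """Prepend each day's folder to each known folder name."""
    days = list(date_folders(start, end))
    if not config.known_folders:
        return days  # nothing to combine with, so check the date folders themselves
    return [f"{day}{name}" for day in days for name in config.known_folders]
```

With `known_folders=["folder_a"]` and a range of 2024-04-15 to 2024-04-16, `folders_to_check` returns `["2024/04/15/folder_a", "2024/04/16/folder_a"]`. The same fields could be loaded from a text file rather than written in Python.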

Iris is a special case, because it doesn't follow the year/month/day format, so the server definition queries the remote source to get a list of folders, and then queries the remote folders to get the list of jp2 files. Since this is a required use-case, we can't completely get rid of server definitions, but I think we can simplify most of them.
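One way the two cases could coexist is a generic default with an override for sources like Iris. This is only a sketch: the class names, the `compute_directories` method name, and the injected `list_folders` callable are hypothetical, and the remote query itself is left out.

```python
from datetime import timedelta


class GenericServer:
    """Hypothetical default: folders are derived purely from the date range."""

    def __init__(self, uri, known_folders):
        self.uri = uri
        self.known_folders = known_folders

    def compute_directories(self, start, end):
        days = (end - start).days + 1
        dates = [(start + timedelta(days=i)).strftime("%Y/%m/%d/") for i in range(days)]
        return [f"{self.uri}{d}{name}" for d in dates for name in self.known_folders]


class ListingServer(GenericServer):
    """Hypothetical Iris-like source: asks the source which folders exist."""

    def __init__(self, uri, list_folders):
        super().__init__(uri, known_folders=[])
        self.list_folders = list_folders  # callable: uri -> list of folder names

    def compute_directories(self, start, end):
        # The date range is ignored here for brevity; a real definition would
        # presumably filter the listing by it.
        return [f"{self.uri}{folder}" for folder in self.list_folders(self.uri)]
```

Most sources would use `GenericServer` driven by configuration, while a source that needs computation keeps its own class and only overrides the folder lookup.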

I'm writing this up just as an idea that we can think about.
