To ingest new data sources, or to ingest local files for debugging, we currently have to create a dedicated Python file that defines where the source data comes from. It would be friendlier if this could be passed as parameters to the downloader. That way, we could easily ingest local or remote data from test sources. We could also potentially replace the Python classes that define data sources with text configuration files.
For example, instead of needing to create a NewImageSource.py and update daemon.py to acknowledge it, I should be able to just do:
And the downloader.py should be able to read the folder and perform all the same logic as if it read one of the server definitions. Server definitions must still be supported, since some of them (like iris) need to do some computation, but most just look at a list of known folders and get the jp2 files.
Server definitions provide 4 features:
1. URL to source data, whether local or remote
2. List of folders to access, computed via a start and end date
3. Format of the date within the jp2 file name
4. Time to wait between checking for new images
I would say 1, 3, and 4 are fairly constant values that can live in a configuration file. Getting the list of folders to access (feature 2) typically just creates a list of days in the format "Year/month/day/" for each day in the given range. So a configuration file could provide a list of "known folder names" and the implementation could automatically prepend the "Year/month/day/" to each of them.
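To make the idea concrete, here is a minimal sketch of what a configuration-driven source might look like. Everything in it is an assumption for illustration: the `SourceConfig` fields, the JSON layout, the zero-padded `YYYY/MM/DD` prefix, and the helper names are hypothetical and not taken from the existing downloader.

```python
import json
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class SourceConfig:
    """Hypothetical config covering features 1, 3, and 4, plus known folder names."""
    base_url: str               # feature 1: local or remote location of the data
    known_folders: List[str]    # folder names found under each day
    filename_date_format: str   # feature 3: date format inside the jp2 file name
    poll_seconds: int           # feature 4: wait time between checks


def load_config(text: str) -> SourceConfig:
    """Parse a JSON document into a SourceConfig (JSON is an assumed format)."""
    return SourceConfig(**json.loads(text))


def folders_for_range(cfg: SourceConfig, start: date, end: date) -> List[str]:
    """Feature 2: prepend 'YYYY/MM/DD/' to every known folder for each day, inclusive."""
    folders = []
    day = start
    while day <= end:
        prefix = day.strftime("%Y/%m/%d")
        for name in cfg.known_folders:
            folders.append(f"{prefix}/{name}")
        day += timedelta(days=1)
    return folders
```

With something like this, a test source would only need a small text file, and the date-range expansion would be shared by every source that follows the "Year/month/day/" layout.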
Iris is a special case, because it doesn't follow the year/month/day format, so the server definition queries the remote source to get a list of folders, and then queries the remote folders to get the list of jp2 files. Since this is a required use-case, we can't completely get rid of server definitions, but I think we can simplify most of them.
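One possible way to keep both approaches side by side is to treat folder listing as a pluggable function: sources that need computation register their own lister, and everything else falls back to the date-based default. This is only a sketch under assumed names (`FolderLister`, `daily_lister`, `resolve_lister`, and the `name` and `known_folders` config keys are all hypothetical), and the custom lister shown in the usage note is a stand-in, not the real iris logic.

```python
from datetime import date, timedelta
from typing import Callable, Dict, List

# Hypothetical signature: given a start and end date, return the folders to scan.
FolderLister = Callable[[date, date], List[str]]


def daily_lister(known_folders: List[str]) -> FolderLister:
    """Default behaviour: 'YYYY/MM/DD/<folder>' for each day in the inclusive range."""
    def lister(start: date, end: date) -> List[str]:
        days = (end - start).days + 1  # non-positive when end < start, giving no folders
        return [
            f"{(start + timedelta(days=i)).strftime('%Y/%m/%d')}/{name}"
            for i in range(days)
            for name in known_folders
        ]
    return lister


def resolve_lister(config: dict, custom_listers: Dict[str, FolderLister]) -> FolderLister:
    """Prefer a registered custom lister for this source, else use the default."""
    name = config["name"]
    if name in custom_listers:
        return custom_listers[name]
    return daily_lister(config["known_folders"])
```

A source with special needs would register a function under its name in `custom_listers`, for example one that queries the remote server for its folder list, while plain sources would only supply `known_folders`.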
I'm writing this up just as an idea that we can think about.