In conventional DL training interfaces, the data loader is typically a generator object, where iterating over it returns batches of data to pass to the model.
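As a minimal sketch of that convention (plain Python, with a list standing in for a real dataset), the loader is simply an iterable whose iterations yield batches:

```python
# Minimal illustration of the conventional data-loader pattern:
# the loader is iterable, and each iteration yields one batch.
def batch_loader(data, batch_size):
    """Yield successive batches from `data` (a plain list stands in
    for a real dataset here)."""
    for i in range(0, len(data), batch_size):
        yield data[i : i + batch_size]

batches = list(batch_loader(list(range(10)), batch_size=4))
# e.g. three batches: [0..3], [4..7], and a final partial batch [8, 9]
```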
We could make the `TaskLoader` a generator to adhere to this convention. However, my main concern is that the `TaskLoader.__call__` method is enormously flexible. This reflects the flexibility of NPs as probabilistic models that can take any data as context and any data as target, so there are many ways you might want to sample your raw data to generate `Task`s for training. This raises the question of how `next(task_loader)` should sample the xarray/pandas dataset objects to produce the context and target data for the `Task`s if the user has not specified this explicitly: which date should be sliced, and which sampling strategy should be used for the context/target data?
One option would be to set `TaskLoader` attributes, like a list of `train_dates` to loop over when generating `Task`s, plus additional information on the `context_sampling` and `target_sampling` strategies. Alternatively, `context_sampling`, `target_sampling`, and other `TaskLoader.__call__` kwargs could be passed at generation time.
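A hypothetical sketch of the first option, where the sampling options are fixed as attributes at construction time so iteration needs no further arguments (the class, the `__call__` signature, and the return value are illustrative stand-ins, not the real `TaskLoader` API):

```python
class IterableTaskLoader:
    """Hypothetical sketch: sampling options are stored as attributes
    so that `next(task_loader)` needs no further arguments."""

    def __init__(self, train_dates, context_sampling, target_sampling):
        self.train_dates = train_dates
        self.context_sampling = context_sampling
        self.target_sampling = target_sampling

    def __call__(self, date, context_sampling, target_sampling):
        # Stand-in for the real TaskLoader.__call__, which would slice the
        # xarray/pandas data at `date` and apply the sampling strategies.
        return {"date": date, "context": context_sampling, "target": target_sampling}

    def __iter__(self):
        # Iteration implicitly reuses the stored sampling defaults,
        # producing one Task per training date.
        for date in self.train_dates:
            yield self(date, self.context_sampling, self.target_sampling)

loader = IterableTaskLoader(["2020-01-01", "2020-01-02"],
                            context_sampling="all", target_sampling=100)
tasks = list(loader)  # one Task-like dict per training date
```

The hidden-state trade-off is visible here: the sampling strategy is baked in at construction, so changing it mid-training means mutating attributes rather than passing arguments.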
IMO it is safer to have the user explicitly pass and control these sampling options by directly calling the `TaskLoader.__call__` method to generate batches of `Task` objects for training. However, if there is a clear benefit to being able to loop over a `TaskLoader`, and a clean way to implement it, then it is worth considering. I'm open to discussion on this.
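For contrast, the explicit alternative might look like the following sketch, where the user drives the loop and the sampling options appear at every call site (`fake_task_loader` and the `generate_tasks` helper are hypothetical; the kwarg names follow the ones discussed above):

```python
def fake_task_loader(date, context_sampling, target_sampling):
    # Stand-in for TaskLoader.__call__ (signature is illustrative only).
    return {"date": date, "context": context_sampling, "target": target_sampling}

def generate_tasks(task_loader, dates, context_sampling, target_sampling):
    """User-driven loop: the sampling options are explicit on every call,
    so nothing about the strategy is hidden inside the loader."""
    return [
        task_loader(date, context_sampling=context_sampling,
                    target_sampling=target_sampling)
        for date in dates
    ]

tasks = generate_tasks(fake_task_loader, ["2020-01-01", "2020-01-02"],
                       context_sampling="all", target_sampling=100)
```

Nothing stops a user from varying `context_sampling` between calls here, which is exactly the flexibility that a fixed iteration protocol would have to pin down.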
cc @jonas-scholz123