Cycle option for StreamingDataLoader #524

Open
Aceticia opened this issue Mar 24, 2025 · 1 comment
Labels
enhancement (New feature or request), waiting on author (Waiting for user input or feedback)

Comments

@Aceticia

🚀 Feature

A function or argument in StreamingDataLoader that cycles the passed-in StreamingDataset, restarting iteration whenever the dataset is exhausted.

Motivation

Many training scenarios in CV involve training a model for multiple epochs while controlling the exact number of training steps, independent of the underlying dataset size. For example, given a CombinedStreamingDataset of some length, iteration should restart once the dataset is exhausted.

Pitch

I'm not quite sure how this should be done. Maybe the __iter__ method of StreamingDataLoader could catch the end of iteration and restart it?

Aceticia added the enhancement label on Mar 24, 2025
@tchaton
Collaborator

tchaton commented Mar 26, 2025

You could look at the cycling loaders in PyTorch Lightning (CombinedLoader): https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.utilities.combined_loader.html

Or create your own wrapper that iterates for a given number of steps.
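
A minimal sketch of such a wrapper follows. It is not part of litdata or Lightning: the class name StepLimitedLoader and the num_steps parameter are hypothetical, and the sketch assumes that calling iter() on the wrapped loader again starts a fresh pass. It does not deal with checkpointing or resuming the wrapped loader's state.

```python
from typing import Any, Iterable, Iterator


class StepLimitedLoader:
    """Hypothetical wrapper (not a litdata API): yields exactly `num_steps`
    batches, restarting the wrapped loader each time it is exhausted."""

    def __init__(self, loader: Iterable[Any], num_steps: int) -> None:
        if num_steps < 0:
            raise ValueError("num_steps must be non-negative")
        self.loader = loader
        self.num_steps = num_steps

    def __len__(self) -> int:
        # The length is the step budget, not the size of the dataset.
        return self.num_steps

    def __iter__(self) -> Iterator[Any]:
        steps = 0
        while steps < self.num_steps:
            yielded = False
            # Assumption: re-entering this loop calls iter() on the loader,
            # which begins a new pass over the underlying dataset.
            for batch in self.loader:
                yielded = True
                yield batch
                steps += 1
                if steps >= self.num_steps:
                    return
            if not yielded:
                # Guard against spinning forever on an empty loader.
                raise RuntimeError("wrapped loader produced no batches")


# Usage with a plain list standing in for a data loader:
batches = list(StepLimitedLoader([1, 2, 3], num_steps=7))
```

Here batches ends up as [1, 2, 3, 1, 2, 3, 1]. The sketch re-iterates the loader instead of using itertools.cycle because cycle caches every item from the first pass and replays the saved copies, which would hold the whole epoch in memory and skip any per-epoch reshuffling.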

bhimrazy added the waiting on author label on Mar 29, 2025

3 participants