Replies: 1 comment
-
Hi @JJumSSu. Re. 1, all shards are shuffled. Re. 2, the advantage here is that it allows us to save checkpoints more frequently (at fractions of an epoch) by setting --train-num-samples to a lower value. This is important for larger datasets.
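To make the arithmetic concrete, here is a back-of-the-envelope sketch in plain Python (not open_clip code; the dataset size, sample count, and epoch count are made-up numbers, and it assumes the default behaviour of one checkpoint per nominal epoch): with resampling, an "epoch" is simply --train-num-samples samples, so a smaller value means each checkpoint lands at a smaller fraction of a full pass over the data.

```python
# Illustrative sketch (not open_clip code); all numbers are hypothetical.
# With --dataset-resampled, an "epoch" is just --train-num-samples samples drawn
# from the resampled shard stream, and a checkpoint is typically saved per epoch.
dataset_size = 400_000_000       # total samples actually present in the shards
train_num_samples = 10_000_000   # hypothetical --train-num-samples
epochs = 32                      # hypothetical --epochs

samples_seen = train_num_samples * epochs
print(f"one checkpoint every {train_num_samples / dataset_size:.1%} of the dataset")
print(f"{epochs} checkpoints over {samples_seen / dataset_size:.2f} full passes")
```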
-
Hi, thank you for the amazing repo!
I'm currently trying to train a CLIP model on multiple datasets in webdataset format, and I have some questions about shuffling. As I understand it, --dataset-resampled samples the shards with replacement.

1. Does that mean some instances will be seen more than once during training while others are not seen at all? (See the toy sketch below for what I mean.)
2. If so, what is the advantage of using this parameter?

Thank you :)
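To clarify what I mean by "with replacement", here is a toy sketch in plain Python (not the actual open_clip/webdataset code): shards are drawn independently, so within a fixed budget some shards repeat while others are never drawn at all.

```python
import random
from collections import Counter

random.seed(0)
shards = [f"shard-{i:05d}.tar" for i in range(10)]

# Draw 10 shards *with* replacement: repeats are possible, and so are omissions.
drawn = random.choices(shards, k=10)
counts = Counter(drawn)

print("times each shard was drawn:", dict(counts))
print("never drawn this round:", [s for s in shards if s not in counts])
```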