🐛 Bug

self.num_training_batches is defined using int here, which rounds it down to 0 when a small training_percent_check or overfit_pct is used, even though at least one batch is still processed.
This does not cause any errors in "vanilla" Lightning, but it crashes any user code that divides by the number of batches (for example, to average some quantity over the batches).
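For illustration, a minimal sketch of the truncation, assuming the count is computed roughly as int(percent * len(train_dataloader)); the numbers below are made up:

```python
# Hypothetical values for illustration; the real ones come from the
# DataLoader length and the configured training_percent_check / overfit_pct.
num_batches_in_loader = 100      # len(train_dataloader)
training_percent_check = 0.005   # a small training percentage

# Assumed form of the current computation: int() truncates toward zero,
# so a small percentage yields 0 even though one batch is still processed.
num_training_batches = int(training_percent_check * num_batches_in_loader)
print(num_training_batches)  # -> 0

# User code that divides by this count then fails:
# avg_loss = total_loss / num_training_batches  # ZeroDivisionError
```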
To Reproduce
Steps to reproduce the behavior:
Set the training percentage to a value small enough that the resulting number of examples is smaller than the batch size for the given dataset.
This would require a very simple fix: either use math.ceil() or max(1, self.num_training_batches), depending on how the quantity is expected to behave in the rest of the code.
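A rough sketch of the two options (the setup and variable names are illustrative, not the actual trainer code):

```python
import math

num_batches_in_loader = 100      # len(train_dataloader)
training_percent_check = 0.005

# Option 1: round up, so any non-zero percentage yields at least one batch
num_training_batches = math.ceil(training_percent_check * num_batches_in_loader)

# Option 2: keep the truncation but clamp to a minimum of one batch
num_training_batches = max(1, int(training_percent_check * num_batches_in_loader))
```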
I think max(1, self.num_training_batches) would be best because it doesn't change the behaviour of existing code where num_training_batches >= 1. I'm sure the PL team would appreciate it if you made a PR. Is the bug also present for validation and test batches?
Well, actually it is not the rounding issue. The problem is that we currently process num_training_batches + 1 batches instead of num_training_batches when num_training_batches < len(train_dataloader).