Description
🐛 Bug
self.num_training_batches is defined using an int() cast here, which truncates it to 0 when a small training_percent_check or overfit_pct is used, even though at least 1 batch is still processed.
This does not cause any errors in "vanilla" lightning, but it crashes any user code that divides by the number of batches (for example, to average some quantity over the batches).
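A minimal sketch of the failure mode, with illustrative numbers that are not from the issue: the int() cast truncates the batch count to 0, and any downstream averaging then divides by zero.

```python
total_batches = 32          # e.g. len(train_dataloader)
train_percent_check = 0.01  # user asks for 1% of the data

# Mirrors the reported computation: int() truncates 0.32 down to 0,
# even though at least one batch is still processed.
num_training_batches = int(total_batches * train_percent_check)
print(num_training_batches)  # 0

# Typical user code that averages a metric over batches then crashes:
running_loss = 0.5173
avg_loss = running_loss / num_training_batches  # ZeroDivisionError
```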
To Reproduce
Steps to reproduce the behavior:
Set the training percentage low enough that the selected number of examples is smaller than the batch size for the given dataset.
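As a concrete illustration (the dataset size, batch size, and percentage below are hypothetical), any setting where the percentage times the number of batches falls below 1 triggers the truncation:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(100, 4))  # 100 examples
loader = DataLoader(dataset, batch_size=32)   # len(loader) == 4

train_percent_check = 0.05                    # 5% -> 5 examples < batch size
num_training_batches = int(len(loader) * train_percent_check)
assert num_training_batches == 0              # int(0.2) == 0
```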
This would require a very simple fix: either use math.ceil() or max(1, self.num_training_batches), depending on how the quantity is expected to behave in the rest of the code (both options are sketched below).
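Either variant is a one-line change; a sketch of both, using placeholder values rather than Lightning's actual internals:

```python
import math

# e.g. 4 total batches, 5% requested -> 0.2 of a batch
fraction = 4 * 0.05

# Option 1: round up, so any non-empty fraction counts as one batch
num_training_batches = math.ceil(fraction)    # 1

# Option 2: keep the truncation but clamp to at least one batch
num_training_batches = max(1, int(fraction))  # 1
```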