[WIP] fix: updates the training sampling strategy to complete the last batch #538
PR Goal?
Updates the sampling strategy used during training so that the last batch of an epoch is completed with random samples drawn from other batches instead of being dropped.
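A minimal sketch of what such a sampler could look like, assuming a PyTorch-style batch sampler; the class name `CompleteLastBatchSampler` and its signature are hypothetical and not taken from this PR's diff:

```python
import random

from torch.utils.data import Sampler


class CompleteLastBatchSampler(Sampler):
    """Yields full batches of indices; the last batch is topped up with
    random samples from the earlier batches instead of being dropped."""

    def __init__(self, data_source, batch_size, shuffle=True):
        self.data_source = data_source
        self.batch_size = batch_size
        self.shuffle = shuffle

    def __iter__(self):
        indices = list(range(len(self.data_source)))
        if self.shuffle:
            random.shuffle(indices)
        remainder = len(indices) % self.batch_size
        if remainder:
            # Pad the tail with indices drawn from the already-full batches
            # (assumes the dataset holds at least one full batch).
            pool = indices[: len(indices) - remainder]
            indices.extend(random.sample(pool, self.batch_size - remainder))
        for start in range(0, len(indices), self.batch_size):
            yield indices[start:start + self.batch_size]

    def __len__(self):
        # Ceiling division: no batch is dropped.
        return -(-len(self.data_source) // self.batch_size)
```

Such a sampler would be passed to the `DataLoader` through the `batch_sampler` argument rather than `batch_size` / `drop_last`.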
Fixes?
Fixes #438
Feedback sought?
Whether a model works with this new sampler, and whether it produces results that are better, or at least not worse, after training.
Priority?
Low
Tests added?
No tests added yet, but it would be good to have some coverage for this sampler.
How to test?
Place a breakpoint and inspect the composition of the last batch in an epoch. Check that the number of batches corresponds to expectations during training (a quick check along these lines is sketched below). Train a model in a scenario where the difference between dropping and keeping the last batch is noticeable (e.g. a very small dataset, or a dataset where the samples in the last batch contain unique phonemes).
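For the batch-count expectation, a quick standalone check could look like the following, reusing the hypothetical `CompleteLastBatchSampler` from the sketch above; the dataset and sizes are arbitrary:

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10))   # 10 samples, batch size 4 below
batch_size = 4

loader = DataLoader(dataset,
                    batch_sampler=CompleteLastBatchSampler(dataset, batch_size))
batches = [batch[0] for batch in loader]

# drop_last=True would give 10 // 4 = 2 batches; completing the last batch
# should give math.ceil(10 / 4) = 3 batches, all of them full.
assert len(batches) == math.ceil(len(dataset) / batch_size)
assert all(len(batch) == batch_size for batch in batches)
```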
Confidence?
Low. This code hasn't been properly tested yet.
Version change?
No. It can be part of a larger update.
Related PRs?
No.