Description
More permanently fixes the VRAM instability issues from #1127. Reverts an ordering change stemming from #1066 that was causing VRAM spikes during training. Updates the dataset length calculations to use numpy instead of pandas, since pandas can use the GPU, and adds a fallback calculation so the length can still be computed before rows longer than the max length are dropped.
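As a rough illustration of the numpy-based length calculation with a fallback, here is a minimal sketch. It is not the code from this PR: the function name `total_num_tokens`, the `length` and `input_ids` keys, and the `max_len` parameter are assumptions made for the example, and the "fallback" is interpreted here as deriving the length from `input_ids` when no precomputed length is present.

```python
import numpy as np


def total_num_tokens(rows, max_len=None):
    """Sum token counts over rows using NumPy only (hypothetical helper).

    rows: iterable of dicts with an "input_ids" list and, optionally,
          a precomputed "length" value (both key names are assumptions).
    max_len: if given, rows longer than this are excluded from the sum.
    """
    # Fallback: use the precomputed length when present, otherwise
    # measure the tokenized sequence directly.
    lengths = np.fromiter(
        (row["length"] if "length" in row else len(row["input_ids"]) for row in rows),
        dtype=np.int64,
    )
    if max_len is not None:
        # Mirror dropping rows longer than the max length.
        lengths = lengths[lengths <= max_len]
    return int(lengths.sum())


rows = [
    {"input_ids": [1, 2, 3]},
    {"input_ids": [0] * 5, "length": 5},
    {"input_ids": [0] * 10},
]
print(total_num_tokens(rows))             # counts every row
print(total_num_tokens(rows, max_len=5))  # counts only rows within max_len
```

Because the helper works whether or not long rows have been filtered out yet, it can be called either before or after the drop step.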
Reverted the order of calling process_datasets_for_packing from #1141, which can cause issues when changing the max sequence length and needing to re-preprocess the datasets.
Motivation and Context
How has this been tested?
Screenshots (if appropriate)
Types of changes
Social Handles (Optional)