[DataLoader2] Saving and restoring initial seed generator #1124
Stack from ghstack:
Reland of #998 with added guard while loading randomness state in `DataLoader2` for backward compatibility

Changes to `DataLoader2`:
- `state_dict` to store `randomness_state`, which includes:
  - `_seed: int`
  - `_reset_seed: bool` - flag indicating whether `_seed` needs to be set
  - `_seed_generator` - the latest version at the time when `state_dict` is called
  - `_initial_seed_generator` - the version that is saved at the beginning of every epoch
- `from_state` and `load_state_dict` to restore `randomness_state`
- `_restore_checkpoint_beginning_of_epoch`, which sets `self._seed_generator = self._initial_seed_generator`, allowing users to re-create an epoch from the beginning.

Considerations
- Storing the randomness states provides more flexibility for users to restore as they see fit. The decision to do that should not be controversial.
- I decided to add a new method for checkpointing at the beginning of the epoch, to ensure that users are not confused about what randomness is restored by default.
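
For illustration, the `state_dict` / `load_state_dict` changes and the backward-compatibility guard can be sketched roughly as below. This is a toy model, not the torchdata code: `ToyLoader`, the tuple layout, the `"other_state"` entry, and the use of `random.Random` as a stand-in for the seed generator are all assumptions made for the sketch.

```python
import random
from typing import Any, Dict, Optional

RANDOMNESS_STATE_KEY = "randomness_state"  # key name taken from the description above


class ToyLoader:
    """Toy stand-in for a loader; NOT the torchdata implementation."""

    def __init__(self) -> None:
        self._seed: Optional[int] = None
        self._reset_seed: bool = True
        # random.Random stands in for the real seed generator type (assumption).
        self._seed_generator = random.Random(0)
        self._initial_seed_generator = random.Random(0)

    def state_dict(self) -> Dict[str, Any]:
        return {
            "other_state": {},  # hypothetical placeholder for everything else
            RANDOMNESS_STATE_KEY: (
                self._seed,
                self._reset_seed,
                self._seed_generator.getstate(),
                self._initial_seed_generator.getstate(),
            ),
        }

    def load_state_dict(self, state: Dict[str, Any]) -> None:
        # Guard: checkpoints written before this change carry no randomness
        # state, so leave the current generators untouched in that case.
        randomness_state = state.get(RANDOMNESS_STATE_KEY)
        if randomness_state is None:
            return
        seed, reset_seed, gen_state, initial_gen_state = randomness_state
        self._seed = seed
        self._reset_seed = reset_seed
        self._seed_generator.setstate(gen_state)
        self._initial_seed_generator.setstate(initial_gen_state)
```

With this shape, loading an older checkpoint that lacks the `randomness_state` entry is a no-op for the seeding fields, while a newer checkpoint restores all four of them.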
The basic idea is that we want to allow users to restore `dl2._seed_generator` to the previously saved version. From that point on, they can create a new `__iter__` and continue from the beginning of the epoch.
- `_seed` and `_reset_seed` are also saved: if the users were planning to use a different seed, or if there was a need to re-seed, those remain valid after restoring the checkpoint.
- `seed`. That `seed` will override any other behavior and the `seed` will be used.

Differential Revision: D44748514
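
As a rough illustration of re-creating an epoch from its beginning, the sketch below snapshots a generator at the start of every epoch and later swaps the snapshot back in. It is a toy under stated assumptions, not the `DataLoader2` code: `ToyEpochLoader`, the shuffle logic, and `random.Random` as the generator type are hypothetical; only the attribute and method names mirror the description.

```python
import copy
import random
from typing import Iterable, Iterator


class ToyEpochLoader:
    """Toy loader that shuffles its data once per epoch (illustrative only)."""

    def __init__(self, data: Iterable[int], seed: int = 0) -> None:
        self._data = list(data)
        self._seed_generator = random.Random(seed)
        self._initial_seed_generator = copy.deepcopy(self._seed_generator)

    def __iter__(self) -> Iterator[int]:
        # Snapshot the generator at the beginning of every epoch.
        self._initial_seed_generator = copy.deepcopy(self._seed_generator)
        # Drawing the epoch seed advances _seed_generator.
        epoch_seed = self._seed_generator.getrandbits(32)
        order = list(self._data)
        random.Random(epoch_seed).shuffle(order)
        return iter(order)

    def _restore_checkpoint_beginning_of_epoch(self) -> None:
        # Roll back to the snapshot taken when the current epoch started.
        self._seed_generator = self._initial_seed_generator


loader = ToyEpochLoader(range(10))
first_epoch = list(loader)
loader._restore_checkpoint_beginning_of_epoch()
replayed_epoch = list(loader)  # a new __iter__ starts from the restored state
assert first_epoch == replayed_epoch
```

Without the restore call, the second `list(loader)` would draw the next epoch seed from the advanced generator instead of replaying the first epoch.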