[DataLoader2] Adding support for naive snapshotting #915
[ghstack-poisoned]
ghstack-source-id: c2ca295db6cc0625cdb1fce57edc9ee46434551e Pull Request resolved: #915
self.datapipe = self.reading_service.initialize(self.datapipe)
self._adapted = True
self.reading_service._restore_naive_datapipe_snapshot(n_samples_yielded, initial_seed)
# TODO: I might want to skip `initialize_iteration` after this?
Note: in the long term, we probably want to have a seed generator in DataLoader2.
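The naive snapshot idea in the diff above (restore from `n_samples_yielded` and `initial_seed`) can be illustrated with a minimal, stdlib-only sketch. Everything here is hypothetical: `NaiveSnapshotLoader`, `iterate`, and `restore` are illustration names, not the DataLoader2 or ReadingService API, and `random.Random` stands in for the `torch.Generator` used in the PR. The sketch assumes the sample order is fully determined by the seed, so replaying with the same seed and discarding the samples already yielded resumes at the right position.

```python
import random
from typing import Iterable, Iterator, List


class NaiveSnapshotLoader:
    """Hypothetical stand-in for a loader; not the DataLoader2 API."""

    def __init__(self, data: Iterable[int]) -> None:
        self._data: List[int] = list(data)
        self.initial_seed: int = 0
        self.n_samples_yielded: int = 0

    def _shuffled(self, seed: int) -> List[int]:
        # The order depends only on the seed, which is what makes
        # replay-based restoration possible.
        rng = random.Random(seed)
        order = list(self._data)
        rng.shuffle(order)
        return order

    def iterate(self, seed: int) -> Iterator[int]:
        # Record the two values that make up the naive snapshot.
        self.initial_seed = seed
        self.n_samples_yielded = 0
        for item in self._shuffled(seed):
            self.n_samples_yielded += 1
            yield item

    def restore(self, n_samples_yielded: int, initial_seed: int) -> Iterator[int]:
        # Naive restore: replay with the same seed and discard the
        # samples that were already consumed before the snapshot.
        self.initial_seed = initial_seed
        self.n_samples_yielded = 0
        for item in self._shuffled(initial_seed):
            self.n_samples_yielded += 1
            if self.n_samples_yielded <= n_samples_yielded:
                continue
            yield item
```

The cost of this approach is that restoring after N samples redoes the work for those N samples, which is why the PR title calls it naive.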
@@ -306,6 +308,7 @@ def initialize_iteration(self) -> None:
shared_seed_int: int = shared_seed.item() # type: ignore[assignment]
_seed_generator = torch.Generator()
_seed_generator.manual_seed(shared_seed_int)
self._initial_seed = shared_seed_int
To get the original state, I should capture PyTorch's global random state before `generate_random_scalar_tensor` is called.
When I restore, my restoration function should restore that state before `initialize_iteration` is called (probably by `iter(dp2)`).
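The capture-then-restore ordering described in this comment can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's stdlib `random` module as a stand-in for PyTorch's global RNG (where the analogous calls would be `torch.get_rng_state()` and `torch.set_rng_state()`), and `start_iteration` and `restore_and_restart` are hypothetical helper names, not functions from this PR.

```python
import random
from typing import Any, Tuple


def start_iteration() -> Tuple[Any, int]:
    # Capture the global RNG state BEFORE drawing the shared seed,
    # so the same draw can be reproduced later.
    state_before = random.getstate()
    # Stand-in for generating the shared seed from the global RNG.
    shared_seed = random.getrandbits(63)
    return state_before, shared_seed


def restore_and_restart(state_before: Any) -> int:
    # Restore the saved global state first, then re-run initialization;
    # the seed drawn now matches the one drawn originally.
    random.setstate(state_before)
    _, shared_seed = start_iteration()
    return shared_seed
```

If the state were restored after the seed had been drawn instead of before, the re-initialized iteration would draw a different seed and the replayed sample order would no longer match.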
Think about what the user will call (probably `iter(dl2)`), and how that will trigger the various initialization methods.
@NivekT has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Differential Revision: [D43644875](https://our.internmc.facebook.com/intern/diff/D43644875) [ghstack-poisoned]
Hi @NivekT! Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention. You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.
Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged.
If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
Closing. Feel free to re-open if someone else would like to work on this.
Stack from ghstack:
`restore_iteration` to ReadingService method for arbitrary checkpointing #1056
Differential Revision: D43644875