-
Notifications
You must be signed in to change notification settings - Fork 152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Prefetcher into the PrototypeRS and fixing prefetcher bug #826
Adding Prefetcher into the PrototypeRS and fixing prefetcher bug #826
Conversation
[ghstack-poisoned]
ghstack-source-id: b80672b44f488b02a5cf8625e2229c5ef9f25a5a Pull Request resolved: #826
@VitalyFedyunin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The culprit was the same as I was thinking. But, I do have a few comments below
|
||
def reset_iterator(self): | ||
self.reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I found the same issue as reset
will be called automatically at the beginning of __iter__
.
self.combined_datapipes.reset_epoch() | ||
if self.prefetch_mainloop > 0: | ||
# Stop prefetching first | ||
self.combined_datapipes.reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since reset
will be called in the begining of the __iter__
as well, I think it's better to add a different API to stop prefetching and called here.
And, reset
can be the last resort to stop thread if that API is not invoked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, we should have a graph function to stop prefetcher
from the end of the pipeline to the beginning especially when multiple prefetcher
s are presented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also found them annoyingly duplicating and ideally we need something like GraphStop operation to freeze all parallel work. It would be extremely useful for snapshotting. But it is out of scope for this PR (Anyway I will post separate issue to capture demand)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM! There are a few special DataPipes need to be handled specially with this GraphStop
:
FullSync
Prefetcher
…er bug" Fixes: #815 Fixes #825 Differential Revision: [D40308358](https://our.internmc.facebook.com/intern/diff/D40308358) [ghstack-poisoned]
ghstack-source-id: bab358a21b83d018a81ea3027ba41d0d568671cd Pull Request resolved: #826
@VitalyFedyunin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
…orch#826) Summary: Pull Request resolved: pytorch#826 Test Plan: Imported from OSS Reviewed By: NivekT, msaroufim Differential Revision: D40308358 Pulled By: VitalyFedyunin fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
Summary: Pull Request resolved: #826 Test Plan: Imported from OSS Reviewed By: NivekT, msaroufim Differential Revision: D40308358 Pulled By: VitalyFedyunin fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
…orch#826) Summary: Pull Request resolved: pytorch#826 Test Plan: Imported from OSS Reviewed By: NivekT, msaroufim Differential Revision: D40308358 Pulled By: VitalyFedyunin fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
Fixes: #815
Stack from ghstack (oldest at bottom):
Fixes #825
Differential Revision: D40308358