Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DataPipe] Fix FullSync shutdown hanging issue while paused #1153

Closed
wants to merge 4 commits into from

Conversation

NivekT
Copy link
Contributor

@NivekT NivekT commented May 5, 2023

Stack from ghstack:

Before this PR, the executor within FullSync fails to shutdown if it were currently paused. This PR allows shutdown without submitting additional jobs.

Differential Revision: D45610885

NivekT added a commit that referenced this pull request May 5, 2023
ghstack-source-id: 098b11f30792ceca6b68b93fd98d091a5e6cad19
Pull Request resolved: #1153
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 5, 2023
@NivekT NivekT requested a review from ejguan May 5, 2023 16:49
@NivekT
Copy link
Contributor Author

NivekT commented May 5, 2023

@NivekT has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

dp3.pause()
it2 = iter(dp3) # Reset
next(it2)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this test failing without the patch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for DataLoader2 with fullsync?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, it fails without the patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't really add a DataLoader2 test because DistributedRS currently doesn't support pause....

I will have to add that separately. Let me know if I should land this as it is or add that on top of this.

NivekT added a commit that referenced this pull request May 5, 2023
ghstack-source-id: e8353cec63f57df560d6afdc08012ddf1dd572c5
Pull Request resolved: #1153
@NivekT
Copy link
Contributor Author

NivekT commented May 5, 2023

@NivekT has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Before this PR, the executor within FullSync fails to shutdown if it were currently paused. This PR allows shutdown without submitting additional jobs.

Differential Revision: [D45610885](https://our.internmc.facebook.com/intern/diff/D45610885)

[ghstack-poisoned]
NivekT added a commit that referenced this pull request May 5, 2023
ghstack-source-id: f840643e4276fcccb1b7c8b36094e4848a6f0de2
Pull Request resolved: #1153
@NivekT
Copy link
Contributor Author

NivekT commented May 5, 2023

@NivekT has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Before this PR, the executor within FullSync fails to shutdown if it were currently paused. This PR allows shutdown without submitting additional jobs.

Differential Revision: [D45610885](https://our.internmc.facebook.com/intern/diff/D45610885)

[ghstack-poisoned]
NivekT added a commit that referenced this pull request May 12, 2023
ghstack-source-id: 723f7d11ac21b818715cf98d00d5d24d55be4301
Pull Request resolved: #1153
@NivekT
Copy link
Contributor Author

NivekT commented May 12, 2023

@NivekT has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@NivekT merged this pull request in 0b39117.

@facebook-github-bot facebook-github-bot deleted the gh/NivekT/123/head branch May 19, 2023 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants