Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Prefetcher and Implement PinMemory IterDataPipe #1014

Closed
wants to merge 8 commits into from

Conversation

ejguan
Copy link
Contributor

@ejguan ejguan commented Feb 15, 2023

Fixes #1013

Changes

  • Simplify the control flow of prefetcher
    • Delay Exception raised from thread worker to main thread in __iter__
    • Stop prefetching whenever Exception is received
    • As long as stop_iteration is not turned on or buffer is not empty, continue yielding data from __iter__.
    • Add serialization test
  • Add PinMemory DataPipe
    • is_replciable() -> False to keep it in the main process
    • Add unit tests
  • Update test_proto_multi_rs.py to test_mprs.py

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 15, 2023
@ejguan ejguan marked this pull request as draft February 15, 2023 14:34
@ejguan ejguan changed the title Implement PinMemory IterDataPipe Update Prefetcher and Implement PinMemory IterDataPipe Feb 15, 2023
@ejguan ejguan marked this pull request as ready for review February 15, 2023 21:55
@facebook-github-bot
Copy link
Contributor

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ejguan
Copy link
Contributor Author

ejguan commented Feb 15, 2023

I need to rebase onto #1003 after it's landed.

Copy link
Contributor

@NivekT NivekT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM with a few notes/questions. The simplification of Prefetcher is great!

Also, I don't think our CI runs with CUDA, can you confirm?

torchdata/datapipes/iter/util/prefetcher.py Show resolved Hide resolved
torchdata/datapipes/iter/util/prefetcher.py Outdated Show resolved Hide resolved
torchdata/datapipes/iter/util/prefetcher.py Outdated Show resolved Hide resolved
torchdata/datapipes/iter/util/prefetcher.py Outdated Show resolved Hide resolved
torchdata/datapipes/iter/util/prefetcher.py Show resolved Hide resolved
torchdata/datapipes/iter/util/prefetcher.py Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ejguan ejguan requested a review from NivekT February 16, 2023 16:30
Copy link
Contributor

@NivekT NivekT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

import torch


def pin_memory_fn(data, device=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking: A short docstring and add this to torchdata.datapipes.utils.rst will be nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

@facebook-github-bot
Copy link
Contributor

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@NivekT NivekT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be nice to mention pin memory and what it does to users in tutorial at some point, but this is beyond the scope of this PR.

@facebook-github-bot
Copy link
Contributor

@ejguan merged this pull request in a3b34a0.

@NivekT NivekT mentioned this pull request Feb 17, 2023
10 tasks
ejguan added a commit to ejguan/data that referenced this pull request Feb 21, 2023
Summary:
Fixes pytorch#1013

## Changes

- Simplify the control flow of prefetcher
  - Delay Exception raised from thread worker to main thread in `__iter__`
  - Stop prefetching whenever Exception is received
  - As long as `stop_iteration` is not turned on or `buffer` is not empty, continue yielding data from `__iter__`.
  - Add serialization test
- Add `PinMemory` DataPipe
  -  `is_replciable() -> False` to keep it in the main process
  - Add unit tests
- Update `test_proto_multi_rs.py` to `test_mprs.py`

Pull Request resolved: pytorch#1014

Reviewed By: NivekT

Differential Revision: D43329696

Pulled By: ejguan

fbshipit-source-id: da4326dbe2388f4e23b9a1a3a5c43da09d29185a
ejguan added a commit that referenced this pull request Feb 21, 2023
Summary:
Fixes #1013

## Changes

- Simplify the control flow of prefetcher
  - Delay Exception raised from thread worker to main thread in `__iter__`
  - Stop prefetching whenever Exception is received
  - As long as `stop_iteration` is not turned on or `buffer` is not empty, continue yielding data from `__iter__`.
  - Add serialization test
- Add `PinMemory` DataPipe
  -  `is_replciable() -> False` to keep it in the main process
  - Add unit tests
- Update `test_proto_multi_rs.py` to `test_mprs.py`

Pull Request resolved: #1014

Reviewed By: NivekT

Differential Revision: D43329696

Pulled By: ejguan

fbshipit-source-id: da4326dbe2388f4e23b9a1a3a5c43da09d29185a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add pin_memory as a DataPipe
3 participants