Add pin_memory as a DataPipe #1013

Closed
ejguan opened this issue Feb 14, 2023 · 0 comments

ejguan commented Feb 14, 2023

🚀 The feature

In the existing DataLoader, we rely on the pin_memory argument to launch a thread that copies Tensors from regular CPU memory into pinned (page-locked) memory, which speeds up the later transfer to the GPU. This feature should be implemented as a DataPipe with is_replicable() -> False to keep it in the main process.

This should be easy to achieve by doing something similar to prefetch with a buffer size of 1.
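
As a rough illustration of the idea, here is a minimal sketch of a non-replicable pinning stage. It is not the torchdata implementation: `PinMemorySketch` and `pin_nested` are hypothetical names, the class is a plain iterable rather than a real `IterDataPipe`, and the background thread is left out for brevity. The only external assumption is that the items expose a `pin_memory()` method, as `torch.Tensor` does.

```python
from collections.abc import Mapping


def pin_nested(data):
    """Recursively call .pin_memory() on anything that offers it (e.g. torch.Tensor)."""
    if hasattr(data, "pin_memory"):
        return data.pin_memory()
    if isinstance(data, Mapping):
        return {key: pin_nested(value) for key, value in data.items()}
    if isinstance(data, list):
        return [pin_nested(item) for item in data]
    if isinstance(data, tuple):
        # Note: a namedtuple comes back as a plain tuple in this sketch.
        return tuple(pin_nested(item) for item in data)
    # Strings, numbers, and other objects have nothing to pin.
    return data


class PinMemorySketch:
    """Hypothetical stand-in for the proposed DataPipe."""

    def __init__(self, source, pin_fn=pin_nested):
        self.source = source
        self.pin_fn = pin_fn

    def is_replicable(self):
        # Returning False signals that this stage should not be copied into
        # worker processes, so it stays in the main process.
        return False

    def __iter__(self):
        for item in self.source:
            yield self.pin_fn(item)
```

Placed at the end of a pipeline, a stage like this would see every batch after the worker processes have produced it, which is where pinning is useful before a host-to-device copy.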

Motivation, pitch

Feature parity with DataLoader
This DataPipe can also become an indicator that the subsequent operations are on GPU.

Alternatives

No response

Additional context

No response

ejguan added a commit to ejguan/data that referenced this issue Feb 21, 2023
Summary:
Fixes pytorch#1013

## Changes

- Simplify the control flow of prefetcher
  - Defer any Exception raised in the worker thread so that it is re-raised in the main thread in `__iter__`
  - Stop prefetching as soon as an Exception is received
  - As long as `stop_iteration` is not set or `buffer` is not empty, continue yielding data from `__iter__`.
  - Add serialization test
- Add `PinMemory` DataPipe
  - `is_replicable() -> False` to keep it in the main process
  - Add unit tests
- Update `test_proto_multi_rs.py` to `test_mprs.py`

Pull Request resolved: pytorch#1014

Reviewed By: NivekT

Differential Revision: D43329696

Pulled By: ejguan

fbshipit-source-id: da4326dbe2388f4e23b9a1a3a5c43da09d29185a
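
The prefetcher control flow listed in the commit summary above can be sketched roughly as follows. This is an illustrative reconstruction using only the standard library, not the code from the pull request: `PrefetcherSketch`, the `state` dictionary, and the 1 ms polling interval are all assumptions. It shows a worker thread filling a bounded buffer, an exception being handed to the consumer instead of being raised in the thread, and the consumer yielding until the stop flag is set and the buffer is empty.

```python
import threading
import time
from collections import deque


class PrefetcherSketch:
    """Hypothetical thread-based prefetcher with a bounded buffer."""

    def __init__(self, source, buffer_size=1):
        self.source = source
        self.buffer_size = buffer_size

    def __iter__(self):
        buffer = deque()
        lock = threading.Lock()
        state = {"stop_iteration": False, "cancelled": False}

        def worker():
            try:
                for item in self.source:
                    # Poll until the buffer has room or the consumer has gone away.
                    while True:
                        with lock:
                            if state["cancelled"]:
                                return
                            if len(buffer) < self.buffer_size:
                                buffer.append((item, None))
                                break
                        time.sleep(0.001)
            except Exception as exc:
                # Hand the exception to the main thread; prefetching stops here.
                with lock:
                    buffer.append((None, exc))
            finally:
                with lock:
                    state["stop_iteration"] = True

        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        try:
            while True:
                with lock:
                    has_item = bool(buffer)
                    if has_item:
                        item, exc = buffer.popleft()
                    elif state["stop_iteration"]:
                        break  # worker finished and nothing is left to yield
                if not has_item:
                    time.sleep(0.001)
                    continue
                if exc is not None:
                    raise exc  # re-raised in the consuming (main) thread
                yield item
        finally:
            with lock:
                state["cancelled"] = True
            thread.join()
```

With `buffer_size=1` the worker stays at most one item ahead of the consumer, which matches the "prefetch with buffer size 1" shape suggested in the issue. Items fetched before a failure are still delivered, and the exception surfaces only after them.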
ejguan added a commit that referenced this issue Feb 21, 2023