[RFC] Generalize pytorch content for non-native device execution #66
I think the idea makes a lot of sense.
I think we would need more details on how this interacts with existing features like the device-generic tests (https://github.com/pytorch/pytorch/blob/main/torch/testing/_internal/common_device_type.py), which already partially work for PrivateUse1 by the way, and the OpInfo consistency tests.
Also, I don't think we want to aim at running the full test suite with the side device available; rather, we should select the specific device-dependent tests that need to be run for each device we support.
@albanD: Thanks for your comment. Yes, I believe we need some extensive hooks to enable this, e.g. the one we introduced with
https://github.com/pytorch/pytorch/pull/128584/files#diff-d183f2afc51d6a59bc70094e8f476d2468c45e415500f6eb60abad955e065156R531

`@onlyNativeDeviceTypesAnd(["hpu"])`

Other devices can add themselves to such a list if they support the test case. We can modify other hooks like `skipIfDevice` in a similar fashion.
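As a concrete illustration, a device-generic test using that hook could look like the sketch below (the decorator name follows the PR linked above; the exact semantics may still evolve):

```python
import torch
from torch.testing._internal.common_device_type import (
    instantiate_device_type_tests,
    onlyNativeDeviceTypesAnd,  # hook from pytorch/pytorch#128584
)
from torch.testing._internal.common_utils import TestCase, run_tests

class TestExample(TestCase):
    # Runs on the native device types plus any device named in the list.
    @onlyNativeDeviceTypesAnd(["hpu"])
    def test_add(self, device):
        x = torch.ones(2, 2, device=device)
        self.assertEqual((x + x).sum().item(), 8.0)

# Generates per-device classes, e.g. TestExampleCPU, TestExampleHPU.
instantiate_device_type_tests(TestExample, globals())

if __name__ == "__main__":
    run_tests()
```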
The `common_device_type` framework is usable if we replace the `onlyNativeDeviceTypes` decorator. It was widely used in the initial files, but most recent files (e.g. dynamo/distributed) do not use it and instead make direct `.cuda()` calls. However, this content shouldn't be too difficult to migrate.
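To make the migration concrete, here is a minimal sketch (my own example, not code from any existing test file) of moving a hard-coded `.cuda()` test onto the device-generic framework:

```python
import torch
from torch.testing._internal.common_device_type import instantiate_device_type_tests
from torch.testing._internal.common_utils import TestCase, run_tests

class TestMatmul(TestCase):
    # Before (device-specific):
    #     def test_matmul(self):
    #         a = torch.randn(4, 4).cuda()
    #         b = torch.randn(4, 4).cuda()
    #         self.assertEqual((a @ b).shape, torch.Size([4, 4]))
    #
    # After (device-generic): the framework passes `device` for each
    # instantiated device type, so non-native devices can run it too.
    def test_matmul(self, device):
        a = torch.randn(4, 4, device=device)
        b = torch.randn(4, 4, device=device)
        self.assertEqual((a @ b).shape, torch.Size([4, 4]))

instantiate_device_type_tests(TestMatmul, globals())

if __name__ == "__main__":
    run_tests()
```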
In general, I believe we should ensure that new test content uses the `common_device_type` framework and open up the content for "non-native" device execution.
This is a good list of items for generalizing the test cases. Does the proposal focus only on devices that have dedicated device tags in PyTorch core, or does it also support the PrivateUse1 device, which is used to extend PyTorch with out-of-tree devices?
@jgong5: Thanks for your comment. Since I investigated mostly along the lines of supporting Intel Gaudi, which has a dedicated device tag, I have not checked the impact or the support needed for PrivateUse1 devices.
Perhaps @FFFrog @Yikun have more thoughts on this?
@jgong5 @ankurneog Sorry for the late reply.
In theory, dedicated keys behave almost the same as the public key (PrivateUse1) in the PyTorch test framework, and PrivateUse1 is already supported there.
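For reference, here is a minimal sketch of how an out-of-tree backend plugs into PrivateUse1 (the backend module name is hypothetical; `rename_privateuse1_backend` is the real API):

```python
import torch

# Give the PrivateUse1 dispatch key a user-facing name, so tensors
# created by the out-of-tree backend report device type "npu".
torch.utils.rename_privateuse1_backend("npu")

# The backend then registers its device module (hypothetical module name):
# torch._register_device_module("npu", my_backend_module)
```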
First of all, I can't agree more with this proposal, because Ascend NPU is currently facing the problems described above, and the solution proposed by @ankurneog would solve most of the problems we have encountered. We are currently sorting out all the issues we have hit and will add them to this RFC later; we hope these additions will help make the proposal more complete.
By the way, if possible, we could work together to complete this proposal and land it in PyTorch :D