-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Vulkan][Codegen] Spir-V codegen, correct labels/blocks in WhileNode. #8013
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Previously, the WhileNode assumes that evaluating the loop condition will not introduce any additional labels. If this assumption is violated, such as for a WhileNode whose condition is an if/else statement, then the OpLoopMerge instruction appears in the wrong block. The unittest added exercises this code path, but doesn't yet trigger a failure. Once spvValidate is enabled for all vulkan codegen, then this unit test will catch the failure mode.
Potential reviewers: @masahi |
masahi
approved these changes
May 11, 2021
thanks @Lunderberg |
mehrdadh
pushed a commit
to mehrdadh/tvm
that referenced
this pull request
May 19, 2021
…apache#8013) Previously, the WhileNode assumes that evaluating the loop condition will not introduce any additional labels. If this assumption is violated, such as for a WhileNode whose condition is an if/else statement, then the OpLoopMerge instruction appears in the wrong block. The unittest added exercises this code path, but doesn't yet trigger a failure. Once spvValidate is enabled for all vulkan codegen, then this unit test will catch the failure mode. Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
trevor-m
pushed a commit
to trevor-m/tvm
that referenced
this pull request
Jun 17, 2021
…apache#8013) Previously, the WhileNode assumes that evaluating the loop condition will not introduce any additional labels. If this assumption is violated, such as for a WhileNode whose condition is an if/else statement, then the OpLoopMerge instruction appears in the wrong block. The unittest added exercises this code path, but doesn't yet trigger a failure. Once spvValidate is enabled for all vulkan codegen, then this unit test will catch the failure mode. Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
trevor-m
pushed a commit
to neo-ai/tvm
that referenced
this pull request
Jun 17, 2021
…apache#8013) Previously, the WhileNode assumes that evaluating the loop condition will not introduce any additional labels. If this assumption is violated, such as for a WhileNode whose condition is an if/else statement, then the OpLoopMerge instruction appears in the wrong block. The unittest added exercises this code path, but doesn't yet trigger a failure. Once spvValidate is enabled for all vulkan codegen, then this unit test will catch the failure mode. Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
mbs-octoml
added a commit
to mbs-octoml/mbs-tvm
that referenced
this pull request
Sep 22, 2021
…_pass_infra.py This tutorial registers a global layout transformation for conv2d for all targets which is not well-formed. Later uses of conv2d in the tutorials pick that layout up then assert fail in the conv2d type-relation. Better would be to register a transform for an entirely fake target, but that is beyond my current level of expertise. In general our use of sphinx/sphinx_gallery for running and rendering the tutorials is highly suspect since there is no inter-example isolation: - Examples using tensorflow will gobble up GPU memory and not give it back. - Any examples which use any of the (many!) global registration mechanisms need to ensure the registrant is safe across all tutorials. I recall seeing a thread with the sphinx_gallery where they said they'd prefer not to work on process-level isolation, but it's probably worth pinging again. While digging into this I noticed we had a slicing cast in AlterOpLayout due to a derived class of ObjectRef introducing virtuals. I moved the virtuals to the corresponding Node classes. In this case we got away with it since the ObjectRef happened to not get copied but we were on very thin ice.
mbs-octoml
added a commit
to mbs-octoml/mbs-tvm
that referenced
this pull request
Sep 22, 2021
…_pass_infra.py This tutorial registers a global layout transformation for conv2d for all targets which is not well-formed. Later uses of conv2d in the tutorials pick that layout up then assert fail in the conv2d type-relation. Better would be to register a transform for an entirely fake target, but that is beyond my current level of expertise. In general our use of sphinx/sphinx_gallery for running and rendering the tutorials is highly suspect since there is no inter-example isolation: - Examples using tensorflow will gobble up GPU memory and not give it back. - Any examples which use any of the (many!) global registration mechanisms need to ensure the registrant is safe across all tutorials. I recall seeing a thread with the sphinx_gallery where they said they'd prefer not to work on process-level isolation, but it's probably worth pinging again. While digging into this I noticed we had a slicing cast in AlterOpLayout due to a derived class of ObjectRef introducing virtuals. I moved the virtuals to the corresponding Node classes. In this case we got away with it since the ObjectRef happened to not get copied but we were on very thin ice.
junrushao
pushed a commit
that referenced
this pull request
Sep 23, 2021
…infra.py (#9076) * BUG #8013: Remove register_alter_op_layout example from dev/use_pass_infra.py This tutorial registers a global layout transformation for conv2d for all targets which is not well-formed. Later uses of conv2d in the tutorials pick that layout up then assert fail in the conv2d type-relation. Better would be to register a transform for an entirely fake target, but that is beyond my current level of expertise. In general our use of sphinx/sphinx_gallery for running and rendering the tutorials is highly suspect since there is no inter-example isolation: - Examples using tensorflow will gobble up GPU memory and not give it back. - Any examples which use any of the (many!) global registration mechanisms need to ensure the registrant is safe across all tutorials. I recall seeing a thread with the sphinx_gallery where they said they'd prefer not to work on process-level isolation, but it's probably worth pinging again. While digging into this I noticed we had a slicing cast in AlterOpLayout due to a derived class of ObjectRef introducing virtuals. I moved the virtuals to the corresponding Node classes. In this case we got away with it since the ObjectRef happened to not get copied but we were on very thin ice. * [checkpoint] Woops, forgot there was an extra AlterOpLayout I should have run locally, there goes 6hrs of CI.
ylc
pushed a commit
to ylc/tvm
that referenced
this pull request
Sep 29, 2021
…_pass_infra.py (apache#9076) * BUG apache#8013: Remove register_alter_op_layout example from dev/use_pass_infra.py This tutorial registers a global layout transformation for conv2d for all targets which is not well-formed. Later uses of conv2d in the tutorials pick that layout up then assert fail in the conv2d type-relation. Better would be to register a transform for an entirely fake target, but that is beyond my current level of expertise. In general our use of sphinx/sphinx_gallery for running and rendering the tutorials is highly suspect since there is no inter-example isolation: - Examples using tensorflow will gobble up GPU memory and not give it back. - Any examples which use any of the (many!) global registration mechanisms need to ensure the registrant is safe across all tutorials. I recall seeing a thread with the sphinx_gallery where they said they'd prefer not to work on process-level isolation, but it's probably worth pinging again. While digging into this I noticed we had a slicing cast in AlterOpLayout due to a derived class of ObjectRef introducing virtuals. I moved the virtuals to the corresponding Node classes. In this case we got away with it since the ObjectRef happened to not get copied but we were on very thin ice. * [checkpoint] Woops, forgot there was an extra AlterOpLayout I should have run locally, there goes 6hrs of CI.
ylc
pushed a commit
to ylc/tvm
that referenced
this pull request
Jan 13, 2022
…_pass_infra.py (apache#9076) * BUG apache#8013: Remove register_alter_op_layout example from dev/use_pass_infra.py This tutorial registers a global layout transformation for conv2d for all targets which is not well-formed. Later uses of conv2d in the tutorials pick that layout up then assert fail in the conv2d type-relation. Better would be to register a transform for an entirely fake target, but that is beyond my current level of expertise. In general our use of sphinx/sphinx_gallery for running and rendering the tutorials is highly suspect since there is no inter-example isolation: - Examples using tensorflow will gobble up GPU memory and not give it back. - Any examples which use any of the (many!) global registration mechanisms need to ensure the registrant is safe across all tutorials. I recall seeing a thread with the sphinx_gallery where they said they'd prefer not to work on process-level isolation, but it's probably worth pinging again. While digging into this I noticed we had a slicing cast in AlterOpLayout due to a derived class of ObjectRef introducing virtuals. I moved the virtuals to the corresponding Node classes. In this case we got away with it since the ObjectRef happened to not get copied but we were on very thin ice. * [checkpoint] Woops, forgot there was an extra AlterOpLayout I should have run locally, there goes 6hrs of CI.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Previously, the WhileNode assumes that evaluating the loop condition
will not introduce any additional labels. If this assumption is
violated, such as for a WhileNode whose condition is an if/else
statement, then the OpLoopMerge instruction appears in the wrong
block.
The unittest added exercises this code path, but doesn't yet trigger a
failure. Once spvValidate is enabled for all vulkan codegen, then
this unit test will catch the failure mode.