[CodeGen][CUDA] Enhance CUDA codegen for SelectNode #4983
The test fails without this patch. The problem is also exposed by #4968, in which a simple kernel fails during CUDA codegen:
// attr [iter_var(blockIdx.x, , blockIdx.x)] thread_extent = 98
- This patch allows the CUDA backend to emit correct code for selects with vector conditions, which may be produced by floordiv op lowering, etc.
- This already works for the LLVM backend, as the LLVM select instruction supports vector conditions.
Signed-off-by: Wei Pan <weip@nvidia.com>
Kindly ping. Can someone help review this PR?
While I am not qualified to review this, I applied the changes from this PR and was able to compile an MXNet model to a CUDA TVM graph runtime. AutoTVM also appears to be working correctly.
@wpan11nv, I think you have to tag some reviewers from here:
@vinx13 Could you help review this PR?
@jmorrill Thanks for confirming this fix!
LGTM, but it seems the indentation is broken in the CUDA source codegen. Not important, but it would be nice to clean it up.
Yes, I noticed that indentation issue too. I will have a look. Thanks!
This patch allows the CUDA backend to emit correct code for
selects with vector conditions, which may be produced
by floordiv op lowering, etc.
This already works for the LLVM backend, as the LLVM select
instruction supports vector conditions.
Signed-off-by: Wei Pan <weip@nvidia.com>
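To illustrate the problem described above, here is a minimal sketch in plain C++ (a host-side stand-in, not the code this patch actually generates). A scalar `?:` cannot take a multi-lane condition, so one way to handle a select with a vector condition is to evaluate it lane by lane. The names `Int4`, `Bool4`, and `select_lanes` are hypothetical and exist only for this example; they are not TVM or CUDA identifiers.

```cpp
// Hypothetical stand-ins for 4-lane vector types (assumption: these only
// mimic the shape of CUDA vector types such as int4; they are not them).
struct Int4 {
  int x, y, z, w;
};

struct Bool4 {
  bool x, y, z, w;
};

// Lane-wise select: each output lane picks from `on_true` or `on_false`
// according to the matching lane of `cond`. Writing `cond ? on_true : on_false`
// directly would not compile, because `cond` is not a scalar boolean.
inline Int4 select_lanes(const Bool4& cond, const Int4& on_true,
                         const Int4& on_false) {
  Int4 result;
  result.x = cond.x ? on_true.x : on_false.x;
  result.y = cond.y ? on_true.y : on_false.y;
  result.z = cond.z ? on_true.z : on_false.z;
  result.w = cond.w ? on_true.w : on_false.w;
  return result;
}
```

For example, `select_lanes({true, false, true, false}, {1, 2, 3, 4}, {10, 20, 30, 40})` picks lanes 0 and 2 from the first operand and lanes 1 and 3 from the second. LLVM's `select` instruction accepts a vector condition directly, which is why the LLVM backend does not need this kind of expansion.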
Thanks for contributing to TVM! Please refer to guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.