Workaround to make conv2d_transpose compilation for CUDA work #4472
@@ -186,7 +186,9 @@ def _callback(op):
    if cfg.is_fallback:
        N, F, Y, X = get_const_tuple(conv.shape)
        _fallback_schedule(N, F, Y, X)
        # Workaround to make CUDA compilation work. Issue #4470
Can we still use the fallback for the other cases by checking the input params here?
I checked more kernel and stride combinations and found that the error happens when the kernel size is equal to the strides, e.g.
# kernel and strides when compilation for CUDA fails
2x2 and (2,2)
3x3 and (3,3)
4x4 and (4,4)
5x5 and (5,5)
2x3 and (2,3)
3x2 and (3,2)
1x2 and (1,2)
etc
I also found that the compilation fails if the number of output channels is 1.
Added a kernel / strides check, and _fallback_schedule is now skipped when the number of output channels is 1.
In the other cases it will run _fallback_schedule for a 1x1 kernel or when kernel != strides.
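The check described above can be sketched as a small standalone predicate. This is only an illustration of the rule as stated in this comment, not the code in the PR: the function name `should_run_fallback` and its parameters are hypothetical, and the real change lives inside `_callback` next to `_fallback_schedule`.

```python
def should_run_fallback(kernel, strides, out_channels):
    """Return True if the fallback schedule should be applied.

    Hypothetical helper mirroring the rule described in this thread:
      - skip the fallback when there is a single output channel;
      - otherwise run it for a 1x1 kernel or when kernel != strides.
    """
    kernel = tuple(kernel)
    strides = tuple(strides)

    # Compilation for CUDA was reported to fail with one output channel.
    if out_channels == 1:
        return False

    # A 1x1 kernel is explicitly allowed to use the fallback.
    if kernel == (1, 1):
        return True

    # Compilation was reported to fail when kernel == strides.
    return kernel != strides
```

With this rule, a 2x2 kernel with strides (2,2) skips the fallback, while a 3x3 kernel with strides (1,1) still uses it.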
f339b26 to 89056d4
89056d4 to 10f4b18
- combine pad and dilate;
- fix for the issue https://discuss.tvm.ai/t/compile-error-for-cuda-target/4164
- fix for the issue apache#4472
Workaround for issue #4470
Discussions:
https://discuss.tvm.ai/t/conv2d-transpose-kernel-2x2-strides-2-2-fails-for-cuda-cannot-prove/5020
https://discuss.tvm.ai/t/compile-error-for-cuda-target/4164