[CUDA][PASS]Legalize tensorcore #7147
Thanks for your contribution! I like the idea of padding to make other shapes work on tensorcore, but I think it's important that you add some tests to confirm this feature works as expected before we consider merging.
Conflicts: python/tvm/topi/nn/batch_matmul.py python/tvm/topi/nn/dense.py
7bf253c to 25dac28
    return entry if isinstance(expr, relay.Function) else entry.body


def test_legalize_conv2d(data_shape, kernel_shape, pad_shape, do_pad=True):
Please reference test_legalize_pass.py for the CI issue.
thanks
@Laurawly Thanks! The CI passed.
Thanks! Sorry for my late reply. This looks good to me.
Thanks for adding the tests, this LGTM now.
* add pad_to_tensorcore & legalize for dense/bmm/conv2d
* fix pad & slice
* fix comments
* fix comments
* resolve conflict
* resolve conflict
* support only fp16
* add tests/python/relay/test_pass_legalize_tensorcore.py
* add tests for legalize tensorcore
* fix pylint
* fix pylint
* code format
* use_gpu test only; fix conv2d_alter_op
* fix tests params
* revert transform fix
Add a legalize pass that pads dense/conv2d/batch_matmul ops to shapes that are legal for tensorcore on the cuda target. To limit the overhead introduced by padding, we count the `extra_flops` and set the threshold to 2x, which is conservative compared to the speedup of tensorcore. This PR depends on #7146.
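
For reviewers who want a concrete picture of the padding decision, here is a minimal pure-Python sketch. It is not the code in this PR: the helper names (`round_up`, `choose_padding`), the candidate tile shapes, and the reading of "2x" as "extra FLOPs relative to the original FLOPs must not exceed 2" are all assumptions made for illustration.

```python
# Illustrative sketch only; not the implementation from this PR.
# Assumed fp16 tensorcore tile shapes, written as (m, k, n).
CANDIDATES = [(16, 16, 16), (32, 16, 8), (8, 16, 32)]


def round_up(value, multiple):
    """Round value up to the nearest multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


def choose_padding(m, k, n, threshold=2.0):
    """Pick the cheapest padding (dm, dk, dn) for an (m, k) x (k, n) matmul.

    Returns None when even the cheapest candidate adds more than
    `threshold` times the original FLOPs (assumed meaning of "2x").
    """
    original = m * k * n
    best = None
    for tile_m, tile_k, tile_n in CANDIDATES:
        dm = round_up(m, tile_m) - m
        dk = round_up(k, tile_k) - k
        dn = round_up(n, tile_n) - n
        # Extra work introduced by padding, relative to the original work.
        extra_flops = ((m + dm) * (k + dk) * (n + dn) - original) / original
        if best is None or extra_flops < best[0]:
            best = (extra_flops, (dm, dk, dn))
    extra_flops, padding = best
    if extra_flops > threshold:
        return None  # padding costs more than it is assumed to gain
    return padding
```

Under these assumptions, `choose_padding(30, 60, 30)` pads to 32 x 64 x 32 (about 21% extra FLOPs), while `choose_padding(1, 1, 1)` returns `None` because every candidate tile would multiply the work by several thousand. The padded result is then sliced back to the original output shape.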
@jcf94 @merrymercy could you also help review this pr?