[TOPI] sparse_dense Op sparse_data input added #6889
Conversation
Force-pushed from c3d8874 to 06f9911.
I'm not sure that the correct course of action is to add a flag to sparse_dense; a separate op might be cleaner. Alternatively, I don't really see a reason for supporting AB^T with B sparse directly. Instead, when we convert from a frontend to TVM, we can just insert the correct transposes. In a lot of ways this is a better choice, because we do not need to reimplement the same operators, and the operators prefer the data to be in this format. I think this is the best choice.
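The transpose rewrite suggested here rests on the identity S·D = (Dᵀ·Sᵀ)ᵀ: a frontend converter can express a sparse-data matmul in terms of a kernel that only supports a sparse weight, at the cost of extra transposes on the dense side. A minimal NumPy sketch of that identity (illustrative only, not TVM code; shapes and values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
# "Sparse" data matrix S (mostly zeros) and dense matrix D, small made-up shapes.
S = rng.standard_normal((8, 4)) * (rng.random((8, 4)) < 0.3)
D = rng.standard_normal((4, 6))

# What a dedicated "sparse data" kernel would compute directly: S @ D.
direct = S @ D

# What a frontend converter can emit instead, reusing a kernel that only
# handles the dense-times-sparse-transposed pattern: compute D.T @ S.T,
# then transpose the result back.
via_transposes = (D.T @ S.T).T

np.testing.assert_allclose(direct, via_transposes, rtol=1e-6, atol=1e-8)
```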
Thanks @tkonolige for the response. I had a similar dilemma at the start, but I eventually decided to reuse the existing Op, both to share a small portion of code and to keep the concept under a single Op. I am open to a separate 'dense_sparse' Op as well (though that would arguably be the right name for the existing Op 🙂). I don't think there is much difference on the schedule side between these two flavors of the Op. However, with the transpose approach on the existing Op, we add extra overhead on every call, even if it may be small for smaller matrices.
I think it is still possible to reuse code with a separate op, but maybe it's to a lesser extent. I'm not sure how much overhead the transposes will add to the code. We don't have to pay the cost of transposing the sparse matrix, because that can be done at compile time (the sparse matrix is usually a weight). Could you maybe do some benchmarking and see if the duplication of code is worth it from a performance standpoint?
Sure, I will check how much overhead the transpose adds when using the existing Op.
Force-pushed from 06f9911 to f7b8d98.
Hi @tkonolige, below is the benchmark data I have obtained for 4 different input dimensions. NOTE: "with Transpose" here means using the existing sparse_dense Op with an additional transpose layer.
- Case 1: Sparse_input = [1024, 512], dense_input = [512, 4096]
- Case 2: Sparse_input = [1024, 512], dense_input = [512, 1024]
- Case 3: Sparse_input = [2048, 512], dense_input = [512, 2048]
- Case 4: Sparse_input = [4096, 512], dense_input = [512, 4096]

Clearly, the improvement becomes more significant as the dimensions increase, so I think we should keep the new flavor of the Op implemented in this PR.
Just to check, you're only transposing the dense matrix? Also, what is the density of the sparse matrix? I'm curious, could you do a benchmark with a smaller batch size (dense_input=[16, 1024])? And could you also test on CUDA? And if you have time, could you give a breakdown of the time spent in the matrix multiply vs the time spent in the transpose? From these numbers, it does look like it is worthwhile to do this. It just means a lot more sparse kernels to write.
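One way to get the requested matmul-vs-transpose breakdown is to time the pieces separately. A rough sketch using SciPy as a stand-in for the TVM kernels (shapes taken from Case 1 above; the density value and the use of SciPy are assumptions for illustration, so the numbers will not match TVM's generated code):

```python
import time
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
S = sp.random(1024, 512, density=0.05, format="csr", random_state=0)  # sparse input (density assumed)
D = rng.standard_normal((512, 4096)).astype("float32")                # dense input

def bench(fn, repeat=20):
    fn()  # warm-up
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    return (time.perf_counter() - start) / repeat

R = np.asarray(S @ D)  # result, used only to time the output transpose

t_matmul = bench(lambda: S @ D)                          # core sparse x dense multiply
t_t_in = bench(lambda: np.ascontiguousarray(D.T))        # transposing the dense input
t_t_out = bench(lambda: np.ascontiguousarray(R.T))       # transposing the result back

print(f"matmul           : {t_matmul * 1e3:.3f} ms")
print(f"input transpose  : {t_t_in * 1e3:.3f} ms")
print(f"output transpose : {t_t_out * 1e3:.3f} ms")
```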
Gentle ping @tkonolige!!!
I'm still not sold on this approach, but maybe someone else can chime in. If we do go with the sparse_data flag, I think we should rename it to something like sparse_rhs or sparse_b. Or we could do the same thing as TF and have adjoint_a and adjoint_b.
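For reference, the TensorFlow behaviour mentioned here: tf.sparse.sparse_dense_matmul exposes adjoint_a / adjoint_b flags, so either operand can be used transposed without a separate op. A small sketch from memory of that API (values are made up; check the TF docs for the exact semantics):

```python
import tensorflow as tf

# 2x3 sparse matrix with two non-zero entries.
sp_a = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],
                              values=[1.0, 2.0],
                              dense_shape=[2, 3])
b = tf.constant([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])  # 3x2 dense matrix

# Plain sparse x dense: A @ B.
y = tf.sparse.sparse_dense_matmul(sp_a, b)

# The same product expressed with adjoint_b: pass B already transposed and
# ask the op to use its adjoint, i.e. A @ (B^T)^T == A @ B.
y_adj = tf.sparse.sparse_dense_matmul(sp_a, tf.transpose(b), adjoint_b=True)

tf.debugging.assert_near(y, y_adj)
```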
Thanks @tkonolige for your feedback. I believe the performance stats make a clear case for a new Op here. @tqchen: Tristan is helping with his valuable review efforts, but I think we need a third opinion (possibly an official reviewer or committer) to move the PR to the next level. As I am not sure who might be interested in the sparse changes, could you help tag a few people here? TIA!
Sorry, I meant that I'm not sure about the flag-based approach vs having a separate operator for sparse x dense vs dense x sparse. From your benchmarks, it does look like we need some way of doing dense x sparse directly.
Gentle ping @tqchen!
Force-pushed from a466aed to 4623979.
@tkonolige: All your comments are addressed. Would you please check and approve, so that we can proceed with this PR?
Looks good. I've left a couple of comments. I'm approving, but I'm not a committer, so you'll need someone else to approve too. Could you also add a test to tests/python/topi/python/test_topi_sparse.py with sparse_lhs=True?
Sorry for such a delayed response. I have now addressed your remaining comments too!
@tqchen, @jroesch, @FrozenGene, @junrushao1994: Would you please help move this PR forward. TIA!
Just saw your mention, and sorry for the late reply. Will do one round of review tomorrow.
Generally LGTM. @antinucleon, could you help with one round of review, as you have done some work on sparse?
Gentle ping @antinucleon, @FrozenGene, @tqchen, @jroesch!
@ANSHUMAN87 please go ahead.
Thanks a lot @FrozenGene, @tkonolige!
* [TOPI] sparse_dense op sparse_data input added
* [1] clang issue resolved
* [2] python format resolved
* [3] lint error resolved
* [4] Review comments handled
* [5] Lint error resolved
* [6] Review comments handled
* [7] Review comments handled
* [8] Review comments handled
Change Summary:
The current sparse_dense Op in TOPI supports only the weight as the sparse input.
This PR adds support for the data to be the sparse input as well.
So, either input can be sparse, and the other one will be dense.
This is a follow-up PR to #6685.
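For readers unfamiliar with the sparse input format: the sparse operand is passed to the op as the usual CSR triplet (values, column indices, row pointers) rather than as a dense tensor. A small SciPy illustration of those three arrays (the exact TOPI argument names and order live in topi/nn/sparse.py; this snippet only shows the layout):

```python
import numpy as np
import scipy.sparse as sp

A = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 3.0]])
A_csr = sp.csr_matrix(A)

print(A_csr.data)     # non-zero values, row-major:  [2. 1. 3.]
print(A_csr.indices)  # column index of each value:  [1 0 2]
print(A_csr.indptr)   # start offset of each row:    [0 1 3]
```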
cc @tkonolige, @siju-samuel!