-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement circshift and circshift! for SparseMatrixCSC #30300
Conversation
Do you have any benchmarks to share? |
function circshift!(O::SparseMatrixCSC, X::SparseMatrixCSC, (r,c)::Base.DimsInteger{2}) | ||
I, J, V = findnz(X) | ||
O .= sparse(mod1.(I .+ r, X.n), mod1.(J .+ c, X.m), V, X.n, X.m) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very minor style thing: I'd like to see return O
here.
This is great, thank you! Would you be willing to also implement I'm sure benchmarks are impressive. |
Sure! |
|
||
function circshift!(O::SparseMatrixCSC, X::SparseMatrixCSC, (r,c)::Base.DimsInteger{2}) | ||
I, J, V = findnz(X) | ||
O .= sparse(mod1.(I .+ r, X.n), mod1.(J .+ c, X.m), V, X.n, X.m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this X.m, X.n
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I've been sloppy and didn't notice because I used square matrices. I modified the code and tests to catch this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Make sure the non-square case has test coverage.)
The approach in this PR allocates 9 temporary arrays of length O(nnz), so it is wasting a factor of of 4–5 in memory. (3 for the It would be much more efficient to operate in-place directly on the CSC output, so that no temporary O(nnz) arrays are required. Even if It also kind of defeats the purpose of |
Here are some quick benchmarks. I'm using
First an NxN matrix with N=10:
For larger size:
For a large, but sparser matrix:
So the dense version does not allocate, but still the sparse version seems more convenient for large matrices and gets better with sparser matrices. I think that one can get rid of the allocations but the code would not be so simple. I can try that if you think that its important. |
Let's not let perfect be the enemy of the good here. Sure, this could be improved further, but:
Sure, the multiplier on the O(nnz) is a little high, and this might actually be a slight perf regression for completely full SparseMatrixCSCs, but it's a huge improvement for the most common use-cases for sparse matrices. |
You're right... I will see how it comes and eventually make another PR. |
done in #30317 |
6e22a8b
to
6f890ac
Compare
Bump. Is this good to merge? |
Should this be closed in favour of #30317? |
I suppose that this question is not directed to me. In my opinion #30317 is uniformly better than master performance-wise, but of course, it's harder to review. This PR is a bit slower than master for small dense matrices. If there is something I can do to help the review process of #30317, please let me know! |
This implements circshift and circshift! for the SparseMatrixCSC type (otherwise it falls back to the generic, suboptimal method). I added a couple of tests.
https://discourse.julialang.org/t/circshift-of-a-sparsematrix/18420