
The best practice for sparse matrix multiplication #6

Closed
JiaweiZhuang opened this issue Nov 3, 2017 · 5 comments

Comments

@JiaweiZhuang
Owner

JiaweiZhuang commented Nov 3, 2017

The information below is terribly outdated. See the latest comparisons at: https://github.com/JiaweiZhuang/sparse_dot


Since #2 is largely solved, I've done some preliminary tests on sparse matrix multiplication (SMM) using the sparse package. I just realized that the basic scipy.sparse.coo_matrix.dot() is enough: I can flatten the input array to shape [Ntime*Nlev, Nlat*Nlon] so it can be passed to coo_matrix.dot().
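A rough sketch of that flattening approach (tiny stand-in shapes and a random weight matrix for illustration; the real weights would come from the regridder, and the real shapes are the (600, 400, 10, 50) test case):

```python
import numpy as np
import scipy.sparse as sp

# Stand-in sizes: regrid the two spatial dims, keep the extra dims unchanged.
Nlat_in, Nlon_in, Nlev, Ntime = 6, 4, 2, 3
Nlat_out, Nlon_out = 4, 3

rng = np.random.default_rng(0)
data = rng.random((Nlat_in, Nlon_in, Nlev, Ntime))

# Hypothetical sparse weight matrix of shape (N_out_spatial, N_in_spatial).
weights = sp.random(Nlat_out * Nlon_out, Nlat_in * Nlon_in,
                    density=0.1, format='coo', random_state=0)

# Flatten the spatial dims, apply the weights, then restore the extra dims.
flat_in = data.reshape(Nlat_in * Nlon_in, Nlev * Ntime)
flat_out = weights.dot(flat_in)
result = flat_out.reshape(Nlat_out, Nlon_out, Nlev, Ntime)
print(result.shape)  # (4, 3, 2, 3)
```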

The test input data has the shape (600, 400, 10, 50), which is then regridded to shape (400, 300, 10, 50). The timing results are

  • Calculating weights: ~14 s
    (This only needs to be done once, so I am not trying to optimize it.)
  • Applying weights with ESMPy: 1.46 s
    (Transferring numpy data to ESMPy internals takes an additional 6.6 s, due to converting the C-ordered array to Fortran order.)
  • Applying weights with sparse.tensordot(): 579 ms

It is quite surprising that sparse.tensordot() is 3 times faster than ESMPy's internal Fortran routine.

  • Applying weights (the previous comparison was not fair; I've removed all unnecessary array copying and re-ordering, and here is the new timing):

| Method | Applying weights | Write output data to file |
|---|---|---|
| ESMPy | 0.3 s [1] + 1 s [2] | 2 s [3] |
| coo_matrix.dot | 0.6 s | 2 s |
| sparse.tensordot [4] | 1 s | 4 s |
| loop with numba | 3 s | 2 s |
| loop with pure numpy | 6 s | 2 s |

[1]: Passing input data to the Fortran internals. It also doubles memory use, which is not a problem for the other methods.
[2]: The actual SMM.
[3]: Slow due to the use of Docker; it would be ~0.3 s on a native machine.
[4]: The slow-down compared to scipy is due to memory layout (C-ordering vs. Fortran-ordering). See the notebook below.

The notebooks are available at

It might be useful to have both ESMPy and scipy as different backends. ESMPy results can be used as reference. At least one non-ESMPy method is needed since ESMPy cannot reuse offline weights right now.

@bekozi If you have time (no hurry), could you run my notebooks to see whether it is true that ESMPy's SMM is slower than Python's? I want to make sure the comparison was fair and that I was not using ESMPy incorrectly.


This thread is for discussing the best practice for SMM (i.e. applying weights).

Before developing high-level APIs I want to make sure I am using the most efficient way to perform SMM. Any suggestions will be appreciated. @mrocklin @pelson

@JiaweiZhuang
Owner Author

JiaweiZhuang commented Nov 3, 2017

Because the regridding is so fast even for this relatively large data, I will postpone #3 and focus on serial implementation first.

@mrocklin

It is quite surprising that sparse.tensordot() is 3 times faster than ESMPy's internal Fortran routine.

sparse.tensordot just calls scipy.sparse.csr_matrix.dot, which is written in C. It is not surprising to see it perform decently. There is some cost in converting from NDArray COO format to 2d CSR, but hopefully this is relatively small. It is also cached if used repeatedly.
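A small sketch of that conversion cost in plain scipy (illustrative sizes, hypothetical random weights): converting the COO weights to CSR once up front gives the same result as calling dot on the COO matrix directly, but pays the conversion cost only once.

```python
import numpy as np
import scipy.sparse as sp

# COO is convenient for building weights, but products are faster in CSR.
coo = sp.random(1000, 800, density=0.01, format='coo', random_state=0)
csr = coo.tocsr()                      # one-time COO -> CSR conversion

x = np.random.default_rng(0).random((800, 16))
y_coo = coo.dot(x)                     # scipy converts internally per call
y_csr = csr.dot(x)                     # conversion already paid
print(np.allclose(y_coo, y_csr))       # True: identical results
```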

@mrocklin

Is it necessary for ESMPy to write outputs to file or can it pass results back as numpy arrays?

@JiaweiZhuang
Owner Author

Is it necessary for ESMPy to write outputs to file or can it pass results back as numpy arrays?

Currently it has to write outputs to file. I don't know the ESMPy internals, so I need to ask @bekozi.

@JiaweiZhuang
Owner Author

JiaweiZhuang commented Nov 30, 2017

According to the timing tests, scipy.sparse.coo_matrix.dot should be the most efficient method for applying weights.
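A hedged timing sketch for reproducing that kind of comparison with timeit (illustrative sizes and random weights, not the actual (600, 400, 10, 50) benchmark from the notebooks):

```python
import timeit
import numpy as np
import scipy.sparse as sp

# Hypothetical weight matrix and input block, sized for a quick local test.
W = sp.random(2000, 3000, density=0.005, format='coo', random_state=0).tocsr()
x = np.random.default_rng(0).random((3000, 32))

# Time repeated weight application (the SMM step only, no I/O).
t = timeit.timeit(lambda: W.dot(x), number=100)
print(f"100 SMM calls: {t:.3f} s")
```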

nawendt pushed a commit to nawendt/esmpy-tutorial that referenced this issue Dec 7, 2017
This commit addresses a couple of issues. First, cartopy has been used to plot the data on maps in lieu of basemap. This should be more relevant to people going forward. I will keep around a branch that contains the basemap version of the notebook, but no updates will occur on that branch barring any unforeseen errors.

Secondly, per #4, it was pointed out that passing Fortran memory order
arrays instead of C memory order arrays is faster, see the issue from
JiaweiZhuang/xESMF#6 for further details. Care was taken to make sure
code logic is still intuitive.

Closes #3
Closes #4