SPMM #14
Hi! We pass |
But if using |
I believe that is handled in the block configuration passed in for kernel launch.
After running make install, should I call CudaSpmm directly to use the library?
And if I don't use |
Yes, CudaSpmm is the right API. Call it if you don't need a fused bias + ReLU. If you want to fuse those operations, use CudaSpmmBiasRelu.
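To make the difference between the two calls concrete, here is a minimal CPU reference sketch of what an SpMM and a fused SpMM + bias + ReLU compute. This is not the library's implementation or its API: `CsrMatrix`, `ReferenceSpmm`, and `ReferenceSpmmBiasRelu` are hypothetical names, and both the CSR layout and the choice of one bias value per output row are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for the sparse matrix A (rows x cols).
struct CsrMatrix {
  int rows;
  int cols;
  std::vector<int> row_offsets;     // size rows + 1
  std::vector<int> column_indices;  // size nnz
  std::vector<float> values;        // size nnz
};

// Unfused reference: C = A * B, with B (cols x n) and C (rows x n)
// stored dense and row-major.
std::vector<float> ReferenceSpmm(const CsrMatrix& a,
                                 const std::vector<float>& b, int n) {
  std::vector<float> c(static_cast<std::size_t>(a.rows) * n, 0.0f);
  for (int i = 0; i < a.rows; ++i) {
    for (int p = a.row_offsets[i]; p < a.row_offsets[i + 1]; ++p) {
      const int col = a.column_indices[p];
      const float v = a.values[p];
      for (int j = 0; j < n; ++j) {
        c[static_cast<std::size_t>(i) * n + j] +=
            v * b[static_cast<std::size_t>(col) * n + j];
      }
    }
  }
  return c;
}

// Fused reference: C = max(A * B + bias, 0).
// Assumption: bias holds one value per output row.
std::vector<float> ReferenceSpmmBiasRelu(const CsrMatrix& a,
                                         const std::vector<float>& b, int n,
                                         const std::vector<float>& bias) {
  std::vector<float> c = ReferenceSpmm(a, b, n);
  for (int i = 0; i < a.rows; ++i) {
    for (int j = 0; j < n; ++j) {
      float& out = c[static_cast<std::size_t>(i) * n + j];
      out = std::max(out + bias[i], 0.0f);
    }
  }
  return c;
}
```

A reference like this can be handy for checking GPU results on small inputs before timing anything.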
Emmm I have a question. This project determines the configuration of spmmconfig based on the size of the input dense matrix, but this introduces runtime on the CPU. The time I spend directly using |
Interesting! I would think your problem must be quite small for that to be the case. The tuning heuristics in this library are by no means expected to be good across all problems, so if you know which config is best for your problem, you should pass it explicitly.
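If the same problem shape is launched many times, one way to keep selection off the hot path is to resolve the configuration once and reuse it. The sketch below only illustrates that idea: `TileConfig`, `PickConfig`, and `ConfigCache` are hypothetical names, the thresholds are placeholders, and none of this is the library's heuristic or API. Whether it saves anything measurable depends on how expensive the selection step actually is.

```cpp
#include <map>
#include <tuple>

// Hypothetical tile configuration; not a type from the library.
struct TileConfig {
  int block_items_y;
  int block_items_k;
  int block_items_x;
};

// Stand-in heuristic keyed on the dense matrix width n.
// The thresholds and tile sizes are placeholders for illustration.
TileConfig PickConfig(int m, int k, int n) {
  (void)m;
  (void)k;
  if (n <= 16) return {4, 16, 16};
  if (n <= 32) return {4, 32, 32};
  return {4, 32, 64};
}

// Memoizes the chosen config per (m, k, n), so repeated launches of the
// same shape run the heuristic only once.
class ConfigCache {
 public:
  const TileConfig& Get(int m, int k, int n) {
    const auto key = std::make_tuple(m, k, n);
    auto it = cache_.find(key);
    if (it == cache_.end()) {
      ++misses_;  // counts how often the heuristic actually ran
      it = cache_.emplace(key, PickConfig(m, k, n)).first;
    }
    return it->second;
  }

  int misses() const { return misses_; }

 private:
  std::map<std::tuple<int, int, int>, TileConfig> cache_;
  int misses_ = 0;
};
```

When the shape is fixed for the whole program, the simpler option is the one suggested above: pick the config once by hand and pass it explicitly.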
When initializing the `sparse_tile_loader`, the `threadIdx.x` should be `threadIdx.x % kBlockWidth`. Is that correct?
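The reply that this is handled by the block configuration can be illustrated on the host. The sketch assumes the kernel is launched with a two-dimensional block whose x extent equals kBlockWidth; the thread does not show the launch code, so that is an assumption. Under it, threadIdx.x already lies in [0, kBlockWidth) and the modulo is a no-op. The modulo would only be needed if the block were flattened to one dimension. `ThreadIdx`, `EnumerateBlock`, `LaneFromFlat`, and `RowFromFlat` are hypothetical helpers, not library code.

```cpp
#include <vector>

// Host-side stand-in for CUDA's threadIdx (x and y only).
struct ThreadIdx {
  int x;
  int y;
};

// Lists every thread index in a 2D block of extent
// (block_dim_x, block_dim_y), with x varying fastest as in CUDA.
std::vector<ThreadIdx> EnumerateBlock(int block_dim_x, int block_dim_y) {
  std::vector<ThreadIdx> threads;
  for (int y = 0; y < block_dim_y; ++y) {
    for (int x = 0; x < block_dim_x; ++x) {
      threads.push_back({x, y});
    }
  }
  return threads;
}

// For a flattened 1D block of block_width * rows threads, the lane within
// a row group has to be recovered with a modulo ...
int LaneFromFlat(int flat_tid, int block_width) {
  return flat_tid % block_width;
}

// ... and the row group with a division.
int RowFromFlat(int flat_tid, int block_width) {
  return flat_tid / block_width;
}
```

With the 2D layout, the x index plays the role of the lane and the y index selects the row handled by that group of threads, so a loader that takes threadIdx.x directly sees the same value the modulo would produce.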