*: supporting BCSR matrices #125

Open
rohany opened this issue Feb 16, 2023 · 1 comment
rohany commented Feb 16, 2023

I looked into what adding BCSR support to legate.sparse would entail. I'll hold off on putting in the actual work until we have requests from users who actually want to use the BCSR format and know which functions they would like to see implemented. Some notes I have on doing this:

  • The BCSR format is just CSR where each stored entry is a contiguous block of non-zeros, similar to a DSDD format in TACO (see the SciPy sketch after this list).
  • As a result, a similar approach to what we've done for CSR/CSC should work out. One annoyance is that the way DISTAL would represent a DSDD tensor is different enough from how we'd want to do this in a SciPy-style implementation that we can't take DISTAL kernels directly, but they should provide a reasonable skeleton for the implementation.
  • In general, the implementation strategy needs to do two things when declaring partition alignments with Legate Sparse: 1) input and output dense tensors have to be reshaped (using store reshapes) based on the block size so that they align against the blocked pos array, and 2) images (on both pos and crd) need an affine transform applied to them that scales the values by the corresponding components of the block-size Shape. Since Legion can't do this right now, we'll have to approximate it by making temporary copies of the stores.
  • Format conversions are unfortunately not implemented in pure Python, so we'll have to hand-code some of these conversion routines.
  • cuSPARSE has pretty well-rounded support for BCSR matrices that we can exploit. It's unclear what to do for operations that cuSPARSE doesn't support, since the block size significantly affects the strategy for GPU execution (and even for CPU-parallel execution).
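
For reference, SciPy's BSR format makes the "CSR of blocks" structure concrete: indptr and indices play the roles of our pos and crd arrays over block rows and block columns, and data holds one dense block per stored entry. A minimal illustration:

import numpy as np
import scipy.sparse as sp

# A 4x6 matrix stored with 2x3 blocks: 2 block rows, 2 block columns.
A = np.zeros((4, 6))
A[0:2, 0:3] = 1.0  # block (0, 0)
A[0:2, 3:6] = 2.0  # block (0, 1)
A[2:4, 3:6] = 3.0  # block (1, 1)
B = sp.bsr_matrix(A, blocksize=(2, 3))

print(B.indptr)      # [0 2 3]   -- CSR-style offsets over block rows (our pos)
print(B.indices)     # [0 1 1]   -- block-column coordinates (our crd)
print(B.data.shape)  # (3, 2, 3) -- one dense 2x3 block per stored entry

SciPy's tobsr()/tocsr() are also a useful reference point for the conversion routines we'd have to hand-code on the Legate side.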

An example dot (SpMV) implementation on a BCSR matrix might look something like:

def dot(self, x):
  # View the output and input vectors as 2-D so that each row lines
  # up with one block row (resp. block column) of the matrix.
  y = cn.zeros(self.shape[0]).reshape(self.R, -1)
  x = x.reshape(self.C, -1)
  # View vals as (num stored blocks, block volume) so that each
  # stored block occupies one row.
  promoted_vals = self.vals.reshape(self.crd.shape[0], -1)
  # Broadcast pos along the block dimension so it can be aligned
  # with the reshaped output.
  promoted_pos = self.pos.promote(1, y.shape[1])
  task = ctx.create_task(BCSR_SPMV)
  task.add_output(y)
  for inp in (self.pos, self.crd, promoted_vals, x):
    task.add_input(inp)
  # Partition y by block row, aligned with the promoted pos array,
  # and derive the remaining partitions through image constraints.
  task.add_alignment(promoted_pos, y)
  task.add_image(promoted_pos, self.crd)
  task.add_image(promoted_pos, promoted_vals)
  task.add_image(self.crd, x)
  task.execute()
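
For a concrete shape check (illustrative numbers): a (4, 6) matrix with (2, 3) blocks has self.R = 2 block rows and self.C = 2 block columns, so y above is viewed as a (2, 2) store and x as a (2, 3) store, one row per block row and block column respectively.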

The tricky part here is handling the images onto these transformed stores, which Legion can't currently compute natively. One way around this is to not transform the stores at all, but instead to create temporary transformed versions of the regions and take the image through those.
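
To make the missing affine transform concrete: the image through crd onto x has to be widened from block columns to element columns, i.e. scaled by the block width. A rough sketch of the temporary copy we'd materialize (scaled_crd_copy is a hypothetical helper, not an existing Legate API):

import numpy as np

def scaled_crd_copy(crd, bw):
    # Block column j covers element columns [j * bw, (j + 1) * bw),
    # so store inclusive [lo, hi] ranges for the image to consume.
    lo = crd * bw
    hi = (crd + 1) * bw - 1
    return np.stack([lo, hi], axis=1)

# Block columns [0, 1, 1] with block width 3 cover element columns
# [0, 2], [3, 5], and [3, 5] of x.
print(scaled_crd_copy(np.array([0, 1, 1]), 3))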

rohany commented May 5, 2024

I realized that another way to do this, instead of the nested-tensor / big-block-region approach, is to use fields whose size is equal to the block size. A downside of this is the inability to use Legion reductions, though.
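
A rough way to picture this in plain numpy (purely illustrative; the real version would be a Legion field whose size is the block volume):

import numpy as np

# Hypothetical sketch: each element of the 1-D vals array is one whole
# (2, 3) block, so vals stays aligned with crd and no store reshapes
# are needed. The trade-off noted above: Legion reductions don't apply
# to struct-like fields such as this.
block_t = np.dtype([("block", np.float64, (2, 3))])
vals = np.zeros(3, dtype=block_t)  # 3 stored blocks
vals[0]["block"] = np.arange(6).reshape(2, 3)
print(vals.shape, vals[0]["block"].shape)  # (3,) (2, 3)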
