Skip to content

Matmul/generic #17

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed

Conversation

DavidMertz
Copy link

@DavidMertz DavidMertz commented May 3, 2017

Assuming we can merge the minimal https://github.com/mrocklin/sparse/pull/16 that simply implements the @ operator for known types, this is proposal 1 for behavior with lists/tuples.

The gist of this approach is that if a user want to coerce lists to arrays, they must either explicitly create COOs with that behavior as COO(..., toarray_other=True) or call sparse.dot(..., make_array=True) (or my_coo.dot(..., make_array=True)).

Test for both success and failure are included for these opt-in behaviors.

@mrocklin
Copy link
Contributor

mrocklin commented May 3, 2017

Just as an FYI I am currently -1 on these keyword arguments. My reasons include the following:

  • I think that tracking them throughout all operations will be add development cost
  • They're not part of the standard numpy API (which has somehow solved this problem with other means)
  • No active user has shown up with this need

Generally speaking if we're going to add something new to the array computing API I think that the conversation needs to start with the numpy community. I don't think that this project is the right place to expand the array computing user API. Perhaps raise an issue there?

@DavidMertz
Copy link
Author

Note that I have just submitted an alternative approach in https://github.com/mrocklin/sparse/pull/18. These two PRs/branches are mutually exclusive. The opt-in keywords were my idea to answer @mrocklin's concern about coercion happening implicitly.

@DavidMertz
Copy link
Author

Generally speaking if we're going to add something new to the array computing API I think that the conversation needs to start with the numpy community. I don't think that this project is the right place to expand the array computing user API. Perhaps raise an issue there?

There's not really the same issue for the NumPy community. NumPy happily coerces non-array collections to arrays implicitly and without asking (so does scipy.sparse, FWIW). The only reason we might not want that implicit behavior is because Dask sparse arrays might be too big to do this without excessive memory cost.

@jakevdp
Copy link
Contributor

jakevdp commented May 3, 2017

so does scipy.sparse, FWIW

As I've pointed out, scipy.sparse does coercion much more carefully. I think @mrocklin's complaint was not with coercion per se, but with the way you initially went about that coercion.

There are good solutions to this in numpy and scipy, and I think this version of the PR can be closed.

@DavidMertz
Copy link
Author

As I've pointed out, scipy.sparse does coercion much more carefully.

This really is not true. scipy.sparse does extra work to check for compatible shapes of things that are already arrays. But the entirety of the actual coercion code is the following:

# If it's a list or whatever, treat it like a matrix
other_a = np.asanyarray(other)

Currently sparse doesn't perform as much shape massaging as scipy.sparse, but the type coercion is exactly the same. In fact, since the type coercion comes after the shape massaging, if a user of scipy.sparse passes in a list-of-lists it won't benefit from the shape massaging, even if it could be made compatible by doing so.

But that's completely orthogonal to the type coercion step in this PR.

@jakevdp
Copy link
Contributor

jakevdp commented May 4, 2017

Yes, it uses np.asanyarray. But first it special-cases scalars and matrices, and afterward it double-checks that it hasn't just created a zero-dimensional object array. That's what I mean when I say it's more careful.

@jakevdp
Copy link
Contributor

jakevdp commented May 4, 2017

In any case, the main issue with the approach in this PR is not the specifics of coercion, but the additional keyword arguments. I would not be in favor of adding that level of additional API complexity.

@DavidMertz
Copy link
Author

DavidMertz commented May 4, 2017

@jakevdp Are you OK with #18 instead. I.e. "always coerce if the object is specifically a list or tuple". In that approach, the test in scipy will never be fulfilled, e.g.:

 if other_a.ndim == 0 and other_a.dtype == np.object_:
     # Not interpretable as an array; return NotImplemented so that
     # other's __rmul__ can kick in if that's implemented.
     return NotImplemented

That's not to say we couldn't fail in other ways with #18. But not in ways where scipy.sparse succeeds. E.g.:

In [1]: import numpy as np
In [2]: from sparse import COO
In [3]: l = [1, 2, 3, 4, 5]
In [4]: a = np.array(l)
In [5]: sa = COO(a)
In [6]: l @ sa
Out[6]: array(55, dtype=int64)
In [7]: ['a','b','c','d','e'] @ sa
[...]
TypeError: no supported conversion for types: (dtype('int64'), dtype('<U'))
In [8]: [6,7,8,9] @ sa
[...]
ValueError: shape-mismatch for sum

@jakevdp
Copy link
Contributor

jakevdp commented May 4, 2017

Yes, I'm mostly fine with #18 – it's a good start and we can expand from there to work for more general inputs.

@DavidMertz DavidMertz mentioned this pull request May 4, 2017
@mrocklin
Copy link
Contributor

OK to close this?

@DavidMertz
Copy link
Author

Yes, but see my note in #18.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants