-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
design discussions #32
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial thoughts on first read of today's rev --
- I left out copy-editing per se, except a couple places where I thought some wording was particularly important
- On your request, I limited the amount of positive, non-action-inducing feedback -- rest assured I'm smiling and fist-pumping multiple times as I read this, and please know my feedback is almost entirely positive
- Of course, I defer much to our biofriends who will know more
- TL;DR I love the concrete proposal for multimodal & I think it will work. Number-one thing needing ideation is how to get raw-handling up to that level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment, otherwise looks good to me. I think this is good to start with, the first implementation may expose potential edits to the spec. Thanks @bkmartinjr!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
brainstorming.md
Outdated
| `coo` | Return an Arrow.SparseCOOTensor | | ||
| `csr` | Return an Arrow.SparseCSRTensor | | ||
| `csc` | Return an Arrow.SparseCSCTensor | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to drop in on this:
As I've mentioned before (#18 (comment)) I'm a little concerned about arrow's level of support for its Tensor types, especially the sparse tensor types. Last I looked into this, it seemed like there were no tests in the arrow ipc suite for these. E.g. there's no confirmation interchange works for tensors. I believe this has led to a number of implementations just skipping out on tensor (esp. sparse tensor) support.
That said, I would really love it if these were better supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hear ya...that said, I am unsure there is any good (portable, common, etc) alternative other than removing the capability (ie, only supporting COO in a RecordBatch) or defining our own data structure. IMHO, both are worse than taking the risk of being an early SparseTensor adoptor.
If I'm missing an alternative, would love to hear your ideas!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only alternative I can think of is how vitessce handles sparse zarr AnnData matrices.
In effect, I'm not sure how different this is than defining a custom data structure, since clients will have to handle decoding.
My ideal here would be that there is some contact with the arrow team, there's some work on what they think about it, and some ownership of these structures goes to people who care about them. Tests could get added to the IPC, so implementation start supporting these (for example).
Following from SciPy this year, some discussions have started around binary formats for sparse data. Discussions are being recorded here: https://github.com/GraphBLAS/binsparse-specification and is largely w/ scientific-python and graphBLAS. Still quite early though.
…return value for ndarray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
PR to track discussion around some design discussions about base level primitives and typing