Add a note on alignment requirement of data pointer #83
xref discussion in data-apis/array-api#293
Code examples for current handling of the `data` and `byte_offset` fields:
https://github.com/apache/tvm/blob/43c2ea72bcd9968531441189e172372c1d8a64cb/src/contrib/tf_op/tvm_dso_op_kernels.cc#L170
https://github.com/tensorflow/tensorflow/blob/0b6b491d21d6a4eb5fbab1cca565bc1e94ca9543/tensorflow/compiler/xla/python/dlpack.cc#L308
https://github.com/cupy/cupy/blob/1a1ac3a30930562580c61fb5f9ec44314c6a593b/cupy/_core/dlpack.pyx#L111
https://github.com/pytorch/pytorch/blob/4b9464f4b99750c31583e5e384bec4de927f588e/aten/src/ATen/DLConvertor.cpp#L236
A few questions:
|
There is no alignment expectation of The main original goal was to decouple so that in common cases a check in byte_offset should be sufficient for aligned arrays (as a result default optimizations that follow) |
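A minimal sketch of the importer-side check discussed in this comment, not taken from any of the linked libraries. It assumes the exporter keeps `data` aligned to 256 bytes, in which case looking at `byte_offset` alone is enough. The names `DLPACK_ALIGNMENT`, `is_aligned_fast` and `is_aligned_full` are hypothetical and exist only for illustration.

```python
DLPACK_ALIGNMENT = 256  # assumed alignment of the `data` pointer, in bytes


def is_aligned_fast(byte_offset: int, required: int) -> bool:
    """Check alignment from byte_offset alone.

    Only valid when `data` itself is aligned to DLPACK_ALIGNMENT and
    `required` is a power of two no larger than DLPACK_ALIGNMENT.
    """
    if required <= 0 or required & (required - 1):
        raise ValueError("required must be a positive power of two")
    if required > DLPACK_ALIGNMENT:
        raise ValueError("required exceeds the assumed alignment of data")
    return byte_offset % required == 0


def is_aligned_full(data: int, byte_offset: int, required: int) -> bool:
    """Check the effective address; needs no guarantee about `data`."""
    return (data + byte_offset) % required == 0
```

If the exporter gives no guarantee about `data`, only the second function is safe; the first is the shortcut that an aligned `data` pointer would make possible.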
Right, although that pointer may not be accessible, so the importer cannot assume they can read the data "between"
Should NumPy do that, or must NumPy do that (assuming we live in a future where the added note does not apply anymore)? It is still not clear to me.
Are you saying that the importer must check if the data is aligned (no alignment guarantee at all) but may do so by checking only One thing maybe: I would be perfectly happy if alignment guarantees differ for devices if that helps. |
So are these right!?
Comment: Currently there are a lot of buggy implementations, so it may make sense to not rely too much on the |
And sorry for being knee-jerk at thinking that adding examples of things that do not adhere to the standard is not all that useful... Thanks, this is much clearer now. However, I still feel it requires the reader to fill in the blanks. And to a reader interested in importing it may not even occur that unaligned data is a thing or that |
I think this is always under control of the library, right? If initial allocation is aligned, then one needs to deal with unaligned data when new arrays are created as views. In that case, calculating the data pointer as
It's must, that is what
It sounds right to me. |
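One way an exporter could handle a view into an aligned allocation, sketched under stated assumptions: the address of the view's first element is rounded down to a multiple of 256 to give `data`, and the remainder becomes `byte_offset`. The function name `split_pointer` and the example address are hypothetical; this is not the formula from the comment above, which did not survive in this copy of the thread.

```python
from typing import Tuple

ALIGNMENT = 256  # assumed alignment requirement for `data`, in bytes


def split_pointer(first_element: int, alignment: int = ALIGNMENT) -> Tuple[int, int]:
    """Split the address of a view's first element into (data, byte_offset).

    `data` is rounded down to a multiple of `alignment`; `byte_offset` is
    the remainder, so data + byte_offset == first_element.
    """
    byte_offset = first_element % alignment
    return first_element - byte_offset, byte_offset


# A view starting 24 bytes into a 256-byte-aligned allocation (made-up address).
base = 0x7F0000001000
data, byte_offset = split_pointer(base + 24)
```

As noted earlier in the thread, the rounded-down `data` address may not be accessible, so an importer should only read from `data + byte_offset` onward.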
My 2c here: having the What benefit does this give us over just allowing |
I think it is useful, although it's a bit awkward to put it inside the header - the current comment is already too long. I'd like to write some HTML docs that have:
|
Weren't you the one arguing in favor of having an aligned data pointer in data-apis/array-api#293 (comment) @kkraus14? I'm confused now. It's either this, or just giving up and having zero alignment guarantees I think. |
Yes, but
Yes, OK. My frustration is that I want/am expected to merge dlpack into NumPy. That means I have to either:
I am quite certain we should just break pytorch, but until now nobody thought of enforcing the 256 byte aligned
yes, it is super useful! But it does not help me with making a call on whether to break the standard or break pytorch (or importers not expecting unaligned data).
Agreed with @kkraus14, I do not understand the advantage, but I am happy to accept it :). |
No, I think we should do this: data-apis/array-api#293 (comment). We can get to a situation where everyone adheres to the standard without breaking anything for anyone (unless they use really old versions of a library). No matter how this discussion ends, it's about a future state - we can and should merge the NumPy PR with |
I was arguing in favor of having aligned allocations where it's then safe to read from |
@rgommers |
Yes, you're not mistaken. As I said in the linked comment above, that is why we need gradual evolution. Start with |
Sorry, but IMO, if the current state is that NumPy must export with The comment tells me: Please add extra checks on import. But not breaking |
OK, so: The way to not adhere to the standard is just to produce I can live with that. I assume this also means NumPy should be happy to export unaligned data? |
It sounds like you are asking for optional alignment then. There's no place to put such information. I think it may be better to move that conversation back to data-apis/array-api#293, let me comment there. |
OK, NumPy related only, but I pushed this then:
|
thank you, that seems good to me. |
Cc @leofang, @kkraus14, @seberg, @tqchen, @mattip