[Format][C++] Recommended/required value for ArrowDeviceArray.device_id int in case of CPU data #40801
Comments
Since the
Perhaps we could add something like: "If the device type does not have a notion of device id, it is recommended to return 0 as a convention".
Keeping aligned with DLPack and using 0 probably makes the most sense, but there are some downsides. If we wanted to optionally denote which NUMA node CPU memory is on, for example, being able to use NUMA node 0 would be good, so using -1 as the default would be preferable in this situation. For managed memory, if that memory has been touched on a specific GPU, then it is local to that GPU; while it could be accessed by any device, doing so would be expensive, so it would be ideal to use an actual device id for that. If it hasn't been touched yet, then again something like
Given @kkraus14's comments, I think we'd be better served by recommending -1, then.
My understanding is that this is a doc clarification. Should we add it for 16.0.0?
I'll give it a go in the next few minutes!
… C Device data interface (#41101)

### Rationale for this change
It is not explicit what the value of the `ArrowDeviceArray::device_id` should be when a given device type has no notion of a device identifier (e.g., there is always only one).

### What changes are included in this PR?
The text was clarified to recommend a value of -1. This was the value already used by Arrow C++.

### Are these changes tested?
No tests needed (documentation)

### Are there any user-facing changes?
No

* GitHub Issue: #40801

Authored-by: Dewey Dunnington <dewey@voltrondata.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
Issue resolved by pull request 41101
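To make the outcome concrete, here is a minimal sketch in C of what the recommendation could look like on the producer and consumer side. It is not taken from Arrow C++ or nanoarrow. `DeviceInfoSketch` is a hypothetical, cut-down stand-in for `ArrowDeviceArray` (the embedded `ArrowArray`, `sync_event`, and reserved fields are omitted), and `NO_DEVICE_ID`, `describe_cpu_export`, and `is_plain_cpu` are illustrative names. Accepting 0 on the consumer side is an assumption made here for compatibility with producers that predate the clarification; the thread itself only settles on recommending -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Device type values mirror the Arrow C Device data interface header. */
typedef int32_t ArrowDeviceType;
#define ARROW_DEVICE_CPU 1
#define ARROW_DEVICE_CUDA 2

/* Hypothetical stand-in for ArrowDeviceArray: only the two fields
 * discussed in this issue are kept so the sketch is self-contained. */
struct DeviceInfoSketch {
  int64_t device_id;
  ArrowDeviceType device_type;
};

/* Illustrative name for the recommended "no device id" value. */
#define NO_DEVICE_ID ((int64_t)-1)

/* Producer side: CPU data has no notion of a device id, so follow the
 * recommendation and report -1. */
void describe_cpu_export(struct DeviceInfoSketch* out) {
  assert(out != NULL);
  out->device_type = ARROW_DEVICE_CPU;
  out->device_id = NO_DEVICE_ID;
}

/* Consumer side: -1 is only a recommendation, so this sketch also
 * tolerates 0 for CPU data (an assumption, not a spec requirement). */
bool is_plain_cpu(const struct DeviceInfoSketch* info) {
  return info->device_type == ARROW_DEVICE_CPU &&
         (info->device_id == NO_DEVICE_ID || info->device_id == 0);
}
```

A consumer written this way keeps working with producers that export either value, while new producers converge on -1.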
In #40717 (review), @paleolimbot noted that the current implementation of the C Device interface in Arrow C++ is using a `device_id` of -1 when exporting CPU data (while nanoarrow is using 0).
The spec itself doesn't say anything about this (https://arrow.apache.org/docs/format/CDeviceDataInterface.html#c.ArrowDeviceArray.device_id):
Should we recommend or require a specific value for this case?
I noticed that the DLPack specification, from which the device type/id structure was taken, does specify to use 0 (https://dmlc.github.io/dlpack/latest/c_api.html#c.DLDevice.device_id):
Should we adopt the same guideline for our spec?
cc @pitrou @zeroshade @kkraus14