Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-39769: [C++][Device] Fix Importing nested and string types for DeviceArray #39770

Merged
merged 3 commits into from
Feb 5, 2024

Conversation

zeroshade
Copy link
Member

@zeroshade zeroshade commented Jan 23, 2024

Rationale for this change

In my testing with libcudf and other GPU data, I discovered a deficiency in ImportDeviceArray and thus ImportDeviceRecordBatch where the device type and memory manager aren't propagated to child importers and it fails to import offset-based types such as strings.

What changes are included in this PR?

These are relatively easily handled by first ensuring that ImportChild propagates the device_type and memory manager from the parent. Then for importing offset based values we merely need to use the memory manager to copy the final offset value to the CPU to use for the buffer size computation.

This will work for any device which has implemented CopyBufferTo/From

Are these changes tested?

A new test is added to test these situations.

@zeroshade zeroshade requested a review from pitrou January 23, 2024 19:16
Copy link

⚠️ GitHub issue #39769 has been automatically assigned in GitHub to PR creator.

cpp/src/arrow/c/bridge_test.cc Show resolved Hide resolved
cpp/src/arrow/device.cc Show resolved Hide resolved
cpp/src/arrow/device.cc Outdated Show resolved Hide resolved
cpp/src/arrow/device.cc Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Jan 24, 2024
@github-actions github-actions bot added awaiting change review Awaiting change review awaiting changes Awaiting changes and removed awaiting changes Awaiting changes awaiting change review Awaiting change review labels Jan 24, 2024
@zeroshade zeroshade requested a review from pitrou January 29, 2024 18:33
@pitrou
Copy link
Member

pitrou commented Jan 30, 2024

Do you have plans to add CUDA-based tests?

@zeroshade
Copy link
Member Author

@pitrou I'll add explicit CUDA based tests for the device array interface as a separate PR for #39786 (comment)

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting changes Awaiting changes labels Feb 2, 2024
@zeroshade zeroshade merged commit 26801f1 into apache:main Feb 5, 2024
42 of 45 checks passed
@zeroshade zeroshade removed the awaiting merge Awaiting merge label Feb 5, 2024
@zeroshade zeroshade deleted the import-device-array-fixes branch February 5, 2024 20:29
Copy link

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 26801f1.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 6 possible false positives for unstable benchmarks that are known to sometimes produce them.

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…or DeviceArray (apache#39770)

### Rationale for this change
In my testing with libcudf and other GPU data, I discovered a deficiency in ImportDeviceArray and thus ImportDeviceRecordBatch where the device type and memory manager aren't propagated to child importers and it fails to import offset-based types such as strings.

### What changes are included in this PR?
These are relatively easily handled by first ensuring that `ImportChild` propagates the device_type and memory manager from the parent. Then for importing offset based values we merely need to use the memory manager to copy the final offset value to the CPU to use for the buffer size computation. 

This will work for any device which has implemented CopyBufferTo/From 

### Are these changes tested?
A new test is added to test these situations.

* Closes: apache#39769

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Feb 28, 2024
…or DeviceArray (apache#39770)

### Rationale for this change
In my testing with libcudf and other GPU data, I discovered a deficiency in ImportDeviceArray and thus ImportDeviceRecordBatch where the device type and memory manager aren't propagated to child importers and it fails to import offset-based types such as strings.

### What changes are included in this PR?
These are relatively easily handled by first ensuring that `ImportChild` propagates the device_type and memory manager from the parent. Then for importing offset based values we merely need to use the memory manager to copy the final offset value to the CPU to use for the buffer size computation. 

This will work for any device which has implemented CopyBufferTo/From 

### Are these changes tested?
A new test is added to test these situations.

* Closes: apache#39769

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
thisisnic pushed a commit to thisisnic/arrow that referenced this pull request Mar 8, 2024
…or DeviceArray (apache#39770)

### Rationale for this change
In my testing with libcudf and other GPU data, I discovered a deficiency in ImportDeviceArray and thus ImportDeviceRecordBatch where the device type and memory manager aren't propagated to child importers and it fails to import offset-based types such as strings.

### What changes are included in this PR?
These are relatively easily handled by first ensuring that `ImportChild` propagates the device_type and memory manager from the parent. Then for importing offset based values we merely need to use the memory manager to copy the final offset value to the CPU to use for the buffer size computation. 

This will work for any device which has implemented CopyBufferTo/From 

### Are these changes tested?
A new test is added to test these situations.

* Closes: apache#39769

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
pitrou pushed a commit that referenced this pull request May 23, 2024
#41477)

### Rationale for this change
Currently ```MemoryManager``` objects define functionality to Copy or View entire buffers. Occasionally there is the need to only copy a single value or slice from a buffer to a piece of CPU memory (see #39770 (comment)). It's overkill to do a bunch of whole Buffer operations and manually slicing just to copy 4 or 8 bytes.

### What changes are included in this PR?
Add the ```MemoryManager::CopyBufferSliceToCPU``` function, which initially attempts to use memcpy for the specified slice. If this is not possible, it defaults to copying the entire buffer and then viewing/copying the slice.

Update ```ArrayImporter::ImportStringValuesBuffer``` to use this function.

### Are these changes tested?

```ArrayImporter::ImportStringValuesBuffer```  is tested as a part of  ```arrow-c-bridge-test```
* GitHub Issue: #39858

Lead-authored-by: Alan Stoate <alan.stoate@gmail.com>
Co-authored-by: Mac Lilly <maclilly45@gmail.com>
Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…pointer (apache#41477)

### Rationale for this change
Currently ```MemoryManager``` objects define functionality to Copy or View entire buffers. Occasionally there is the need to only copy a single value or slice from a buffer to a piece of CPU memory (see apache#39770 (comment)). It's overkill to do a bunch of whole Buffer operations and manually slicing just to copy 4 or 8 bytes.

### What changes are included in this PR?
Add the ```MemoryManager::CopyBufferSliceToCPU``` function, which initially attempts to use memcpy for the specified slice. If this is not possible, it defaults to copying the entire buffer and then viewing/copying the slice.

Update ```ArrayImporter::ImportStringValuesBuffer``` to use this function.

### Are these changes tested?

```ArrayImporter::ImportStringValuesBuffer```  is tested as a part of  ```arrow-c-bridge-test```
* GitHub Issue: apache#39858

Lead-authored-by: Alan Stoate <alan.stoate@gmail.com>
Co-authored-by: Mac Lilly <maclilly45@gmail.com>
Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++][Device] ImportDeviceArray fails on strings and nested types
4 participants