
add hack to handle tf not recognizing bool dtype in dlpack #276

Merged · 1 commit merged into NVIDIA-Merlin:main on Apr 7, 2023

Conversation

@jperez999 (Collaborator) commented Apr 7, 2023

This PR adds a fix for TensorFlow's dlpack translation, which does not allow bool dtypes. Because of this, we have opted to cast all boolean columns to the int8 dtype, which is allowed by TensorFlow. It also adds the newest cupy version to the requirements for the GPU version of core, needed to ensure the proper implementation of dlpack is used in TensorTable. For reference: https://github.com/tensorflow/tensorflow/blob/a9e6bd9ca4c0dbc96a71a1331bccf39a10757148/tensorflow/c/eager/dlpack.cc#L161-L242

@jperez999 jperez999 self-assigned this Apr 7, 2023
@jperez999 jperez999 requested review from karlhigley and oliverholworthy and removed request for karlhigley April 7, 2023 01:13
@jperez999 jperez999 added labels: bug (Something isn't working), clean up, ci, chore (Maintenance for the repository) on Apr 7, 2023
@jperez999 jperez999 added this to the Merlin 23.04 milestone Apr 7, 2023
@oliverholworthy (Member) left a comment

May be worth adding a test for this conversion to demonstrate support for converting bool types

```diff
-def _to_dlpack_from_tf_tensor(tensor):
+def _to_dlpack_from_cp_tensor(tensor):
+    if tensor.dtype == cp.dtype("bool"):
+        tensor = tensor.astype(cp.dtype("int8"))
```
(Member)

Do we need a corresponding coercion back to bool dtypes on the other side (in the `from_dlpack` functions)?

@jperez999 (Collaborator, Author) commented Apr 7, 2023

No, we don't. We have seen in testing that the models seem to be OK with the conversion to 1 and 0, and they are able to interpret those values correctly. We were originally going to make that change as well (it belongs in the tensor column constructor), but during testing we saw that it was not necessary. If it ever does become necessary, we can absolutely add it in.

(Contributor)

I'm not 100% sure these values are being interpreted as bools on the other side (vs being treated as int8 and having that still function for model training.) I'm not sure if it matters though, and I'd rather avoid an extra conversion if we can for performance reasons. So...I'm on the fence about adding the return conversion. "Let's see if it turns out to be a problem not to have it" seems reasonable, I suppose.

@jperez999 jperez999 requested a review from karlhigley April 7, 2023 14:38
@karlhigley (Contributor) left a comment

I think we need to specify a minimum dependency version for CuPy, and I'm not sure if we want to create a hard dependency on CUDA 11 since other toolkit versions also have CuPy 12.0.0 available. Seems like it might be better to specify the CuPy version we want as something like `cupy>=12.0.0`.

```diff
@@ -3,4 +3,4 @@
 # cudf>=21.12
 # dask-cudf>=21.12
 # dask-cuda>=21.12
-# cupy>=7
+cupy-cuda11x
```
(Contributor)

I think the version does matter here, since CuPy didn't add support for bools with DLPack until 12.0.0.

@jperez999 (Collaborator, Author)

We can't; when you try to specify that here, it fails. When I try something like `pip install cupy-cuda11x>=12.0.0`, it silently fails and nothing gets installed. This is after I had previously uninstalled the version in the environment. This is why I just left it to grab the latest version.

@jperez999 (Collaborator, Author)

So you end up being forced to run the plain `pip install cupy-cuda11x`, and you get the latest version.
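One possible explanation for the silent failure (an assumption, not confirmed in the thread): without quotes, the shell parses `>=12.0.0` as an output redirection, so pip never sees the version specifier. The parsing can be demonstrated with `echo`, since the same shell rules apply to `pip`:

```shell
# Work in a scratch directory so the redirect-created file is isolated.
cd "$(mktemp -d)"
echo pip install cupy-cuda11x>=12.0.0    # ">" is parsed as a redirect to a file named "=12.0.0"
cat "=12.0.0"                            # the file holds: pip install cupy-cuda11x
echo pip install "cupy-cuda11x>=12.0.0"  # quoted: the full specifier survives
```

Quoting the requirement (`pip install "cupy-cuda11x>=12.0.0"`) passes the full specifier through to pip.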

(Contributor)

I don't think we want to install `cupy-cuda11x` specifically though. In the containers we do, but here the requirement is actually to have `cupy>=12.0.0`, which we will by installing `cupy-cuda11x`, but which would also be satisfied by `cupy-cuda12x`, `cupy-cuda102`, etc.

(Contributor)

Okay, I'm wrong about that. `pip install "cupy>=12.0.0"` still tries to install from source even if you have `cupy-cuda*>=12.0.0` installed. Maybe we could make the various packages for satisfying the cupy dependency optional instead?
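One way to do that (a sketch assuming a `pyproject.toml` layout; the extra names and the `merlin-core` distribution name are illustrative) is to expose each CUDA toolkit's CuPy wheel as an optional extra:

```toml
# Hypothetical optional extras; users pick the wheel matching their CUDA toolkit,
# e.g. `pip install "merlin-core[cuda11x]"`.
[project.optional-dependencies]
cuda11x = ["cupy-cuda11x>=12.0.0"]
cuda12x = ["cupy-cuda12x>=12.0.0"]
```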


@karlhigley karlhigley merged commit fce0366 into NVIDIA-Merlin:main Apr 7, 2023
Labels: bug (Something isn't working)
3 participants