add hack to handle tf not recognizing bool dtype in dlpack #276
Conversation
May be worth adding a test for this conversion to demonstrate support for converting bool types
def _to_dlpack_from_tf_tensor(tensor):
...
def _to_dlpack_from_cp_tensor(tensor):
    if tensor.dtype == cp.dtype("bool"):
        tensor = tensor.astype(cp.dtype("int8"))
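A quick round-trip sketch of what this cast does, using NumPy in place of CuPy for illustration (the dtype and astype semantics are the same; the real function would then hand the array to DLPack):

```python
import numpy as np  # stand-in for cupy here; cp.astype behaves the same way

def cast_bool_for_dlpack(tensor):
    # TF's DLPack importer rejects bool, so downcast to int8 first;
    # True/False become 1/0 and non-bool dtypes pass through untouched
    if tensor.dtype == np.dtype("bool"):
        tensor = tensor.astype(np.dtype("int8"))
    return tensor

mask = np.array([True, False, True])
out = cast_bool_for_dlpack(mask)
print(out.dtype, out.tolist())  # int8 [1, 0, 1]
```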
Do we need a corresponding coercion back to bool dtypes on the other side (in the from_dlpack functions)?
No, we don't. We have seen in testing that the models seem to be OK with the conversion to 1 and 0, and they are able to interpret those values correctly. We were originally going to make that change as well (it belongs in the tensor column constructor), but during testing we saw that it was not necessary. If it ever does become necessary, we can absolutely add it in.
I'm not 100% sure these values are being interpreted as bools on the other side (vs. being treated as int8 and still functioning for model training). I'm not sure if it matters, though, and I'd rather avoid an extra conversion if we can for performance reasons. So I'm on the fence about adding the return conversion. "Let's see if it turns out to be a problem not to have it" seems reasonable, I suppose.
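For what it's worth, 0/1 int8 values do behave the same as bools in the arithmetic a model typically performs, which may be why training is unaffected. A small illustration (NumPy shown; CuPy follows the same type-promotion rules):

```python
import numpy as np

mask = np.array([True, False, True])
as_int8 = mask.astype(np.int8)
weights = np.array([0.5, 1.5, 2.5])

# masking via multiplication promotes both operands to float64 identically
assert np.allclose(weights * mask, weights * as_int8)

# reductions agree too: both count the number of True/1 entries
assert mask.sum() == as_int8.sum() == 2
```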
I think we need to specify a minimum dependency version for CuPy, and I'm not sure if we want to create a hard dependency on CUDA 11 since other toolkit versions also have CuPy 12.0.0 available. Seems like it might be better to specify the CuPy version we want as something like cupy>=12.0.0
@@ -3,4 +3,4 @@
 # cudf>=21.12
 # dask-cudf>=21.12
 # dask-cuda>=21.12
-# cupy>=7
+cupy-cuda11x
I think the version does matter here, since CuPy didn't add support for bools with DLPack until 12.0.0.
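If a minimum is pinned, an import-time guard could surface a too-old CuPy early. This is a hypothetical helper, not code from the PR; the naive tuple comparison ignores pre-release suffixes, which packaging.version.Version handles properly in real code:

```python
def meets_minimum(version: str, minimum: str = "12.0.0") -> bool:
    # naive numeric comparison; prefer packaging.version.Version in practice
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(version) >= parse(minimum)

print(meets_minimum("12.0.0"))  # True
print(meets_minimum("11.6.0"))  # False
```

In the real module this would compare cupy.__version__ against the minimum and raise a helpful error mentioning the DLPack bool requirement.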
So you end up being forced to run the plain pip install cupy-cuda11x, and you get the latest version.
I don't think we want to install cupy-cuda11x specifically, though. In the containers we do, but here the requirement is actually cupy>=12.0.0, which we will satisfy by installing cupy-cuda11x, but which would also be satisfied by cupy-cuda12x, cupy-cuda102, etc.
Okay, I'm wrong about that. pip install cupy>=12.0.0 still tries to install from source even if you have cupy-cuda*>=12.0.0. Maybe we could make the various packages satisfying the cupy dependency optional instead?
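One way to make the cupy wheels optional is a setuptools extras_require mapping, sketched below. The extra names and version pins here are illustrative assumptions, not something decided in this PR:

```python
# setup.py fragment (hypothetical): let users pick the wheel for their toolkit,
# since cupy-cuda11x, cupy-cuda12x, etc. all satisfy the same runtime need
extras_require = {
    "cuda11": ["cupy-cuda11x>=12.0.0"],
    "cuda12": ["cupy-cuda12x>=12.0.0"],
}

# a user would then install e.g. pip install "<package>[cuda11]"
print(sorted(extras_require))  # ['cuda11', 'cuda12']
```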
This PR adds a fix for TensorFlow's DLPack translation, which does not allow bool dtypes. Because of this, we have opted to cast all boolean columns to the int8 dtype, which TensorFlow does allow. It also adds the newest CuPy version to the requirements for the GPU version of core, needed to ensure the proper DLPack implementation is used in TensorTable. For reference: https://github.com/tensorflow/tensorflow/blob/a9e6bd9ca4c0dbc96a71a1331bccf39a10757148/tensorflow/c/eager/dlpack.cc#L161-L242