Fix triangular solvers on Windows CUDA #1665
I was wondering whether Jacobi using a type whose size is not divisible by 4 for shared memory might lead to an issue here, but Jacobi does not seem to use it.
acfe081 to 5fb42a1
I would like to have the is_nan_exact function moved to the TRS kernel file to avoid future misuse.
- return std::isnan(value);
+ using std::isnan;
+ return isnan(value);
Is std::isnan(value) different from the updated one?
I don't think so, but this has a better chance of using the CUDA isnan function.
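For reference, the pattern in the updated lines is the usual two-step lookup idiom: a using-declaration followed by an unqualified call, so that overloads found through argument-dependent lookup can take part in overload resolution. The following is a minimal host-only C++ sketch of that idiom. The namespace sketch, the type wrapped and the function is_nan_generic are hypothetical names for illustration and are not part of this PR. Whether the idiom changes which overload a CUDA compiler picks for built-in floating-point types is not something this sketch demonstrates.

```cpp
#include <cmath>

namespace sketch {

// Hypothetical custom scalar type, standing in for a type that ships its
// own isnan overload in its own namespace.
struct wrapped {
    double v;
};

// Overload that argument-dependent lookup finds for sketch::wrapped.
inline bool isnan(wrapped x) { return std::isnan(x.v); }

}  // namespace sketch

// Two-step idiom: make std::isnan visible, then call unqualified so that
// overloads found through ADL are considered as well.
template <typename T>
bool is_nan_generic(T value)
{
    using std::isnan;
    return isnan(value);
}
```

With this sketch, is_nan_generic(1.0) resolves to std::isnan, while is_nan_generic(sketch::wrapped{...}) resolves to sketch::isnan. A hard-coded std::isnan(value) would not compile for the wrapped type.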
Unfortunately, it still doesn't work with MSVC, which means I'll have to file a bug report some time ;)
Thank you for applying my suggestions.
I only have a minor nit regarding the documentation; the rest looks good!
MSVC doesn't treat the is_nan properly, so we do a byte comparison instead
- move is_nan_exact to internal code
- rename to float_to_uint_impl
- increase tolerance for additional test
For things to work on Windows, the shared libraries need to be in the working directory or on the PATH
Co-authored-by: Thomas Grützmacher <thomas.gruetzmacher@tum.de>
4734f20 to 95fe3f4
The alignment of uninitialized_array is currently only 1, since it is based on unsigned char, and the whole setup screams undefined behavior to me, so this makes it much safer for our current use cases (since casting between complex and real types is well-defined).

Additionally, MSVC doesn't properly find isnan as a device function, so we need to add our own implementation.
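To make the alignment point concrete, here is a small host-side sketch of raw byte storage with and without an explicit alignment. The struct names are hypothetical and this is not Ginkgo's uninitialized_array; it only shows why a plain unsigned char buffer guarantees alignment 1 and how alignas raises that to the alignment of the element type.

```cpp
#include <cstddef>

// Hypothetical stand-in: raw byte storage for `size` objects of type T
// that only guarantees the alignment of unsigned char, which is 1.
template <typename T, std::size_t size>
struct unaligned_raw_storage {
    unsigned char data[sizeof(T) * size];
};

// Same bytes, but the buffer is aligned as strictly as T requires, so a
// T object can legitimately be constructed at the start of `data`.
template <typename T, std::size_t size>
struct aligned_raw_storage {
    alignas(T) unsigned char data[sizeof(T) * size];
};
```

For double, alignof(unaligned_raw_storage<double, 4>) is 1, whereas alignof(aligned_raw_storage<double, 4>) equals alignof(double). Both occupy 4 * sizeof(double) bytes.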