Use pragma to disable execution checks in cuda::proclaim_return_type. #448
Thanks a lot for the contribution 🎉 I am adding some cleanup because we want to centralize a macro in our
There was a GitHub outage this morning. Some commits from @miscco are not showing in this PR. I am going to open this PR from draft state, since that fixed the problem for another PR where I saw this earlier today.
3480817 to b1deea2
@@ -394,6 +400,7 @@ __invoke(_Fp&& __f, _A0&& __a0)

// bullet 7

_LIBCUDACXX_DISABLE_EXEC_CHECK
We should be aware that this will disable execution checks for most of the library.
I am all in favor of that, but it will lead to bugs.
I was wondering about the potential for this problem. But I'm not sure what can be done about it... 🤔
Unavoidable, I'm afraid. This has been true in Thrust since time immemorial. It's nearly impossible to have generic C++ code without heavy use of this pragma.
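For readers less familiar with the pattern, here is a minimal sketch of why generic `__host__ __device__` wrappers end up needing the pragma. It is not the libcudacxx implementation: the `SKETCH_*` macros, `invoke_generic`, and `AddOne` are hypothetical names made up for illustration, and the assumption is that `_LIBCUDACXX_DISABLE_EXEC_CHECK` boils down to NVCC's `nv_exec_check_disable` pragma, as the PR title suggests. The guards let the same file build with a plain host compiler.

```cpp
// Hypothetical macros for this sketch only. The library's real macro is
// _LIBCUDACXX_DISABLE_EXEC_CHECK; its exact definition is not shown here.
#if defined(__CUDACC__)
#  define SKETCH_HOST_DEVICE __host__ __device__
#else
#  define SKETCH_HOST_DEVICE
#endif

#if defined(__CUDACC__) && !defined(__clang__)
// Assumption: under NVCC the check is silenced with this pragma.
#  define SKETCH_DISABLE_EXEC_CHECK _Pragma("nv_exec_check_disable")
#else
#  define SKETCH_DISABLE_EXEC_CHECK
#endif

// A generic wrapper cannot know whether the callable F is host-only,
// device-only, or both. Without the pragma, NVCC's execution-space check
// may complain when this __host__ __device__ template is instantiated
// with a callable that lives in only one space.
SKETCH_DISABLE_EXEC_CHECK
template <class F, class Arg>
SKETCH_HOST_DEVICE auto invoke_generic(F&& f, Arg&& arg)
    -> decltype(static_cast<F&&>(f)(static_cast<Arg&&>(arg)))
{
  // static_cast<T&&> is perfect forwarding without relying on <utility>.
  return static_cast<F&&>(f)(static_cast<Arg&&>(arg));
}

// Example callable usable from both host and device code.
struct AddOne
{
  SKETCH_HOST_DEVICE int operator()(int x) const { return x + 1; }
};
```

On the host, `invoke_generic(AddOne{}, 41)` simply forwards the call. The trade-off raised above still applies: with the check disabled, a genuine host/device mismatch inside the wrapper is no longer diagnosed at the call site.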
…to fix problem with nested device lambdas.
Also cleanup the tests a bit
b1deea2 to 6912090
@bdice Thanks a lot for the contribution! I have rebased the PR on latest master to resolve some minor merge conflicts.
@miscco Thank you very much!
Use pragma suggested by @jrhemstad to disable execution checks in cuda::proclaim_return_type. This fixes a problem with nested device lambdas. Resolves #447.
I'm not familiar with the layout of libcudacxx so I'm not sure what kind of test is appropriate to add here.
There is a minimal reproducer at https://godbolt.org/z/oaMhrcoPv, also noted in issue #447. I tested this locally with a more complex source file from libcudf (see here) and it compiled successfully after applying this patch.