Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Evaluate CUDA_CUB_RET_IF_FAIL macro argument only once #1264

Merged
merged 2 commits into from
Aug 28, 2020

Conversation

jlowe
Copy link
Member

@jlowe jlowe commented Aug 26, 2020

This updates the CUDA_CUB_RET_IF_FAIL macro to only evaluate its argument once. Without this, it can silently suppress CUDA errors in many places in the Thrust code that call it like this: CUDA_CUB_RET_IF_FAIL(cudaPeekAtLastError());.

The macro calls cub::Debug which in some versions of cub will unconditionally clear the CUDA error. Therefore when it evaluates cudaPeekAtLastError() the second time as the return argument, it will return cudaSuccess because the last error was just cleared. Thus any pending CUDA error before this macro was called with cudaPeekAtLastErrror() as the argument will be silently dropped.

Copy link
Collaborator

@alliepiper alliepiper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, aside from a minor nitpick. Can you update this patch to remove the do/while no-op? I'll start our CI afterwards.

do { \
auto const error = (e); \
if (cub::Debug(error, __FILE__, __LINE__)) return error; \
} while(0);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: The do..while isn't necessary here, this can just be

{ \
    auto const error = (e);     \
    if (cub::Debug(error, __FILE__, __LINE__)) \
      return error; \
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allisonvacanti There are cases where using a scope block causes issues, please see https://stackoverflow.com/a/1067238

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For that to work, you can't have the semicolon after the while (0). With that semicolon there, it has the same problem as if it were a block scope.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To update this thread with a slack convo, we don't want that behavior here since the old version didn't require a trailing semicolon. Using the do/while(0) pattern (without the semicolon) risks breaking other code.

@alliepiper
Copy link
Collaborator

Started CI under DVS CL 28995185.

@alliepiper alliepiper added the testing: internal ci in progress Currently testing on internal NVIDIA CI (DVS). label Aug 26, 2020
@alliepiper alliepiper self-assigned this Aug 26, 2020
@alliepiper
Copy link
Collaborator

One DVS builder failed after running out of disk space, but the rest look good.

Just waiting for my local testing script to finish up and then this should be good to go.

@alliepiper alliepiper added testing: internal ci passed Passed internal NVIDIA CI (DVS). testing: gpuCI passed Passed gpuCI testing. and removed testing: internal ci in progress Currently testing on internal NVIDIA CI (DVS). labels Aug 27, 2020
@alliepiper
Copy link
Collaborator

All set.

@alliepiper alliepiper merged commit a0948e3 into NVIDIA:main Aug 28, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
testing: gpuCI passed Passed gpuCI testing. testing: internal ci passed Passed internal NVIDIA CI (DVS).
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants