Open
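See #12.
The issue may be that the code copying each thread's max values to the output arrays was outside the if (threadIndex < threads) check, so threads past the end of the range also wrote to the output.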
https://github.com/ObrienlabsDev/performance/blob/main/gpu/nvidia/cuda/cpp/128bit/collatz_cuda/kernel_collatz.cu#L100
if (threadIndex < threads) {
    ...
    } while (!((current0 == 1ULL) && (current1 == 0ULL)));
    // move max copy inside the thread if check (to avoid concurrency issues)
    _output0[threadIndex] = max0;
    _output1[threadIndex] = max1;
}
- _output0[threadIndex] = max0;
- _output1[threadIndex] = max1;