cuda-memcheck --tool synccheck syncthreads
========= ERROR SUMMARY: 32 errors
32 invalid __syncthreads() calls were detected in the program (Barrier error detected. Divergent thread(s) in block).
From:
// read values and increase if not already 2
if(arr[idx] != 2) {
    local_array[threadIdx.x] = arr[idx] + 1;
    __syncthreads();
} else {
    local_array[threadIdx.x] = arr[idx];
    __syncthreads();
}
To:
// read values and increase if not already 2
if(arr[idx] != 2) {
    local_array[threadIdx.x] = arr[idx] + 1;
} else {
    local_array[threadIdx.x] = arr[idx];
}
__syncthreads();
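This change moves the barrier out of the conditional, so every thread in the block reaches the same __syncthreads() call in non-divergent code instead of separate calls in the two branches, which avoids the barrier error.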
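
For context, a minimal self-contained sketch of the corrected pattern might look like the following. The kernel name increment_unless_two, the out buffer, the bounds check, the block size of 32, and all of the host code are assumptions made for illustration. Only the if/else followed by a single __syncthreads() comes from the snippet above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 32  // assumed block size; the original does not state it

// Hypothetical kernel: only the if/else plus single barrier mirrors the fix above.
__global__ void increment_unless_two(const int *arr, int *out, int n)
{
    __shared__ int local_array[BLOCK_SIZE];
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // Divergent part: each thread writes only its own shared-memory slot.
    // Out-of-range threads store a dummy value instead of returning early,
    // so they still arrive at the barrier below.
    if (idx < n) {
        // read values and increase if not already 2
        if (arr[idx] != 2) {
            local_array[threadIdx.x] = arr[idx] + 1;
        } else {
            local_array[threadIdx.x] = arr[idx];
        }
    } else {
        local_array[threadIdx.x] = 0;
    }

    // One barrier outside the conditional: every thread in the block reaches it.
    __syncthreads();

    // After the barrier it is safe to read a slot written by another thread.
    int neighbor = (threadIdx.x + 1) % blockDim.x;
    if (idx < n) {
        out[idx] = local_array[neighbor];
    }
}

int main()
{
    const int n = BLOCK_SIZE;  // one full block, chosen for simplicity
    int h_in[BLOCK_SIZE], h_out[BLOCK_SIZE];
    for (int i = 0; i < n; ++i) h_in[i] = i % 3;  // values 0, 1, 2 repeating

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    increment_unless_two<<<1, BLOCK_SIZE>>>(d_in, d_out, n);
    cudaError_t err = cudaGetLastError();              // launch errors
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // execution errors
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Compare against a host-side reference of the same computation.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        int j = (i + 1) % n;
        int expected = (h_in[j] != 2) ? h_in[j] + 1 : h_in[j];
        if (h_out[i] != expected) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_in);
    cudaFree(d_out);
    return mismatches == 0 ? 0 : 1;
}
```

Because the barrier sits outside both the bounds check and the value check, a sketch like this should run under cuda-memcheck --tool synccheck without reporting divergent-thread barrier errors. The same rule applies to early returns: a thread that exits before the barrier while others in its block wait on it can trigger the same kind of error.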