Problem
When writing a tracer tool for NVBit, if the instrumentation point is set to `IPOINT_AFTER`, the reported result misses the last instruction of every warp.
# Using provided vecadd program
# Using `IPOINT_BEFORE`
kernel 0 - _Z6vecAddPdS_S_i - #thread-blocks 98, kernel instructions 50077, total instructions 50077
Final sum = 100000.000000; sum/n = 1.000000 (should be ~1)
Total app instructions: 50077
# Using `IPOINT_AFTER`
kernel 0 - _Z6vecAddPdS_S_i - #thread-blocks 98, kernel instructions 46941, total instructions 46941
Final sum = 100000.000000; sum/n = 1.000000 (should be ~1)
Total app instructions: 46941
The vecadd program adds 100000 elements with a block size of 1024, which creates $98 \times 1024/32 = 3136$ warps. That is exactly the difference between the two runs ($50077 - 46941 = 3136$). Tracing the same program with the Accel-Sim tracing tool shows that the missing instruction is the last instruction of every warp.
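The arithmetic above can be double-checked with a short script. This is only a sketch: the function name `warps_launched` is made up for illustration, the warp size of 32 is assumed, and the instruction counts are copied from the two runs shown earlier.

```python
import math

WARP_SIZE = 32  # assumed warp size on the tested GPU


def warps_launched(n_elements, block_size):
    """Number of warps launched when every block is a full block (hypothetical helper)."""
    blocks = math.ceil(n_elements / block_size)          # 98 blocks for 100000 elements
    warps_per_block = math.ceil(block_size / WARP_SIZE)  # 32 warps per 1024-thread block
    return blocks * warps_per_block


# Instruction counts reported by the two runs in this issue.
count_before = 50077  # IPOINT_BEFORE
count_after = 46941   # IPOINT_AFTER

warps = warps_launched(100_000, 1024)
missing = count_before - count_after

print(f"warps launched:       {warps}")
print(f"missing instructions: {missing}")
print(f"missing per warp:     {missing / warps}")
```

If the two numbers match, the gap is consistent with exactly one uncounted instruction per warp.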
How to recreate
Modify the `instr_count` tool to instrument at `IPOINT_AFTER` instead of `IPOINT_BEFORE`.
Test environment
CUDA: 11.0
CUDA Driver: 530.41.03
GCC: 7.5.0
OS: Ubuntu 18.04.6 LTS