Last instruction not instrumented when setting instrumenting position to be IPOINT_AFTER #125

Open
William-An opened this issue Jan 30, 2024 · 1 comment

@William-An

Problem

When writing a tracer tool with NVBit, setting the instrumentation point to IPOINT_AFTER causes the reported result to miss the last instruction of every warp.

# Using provided vecadd program
# Using `IPOINT_BEFORE`
kernel 0 - _Z6vecAddPdS_S_i - #thread-blocks 98,  kernel instructions 50077, total instructions 50077
Final sum = 100000.000000; sum/n = 1.000000 (should be ~1)
Total app instructions: 50077

# Using `IPOINT_AFTER`
kernel 0 - _Z6vecAddPdS_S_i - #thread-blocks 98,  kernel instructions 46941, total instructions 46941
Final sum = 100000.000000; sum/n = 1.000000 (should be ~1)
Total app instructions: 46941

The vecadd program adds 100000 elements with a block size of 1024, which launches 98 thread blocks and therefore $98 \times 1024/32 = 3136$ warps. That is exactly the difference between the two runs ($50077 - 46941 = 3136$). Tracing with the accel-sim tracing tool shows that the missing instruction is the last instruction of every warp.
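For anyone who wants to double-check the arithmetic, here is a minimal host-only sketch. The helper names (`num_blocks`, `num_warps`) and the constants are made up for illustration and are not part of NVBit or the vecadd sample. It assumes a warp size of 32 and takes the two instruction counts from the logs above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: thread blocks needed to cover n elements (ceiling division).
constexpr std::uint64_t num_blocks(std::uint64_t n, std::uint64_t block_size) {
    return (n + block_size - 1) / block_size;
}

// Hypothetical helper: warps launched, with every block split into warps of
// warp_size threads (assumed to be 32 here).
constexpr std::uint64_t num_warps(std::uint64_t n, std::uint64_t block_size,
                                  std::uint64_t warp_size = 32) {
    return num_blocks(n, block_size) * ((block_size + warp_size - 1) / warp_size);
}

// Values copied from the issue text and the two logs.
constexpr std::uint64_t kElements    = 100000;
constexpr std::uint64_t kBlockSize   = 1024;
constexpr std::uint64_t kCountBefore = 50077;  // reported with IPOINT_BEFORE
constexpr std::uint64_t kCountAfter  = 46941;  // reported with IPOINT_AFTER

// Compile-time checks: 98 blocks, 3136 warps, and a gap of one instruction per warp.
static_assert(num_blocks(kElements, kBlockSize) == 98, "expected 98 thread blocks");
static_assert(num_warps(kElements, kBlockSize) == 3136, "expected 3136 warps");
static_assert(kCountBefore - kCountAfter == num_warps(kElements, kBlockSize),
              "gap should equal one instruction per warp");
```

If the file compiles, the three checks hold, i.e. the gap between the two counts matches exactly one instruction per launched warp.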

How to recreate

Modify the instr_count tool to instrument at IPOINT_AFTER instead of IPOINT_BEFORE.

Test environment

  1. CUDA: 11.0
  2. CUDA Driver: 530.41.03
  3. GCC: 7.5.0
  4. OS: Ubuntu 18.04.6 LTS
@William-An

It looks like a taken BRA instruction is also not handled properly when the instrumentation point is set to IPOINT_AFTER.
