Instructions after Exit termination #39

Closed
mkhairy opened this issue Mar 3, 2021 · 3 comments

mkhairy commented Mar 3, 2021

Hello,

As far as I understand, the SASS instruction "EXIT" means the thread has finished executing the kernel.
The Accel-sim tracer relies on this: when we reach an EXIT instruction with a full active-and-predicate mask, we stop tracing the warp and assume its threads have finished execution. This matches our observation that no instructions are traced after such an EXIT.
For example:

PC Active&Predicate Inst
02b0 00000000 EXIT => Do not exit yet, because the mask is all zeros.
4250 ffffffff EXIT => Do exit, because the mask is full. We see no instructions after this.
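
The termination rule described above can be written down as a small check. This is only a minimal sketch of the assumption stated in this post, not the tracer's actual code: the names `FULL_MASK`, `effective_mask`, and `warp_terminates_on_exit` are hypothetical, and a 32-thread warp is assumed. The trace shows only the combined Active&Predicate value, so the split into two separate masks in the example calls is illustrative.

```python
FULL_MASK = 0xFFFFFFFF  # one bit per thread, assuming a 32-thread warp


def effective_mask(active_mask, predicate_mask):
    """Threads that are both active and have a true guard predicate."""
    return active_mask & predicate_mask & FULL_MASK


def warp_terminates_on_exit(opcode, active_mask, predicate_mask):
    """Rule assumed in this post: a warp is done at an EXIT whose
    combined active-and-predicate mask is all ones."""
    return opcode == "EXIT" and effective_mask(active_mask, predicate_mask) == FULL_MASK


# Hypothetical inputs mirroring the two example rows:
# combined mask 00000000 -> keep tracing; combined mask ffffffff -> stop.
print(warp_terminates_on_exit("EXIT", 0xFFFFFFFF, 0x00000000))
print(warp_terminates_on_exit("EXIT", 0xFFFFFFFF, 0xFFFFFFFF))
```

If either input mask is captured incorrectly, this rule can fire at the wrong EXIT, so the check is only as reliable as the mask values fed into it.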

We have traced hundreds of workloads, and our assumption has held without any issues. However, when we traced
the NVIDIA cuDF library for data analytics, we found a scenario that does not match our assumption.
The trace output is:

4230 ffffffff 0 ISETP.GE.AND 2 R5 R4 0 
4240 ffffffff 0 ISETP.GE.AND 2 R7 R2 0 
4250 ffffffff 0 EXIT 0 0 
4260 ffffffff 1 R20 IMAD.MOV.U32 2 R255 R255 0 
4270 ffffffff 0 ISETP.NE.AND 2 R31 R255 0 
4280 ffffffff 0 BSSY 0 0 
......

As you can see, the EXIT at 4250 has an active-and-predicate mask of all ones, yet more instructions follow it and the warp has not finished.
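
One way to spot this pattern automatically is to scan a warp's trace for anything that follows a full-mask EXIT. The sketch below assumes the field layout suggested by the trace lines in this post (PC, mask, destination count, destination registers, opcode, and so on). That layout is inferred from this sample only and may not match every tracer version. `parse_line` and `instructions_after_exit` are hypothetical helper names.

```python
FULL_MASK = 0xFFFFFFFF


def parse_line(line):
    """Split one trace line into (pc, mask, opcode).

    Assumed layout: PC, mask, number of destination registers,
    the destination registers themselves, then the opcode.
    """
    fields = line.split()
    pc = int(fields[0], 16)
    mask = int(fields[1], 16)
    num_dest = int(fields[2])
    opcode = fields[3 + num_dest]
    return pc, mask, opcode


def instructions_after_exit(lines):
    """Return the instructions traced after the first full-mask EXIT."""
    parsed = [parse_line(line) for line in lines if line.strip()]
    for i, (pc, mask, opcode) in enumerate(parsed):
        if opcode == "EXIT" and mask == FULL_MASK:
            return parsed[i + 1:]
    return []


# The trace lines from this post (the elided "......" line is left out).
trace = [
    "4230 ffffffff 0 ISETP.GE.AND 2 R5 R4 0",
    "4240 ffffffff 0 ISETP.GE.AND 2 R7 R2 0",
    "4250 ffffffff 0 EXIT 0 0",
    "4260 ffffffff 1 R20 IMAD.MOV.U32 2 R255 R255 0",
    "4270 ffffffff 0 ISETP.NE.AND 2 R31 R255 0",
    "4280 ffffffff 0 BSSY 0 0",
]

for pc, mask, opcode in instructions_after_exit(trace):
    print(f"{pc:04x} {mask:08x} {opcode}")
```

A non-empty result means the "full mask at EXIT implies the warp is done" assumption was violated for that warp, or that the recorded mask is wrong.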

The CUDA kernel that is traced can be found here:
https://github.com/rapidsai/cudf/blob/c69b6f82adaa821c5201055ce3bd1672978b5704/cpp/src/io/parquet/page_data.cu#L1650

The NVBit-based Accel-sim tracer takes the predicate mask into account, as shown here:
https://github.com/accel-sim/accel-sim-framework/blob/4c2bf09a79d6b57bb10fe1898700930a5dd5531f/util/tracer_nvbit/tracer_tool/tracer_tool.cu#L529

Can anyone help with this? Is our assumption about the EXIT instruction correct?

Thanks!

Collaborator

x-y-z commented Mar 4, 2021

Are you sure you are getting the right active_mask and predicate_mask values for the instruction?

I see your code is using the deprecated __ballot [1].

[1] https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#warp-vote-functions

Author

mkhairy commented Mar 4, 2021

OK, thanks! I got it. Let me fix this and see how it goes.

mkhairy closed this as completed Jun 14, 2021

christindbose commented Nov 10, 2023

Hi @x-y-z,

[Attached screenshot: exit_instr]

I'm facing a similar-looking error in one of my traces. Any idea what could cause it? I've checked that I'm using the new __ballot_sync.
