As far as I understand, the SASS instruction "EXIT" means the thread has finished executing the kernel.
In the Accel-Sim tracer we rely on this: when we reach an EXIT instruction with a full active-and-predicate mask, we stop tracing and assume the threads have finished execution. This also matches the observation that no instructions are traced after it.
For example:
PC    Active&Predicate  Inst
02b0  00000000          EXIT  => Do not exit yet, because the mask is all zeros.
4250  ffffffff          EXIT  => Do exit, because the mask is full. No instructions are traced after this.
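The rule described above can be written down as a small predicate. This is only a sketch of the assumption as stated in this issue, not the tracer's actual code; the function name `is_terminating_exit` and the hex-string mask representation are hypothetical and chosen to match the example rows.

```python
FULL_MASK = 0xFFFFFFFF  # all 32 lanes of a warp set


def is_terminating_exit(opcode: str, mask_hex: str) -> bool:
    """Return True when an EXIT is executed with a full active-and-predicate mask.

    Hypothetical helper: it mirrors the assumption in this issue,
    not the real Accel-Sim tracer implementation.
    """
    return opcode == "EXIT" and int(mask_hex, 16) == FULL_MASK


# The two rows from the example above.
print(is_terminating_exit("EXIT", "00000000"))  # mask is all zeros
print(is_terminating_exit("EXIT", "ffffffff"))  # mask is full
```

Under this rule the first row is ignored and the second row ends the trace for that warp.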
We have traced hundreds of workloads, and this assumption has held without any issues. However, when we traced
NVIDIA's cuDF library for data analytics, we found a scenario that does not match our assumption.
The trace output is:
As you can see, EXIT has an active-and-predicate mask of all ones, yet more instructions follow it and the warp has not finished.
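One way to locate such cases in a trace is to scan each warp's records and report anything that follows the first full-mask EXIT. The sketch below assumes a simplified, hypothetical record layout of `(pc, mask, opcode)` tuples for a single warp in trace order; it is not the Accel-Sim trace format, and the sample records are synthetic, not taken from the cuDF trace.

```python
FULL_MASK = 0xFFFFFFFF  # all 32 lanes of a warp set


def instructions_after_full_exit(records):
    """Return the records that appear after the first full-mask EXIT.

    `records` is an iterable of (pc_hex, mask_hex, opcode) tuples for ONE
    warp, in trace order. This layout is an assumption for illustration.
    """
    seen_full_exit = False
    trailing = []
    for pc, mask, opcode in records:
        if seen_full_exit:
            # Anything traced here contradicts the "EXIT ends the warp" assumption.
            trailing.append((pc, mask, opcode))
        elif opcode == "EXIT" and int(mask, 16) == FULL_MASK:
            seen_full_exit = True
    return trailing


# Synthetic records only, made up to exercise the function.
synthetic = [
    ("0000", "ffffffff", "MOV"),
    ("0010", "ffffffff", "EXIT"),
    ("0020", "ffffffff", "MOV"),
]
print(instructions_after_full_exit(synthetic))
```

An empty result means the assumption held for that warp; a non-empty result lists the instructions that were traced after the EXIT.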
The CUDA kernel that is traced can be found here:
https://github.com/rapidsai/cudf/blob/c69b6f82adaa821c5201055ce3bd1672978b5704/cpp/src/io/parquet/page_data.cu#L1650
The NVBit-based Accel-Sim tracer takes the predicate mask into account, as shown here:
https://github.com/accel-sim/accel-sim-framework/blob/4c2bf09a79d6b57bb10fe1898700930a5dd5531f/util/tracer_nvbit/tracer_tool/tracer_tool.cu#L529
Could you help with this, please? Is our assumption about the EXIT instruction correct?
Thanks!