[Bugfix] Fix __syncwarp on ROCM
#25996
OK, given that this fixes the AMD build. Note that without this op, accuracy drops on NVIDIA.
AMD GPUs do not support __syncwarp. To support this properly on AMD, the first thing to try would be __syncthreads, but that is better done as part of actually adding dsv32 support on AMD. So this fix should be good for now.
checking the failing CIs:
@simon-mo could you help force merge this PR? Thanks!
Signed-off-by: simon-mo <simon.mo@hey.com>
This is fine as a temporary fix.
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Purpose
We are seeing failures on AMD because __syncwarp is available only in CUDA, not in HIP.
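As a rough illustration of the kind of guard involved (a sketch, not necessarily the exact change in this PR): the warp-level sync is kept for CUDA builds and compiled out for ROCm builds. `SKETCH_SYNCWARP`, `SKETCH_DEVICE`, and `publish_then_sync` are hypothetical names used only for this example, and `USE_ROCM` is assumed to be the define set for HIP builds. On AMD this simply drops the sync, which the review comments treat as acceptable for now.

```cpp
// Sketch only. SKETCH_* names and publish_then_sync are hypothetical;
// USE_ROCM is assumed to be defined for HIP/ROCm builds.
#if defined(USE_ROCM) || defined(__HIP_PLATFORM_AMD__)
  // HIP does not provide __syncwarp, so the call is compiled out on AMD.
  #define SKETCH_SYNCWARP() ((void)0)
#elif defined(__CUDACC__)
  // CUDA build: keep the real warp-level barrier.
  #define SKETCH_SYNCWARP() __syncwarp()
#else
  // Plain host compile (no GPU toolchain): nothing to synchronize.
  #define SKETCH_SYNCWARP() ((void)0)
#endif

#if defined(__CUDACC__) || defined(__HIPCC__)
  #define SKETCH_DEVICE __device__ __forceinline__
#else
  #define SKETCH_DEVICE inline
#endif

// Each lane writes its value, then the warp synchronizes (on CUDA only)
// before other lanes would read the buffer.
SKETCH_DEVICE void publish_then_sync(float* buf, int lane, float value) {
  buf[lane] = value;
  SKETCH_SYNCWARP();
}
```

Kernel code would call `SKETCH_SYNCWARP()` wherever it previously called `__syncwarp()` directly, so the NVIDIA path is unchanged.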
Test Plan
CI