[HIP] Fix host/device synchronization #1001
As a sanity check, do you mind running the SuccessSynchronizedTime test in a loop 1000 times or so? On CUDA (at least on my machine), this test was unstable (it sometimes passes and sometimes fails, with a large variance in the result). I struggle to understand how moving the hostTimestamp code solves the issue. Maybe it is the fact that cuEventSynchronize is called right after cuEventRecord?
Yep, you're right, this test does fail sometimes after multiple repeated calls, so I have made it optional again. However, the reason we take the host time last is that fetching the device time takes more time than fetching the host time. If we get the host time first, then we have to wait for HIP to sync the streams and calculate the time, by which point the host timestamp is already stale.
Yeah, I think it makes sense for the time to be a bit more synchronized this way. It just didn't completely justify the variance that I saw on CUDA. I think there is something else going on that makes this entrypoint unreliable. My guess is that it's related to the operating system. Anyway, this looks good to me now 👍
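To illustrate the ordering argument above, here is a minimal host-only sketch. It is not the adapter code from this PR: `slowDeviceTimestamp`, `hostFirst`, `hostLast`, and `skewMicros` are hypothetical names, and the device query (event record plus synchronize) is modelled by a plain sleep on the host clock. It only shows why reading the host clock after the slow device query leaves a smaller gap between the two timestamps.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for the device timestamp query. The real entry point
// records an event and synchronizes on it; here the latency is modelled with
// a sleep, and the "device" time is the host clock at completion.
static Clock::time_point slowDeviceTimestamp(std::chrono::milliseconds latency) {
    std::this_thread::sleep_for(latency);
    return Clock::now();
}

struct Timestamps {
    Clock::time_point device;
    Clock::time_point host;
};

// Host clock is read first, so it is already stale once the device query returns.
static Timestamps hostFirst(std::chrono::milliseconds latency) {
    Clock::time_point host = Clock::now();
    Clock::time_point device = slowDeviceTimestamp(latency);
    return {device, host};
}

// Host clock is read last, right after the slow device query completes.
static Timestamps hostLast(std::chrono::milliseconds latency) {
    Clock::time_point device = slowDeviceTimestamp(latency);
    Clock::time_point host = Clock::now();
    return {device, host};
}

// Absolute difference between the two timestamps, in microseconds.
static std::int64_t skewMicros(const Timestamps &t) {
    auto diff = t.device > t.host ? t.device - t.host : t.host - t.device;
    return std::chrono::duration_cast<std::chrono::microseconds>(diff).count();
}
```

Calling `skewMicros(hostFirst(latency))` yields at least the modelled latency, while `skewMicros(hostLast(latency))` is normally only the cost of one clock read. As noted in the comments above, this ordering narrows the gap but does not by itself explain the variance seen on CUDA.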
Note that for CUDA we moved the |
Thanks @npmiller - I won't make this change as part of this PR, but I've created #1007 to make the change |
@oneapi-src/unified-runtime-hip-write can you please review? |
Covered in #1030 |
This PR fixes the entry-point urDeviceGetGlobalTimestamp:
EvBase event when getting platforms
llvm-testing: intel/llvm#11669