[DeviceSanitizer] Fix interceptor destruction order #1879
pbalcer
left a comment
This has bitten us a couple of times now. Can you please add some tests in UR to verify that the sanitizer at least initializes, parses options, and is destroyed correctly?
Our CI infra now has PVC systems (the label is "L0_E2E"), so you could even add tests for the core functionality, but for now it would be good to cover just the general init/teardown flow.
if (result == UR_RESULT_SUCCESS) {
    const uint32_t NumAdapters = pNumAdapters ? *pNumAdapters : NumEntries;
    for (uint32_t i = 0; i < NumAdapters; ++i) {
        UR_CALL(getContext()->interceptor->holdAdapter(phAdapters[i]));
If urAdapterGet() gets called multiple times, we would end up with duplicated handles.
Maybe we don't need to hold the adapter handle; it seems the UR loader already holds it.
I'll investigate this.
Every time you call urAdapterGet, an internal reference count is incremented, and you need to decrement it using urAdapterRelease.
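The get/release pairing described in this comment can be illustrated with a small self-contained model. This is only a sketch: `MockAdapter`, `mockAdapterGet`, `mockAdapterRelease`, and `AdapterRef` are hypothetical names invented for illustration and are not part of the Unified Runtime API. It assumes nothing beyond "each get increments a count, each release decrements it".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an adapter with an internal reference count.
struct MockAdapter {
    uint32_t RefCount = 0;
};

// Models the behaviour described above: every "get" bumps the count.
inline MockAdapter *mockAdapterGet(MockAdapter &A) {
    ++A.RefCount;
    return &A;
}

// Models the matching release; returns the remaining count.
inline uint32_t mockAdapterRelease(MockAdapter *A) {
    assert(A->RefCount > 0 && "release without a matching get");
    return --A->RefCount;
}

// RAII guard (an assumption, not the PR's code) that pairs one get
// with exactly one release, so the count cannot leak.
class AdapterRef {
  public:
    explicit AdapterRef(MockAdapter &A) : Handle(mockAdapterGet(A)) {}
    ~AdapterRef() { mockAdapterRelease(Handle); }
    AdapterRef(const AdapterRef &) = delete;
    AdapterRef &operator=(const AdapterRef &) = delete;

  private:
    MockAdapter *Handle;
};
```

With this model, two live `AdapterRef` objects on the same adapter leave the count at 2, and the count returns to 0 once both go out of scope.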
fixed
Sorry for the delay! I will add some tests in this PR.
unified-runtime/source/loader/ur_lib.hpp
I found where this logic is.
Makes sense, please feel free to make that change.
@pbalcer I think this fix is also needed for our next release. Could you help add a label for it?
LLVM test: intel/llvm#14963
@AllanZyne at this point things will need to be cherry-picked into the release branch, https://github.com/oneapi-src/unified-runtime/tree/v0.10.x. No need to do that now though. Just make sure all the PRs that you want included are tagged with
OK, thanks for the information. I've checked the CI tests in UR and LLVM; they seem unrelated to this PR.
yingcong-wu
left a comment
LGTM other than a commenting problem.
zhaomaosu
left a comment
lgtm
Hi @oneapi-src/unified-runtime-maintain, since this PR is an essential bugfix for the next release, could you help review and merge it as soon as possible?
aarongreig
left a comment
LGTM pending comments
Hi @oneapi-src/unified-runtime-maintain, should I cherry-pick this PR myself?
The bump for this was included in intel/llvm#15101 which is already on our list to cherry-pick |
[DeviceSanitizer] Fix interceptor destruction order
The SYCL runtime releases adapters earlier than the loader, so we hold them in the interceptor to prevent a crash.
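The lifetime problem in the commit message can be sketched with a toy model. Everything here is an assumption made for illustration: `FakeAdapter`, `fakeRetain`, `fakeRelease`, and `SketchInterceptor` are hypothetical, and only the method name `holdAdapter` is borrowed from the diff under review. Holding each handle at most once in a set is one possible answer to the duplicated-handle concern raised in review, not necessarily what the PR does.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical adapter: starts with one reference owned by the runtime.
struct FakeAdapter {
    uint32_t RefCount = 1;
    bool Alive = true;
};

inline void fakeRetain(FakeAdapter *A) { ++A->RefCount; }

// Dropping the last reference "destroys" the adapter.
inline void fakeRelease(FakeAdapter *A) {
    if (--A->RefCount == 0) {
        A->Alive = false;
    }
}

// Toy interceptor that keeps adapters alive until its own teardown.
class SketchInterceptor {
  public:
    explicit SketchInterceptor(bool &AliveAtTeardown)
        : AliveAtTeardown(AliveAtTeardown) {}

    // Retain each adapter at most once, even if it is seen repeatedly.
    void holdAdapter(FakeAdapter *A) {
        if (Held.insert(A).second) {
            fakeRetain(A);
        }
    }

    // Record whether every held adapter was still alive at teardown,
    // then drop the interceptor's own references.
    ~SketchInterceptor() {
        AliveAtTeardown = true;
        for (FakeAdapter *A : Held) {
            AliveAtTeardown = AliveAtTeardown && A->Alive;
            fakeRelease(A);
        }
    }

  private:
    std::unordered_set<FakeAdapter *> Held;
    bool &AliveAtTeardown;
};
```

In this model, the runtime can release its reference early while the interceptor still exists, and the adapter is only destroyed once the interceptor's destructor has run.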