[UR] [L0 v2] Enable wait lists and signal events for command buffer in L0 adapter v2 #18456
To clarify: the functionality this PR adds, exposing wait and signal events, isn't exposed in SYCL yet either. It's not only the updating of commands' signal/wait events that isn't exposed.
@@ -232,6 +237,14 @@ ur_result_t urEventRelease(ur_event_handle_t hEvent) try {
ur_result_t urEventWait(uint32_t numEvents,
                        const ur_event_handle_t *phEventWaitList) try {
  for (uint32_t i = 0; i < numEvents; ++i) {
    if (!phEventWaitList[i]->getIsEventInUse()) {
      // TODO: This is a workaround for the underlying inconsistency
Repeating comment from the original PR: can't we manually signal the events to put them in a proper state?
No, it's not possible to signal or reset counter-based events from the host. They also need to have been used as part of another command append before they are usable.
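To illustrate the constraint described in this reply, here is a minimal sketch, not the adapter's actual code. `SketchEvent`, `attachToAppendedCommand`, `hostSynchronize` and `waitOnUsableEvents` are hypothetical names, and the choice to skip events that were never part of a command append is an assumption about what the workaround in the hunk above does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an adapter event; not the real ur_event_handle_t.
struct SketchEvent {
  bool inUse = false;     // set once the event is attached to an appended command
  bool completed = false; // set when the simulated wait finishes

  // Simulates a command append that will signal this event.
  void attachToAppendedCommand() { inUse = true; }

  // Simulates a host-side wait; only valid for events that are in use.
  void hostSynchronize() {
    assert(inUse && "waiting on a never-appended event is not supported");
    completed = true;
  }
};

// Waits on every in-use event and returns how many were actually waited on.
// Assumption: events that were never appended are skipped rather than waited on.
inline uint32_t waitOnUsableEvents(const std::vector<SketchEvent *> &waitList) {
  uint32_t waited = 0;
  for (SketchEvent *event : waitList) {
    if (!event->inUse) {
      continue; // never part of a command append, so there is nothing to wait for
    }
    event->hostSynchronize();
    ++waited;
  }
  return waited;
}
```

In this model the host never changes the state of an unused event; the only way to make it usable is to attach it to a command append first.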
PARAMETERIZATION(write_2d_3d, 256, 1024, (ur_rect_offset_t{1, 2, 0}),
                 (ur_rect_offset_t{4, 1, 3}), (ur_rect_region_t{4, 16, 1}), 8,
                 256, 8, 256);
// PARAMETERIZATION(write_2d_3d, 256, 1024, (ur_rect_offset_t{1, 2, 0}),
Why is this test commented out? Is there an issue to track this?
This test started failing recently and is part of a larger problem (#17187).
@@ -912,6 +913,7 @@ ur_result_t ur_queue_immediate_in_order_t::enqueueCommandBufferExp(
      1, &commandBufferCommandList, phEvent, numEventsInWaitList,
      phEventWaitList, UR_COMMAND_ENQUEUE_COMMAND_BUFFER_EXP, executionEvent));
  UR_CALL(hCommandBuffer->registerExecutionEventUnlocked(*phEvent));
  hCommandBuffer->enableEvents();
Iterating through all the events when enqueueing might have a performance cost the first time the command buffer is enqueued. My understanding is that this is supposed to be temporary and will be removed in the future? Can we add a TODO here that mentions that?
Yes, this fix is supposed to be temporary, lasting until the driver team takes care of it. Sure, I will add that TODO.
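For reference, one way the one-time cost discussed in this thread could look, as a rough sketch only. `SketchCommandBuffer`, `BufferEvent` and the `eventsEnabled` guard are hypothetical and do not come from the adapter; the sketch assumes the full walk over the events is needed only on the first enqueue.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical event type; it only tracks whether it has been enabled.
struct BufferEvent {
  bool enabled = false;
};

// Hypothetical command buffer, not the adapter's real class.
class SketchCommandBuffer {
public:
  explicit SketchCommandBuffer(std::size_t numEvents) : events(numEvents) {}

  // TODO: temporary workaround; remove once the driver handles this itself.
  // Walks all events on the first call only; later calls return immediately.
  void enableEvents() {
    if (eventsEnabled) {
      return;
    }
    for (BufferEvent &event : events) {
      event.enabled = true;
      ++eventsVisited;
    }
    eventsEnabled = true;
  }

  // Number of events touched so far (instrumentation for this sketch only).
  std::size_t visitedCount() const { return eventsVisited; }

  // True when every event in the buffer has been enabled.
  bool allEnabled() const {
    for (const BufferEvent &event : events) {
      if (!event.enabled) {
        return false;
      }
    }
    return true;
  }

private:
  std::vector<BufferEvent> events;
  bool eventsEnabled = false;
  std::size_t eventsVisited = 0;
};
```

With a guard like this, the second and later enqueues pay for a single branch rather than a full pass over the events.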
uint32_t numWaitEventsEnabled = 0;
if (isImmediateCommandList) {
  for (uint32_t i = 0; i < numWaitEvents; i++) {
We should avoid doing that on each getWaitListView - this might impact performance. I'm wondering if it wouldn't be better to just always use regular events for command buffers... We wouldn't need that isInUse() workaround at all then.
The feature this PR is adding isn't being exposed in SYCL-Graph right now. Always using regular events rather than counter-based events for command buffers would compromise the performance of applications for all SYCL-Graph usage today. Even once it is exposed, it'll be a more niche use case, which seems a bad tradeoff to limit the more general performance for.
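The per-call cost raised in this thread can be pictured with a small sketch. It models only the immediate-command-list branch from the hunk above, and `WaitEvent` and `filterEnabledWaitEvents` are hypothetical names rather than the adapter's types; the filtering criterion (keep only in-use events) is an assumption.

```cpp
#include <vector>

// Hypothetical wait-list entry; not the adapter's real event type.
struct WaitEvent {
  int id = 0;
  bool inUse = false;
};

// Returns only the in-use events, preserving their original order.
// The loop is linear in the wait-list length and, in this sketch,
// runs every time a wait-list view is requested.
inline std::vector<const WaitEvent *>
filterEnabledWaitEvents(const std::vector<const WaitEvent *> &waitEvents) {
  std::vector<const WaitEvent *> enabled;
  enabled.reserve(waitEvents.size());
  for (const WaitEvent *event : waitEvents) {
    if (event->inUse) {
      enabled.push_back(event);
    }
  }
  return enabled;
}
```

The tradeoff debated here is between paying this linear pass on each request and giving up counter-based events for command buffers altogether.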
Truncated version of PR #18442. The mutable signal event and wait list functionality has been removed, since it is not used by SYCL and does not seem to be working properly.