[UR] [L0 v2] Enable wait lists and signal events for command buffer in L0 adapter v2 #18456

Open
wants to merge 8 commits into
base: sycl

Conversation

Xewar313
Contributor

Truncated PR #18442. The mutable signal event and wait list functionality has been removed, since it is not used by SYCL and does not seem to be working properly.

@Xewar313 Xewar313 requested review from a team as code owners May 14, 2025 08:45
@Xewar313 Xewar313 requested a review from fabiomestre May 14, 2025 08:45
@Xewar313 Xewar313 force-pushed the enable-waitlist-and-signal-bare branch from 741adb4 to 5a2beb0 Compare May 14, 2025 09:20
@EwanC
Contributor

EwanC commented May 14, 2025

> Truncated PR #18442. The mutable signal event and wait list functionality has been removed, since it is not used by SYCL and does not seem to be working properly.

To clarify: the functionality this PR adds, exposing wait and signal events, isn't exposed in SYCL yet either. It's not only the updating of commands' signal/wait events that isn't exposed.

@@ -232,6 +237,14 @@ ur_result_t urEventRelease(ur_event_handle_t hEvent) try {
ur_result_t urEventWait(uint32_t numEvents,
                        const ur_event_handle_t *phEventWaitList) try {
  for (uint32_t i = 0; i < numEvents; ++i) {
    if (!phEventWaitList[i]->getIsEventInUse()) {
      // TODO: This is a workaround for the underlying inconsistency
Member

Repeating comment from the original PR: can't we manually signal the events to put them in a proper state?

Contributor

No, it's not possible to signal or reset counter-based events from the host. They also need to have been used in an earlier command append before they are usable.
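A minimal sketch of the skip-if-unused idea behind the `getIsEventInUse()` check in the hunk above. `MockEvent`, `hostSynchronize`, and `waitOnUsedEvents` are hypothetical names invented for illustration; they are not the adapter's real types or functions, and the real `urEventWait` may differ in every detail.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the adapter's event type (assumption, not the UR class).
struct MockEvent {
  bool inUse = false; // set once the event has been part of a command append
  int hostWaits = 0;  // counts how often the host waited on this event

  bool getIsEventInUse() const { return inUse; }
  void hostSynchronize() { ++hostWaits; } // placeholder for a real host-side wait
};

// Waits on every event that has been used by a command append and skips the
// rest. Returns the number of events actually waited on.
uint32_t waitOnUsedEvents(const std::vector<MockEvent *> &events) {
  uint32_t waited = 0;
  for (MockEvent *event : events) {
    assert(event != nullptr);
    if (!event->getIsEventInUse()) {
      continue; // never appended, so there is nothing to wait for in this sketch
    }
    event->hostSynchronize();
    ++waited;
  }
  return waited;
}
```

The sketch only models the control flow: because such an event cannot be put into a usable state from the host, the wait loop has to skip it rather than repair it.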

PARAMETERIZATION(write_2d_3d, 256, 1024, (ur_rect_offset_t{1, 2, 0}),
                 (ur_rect_offset_t{4, 1, 3}), (ur_rect_region_t{4, 16, 1}), 8,
                 256, 8, 256);
// PARAMETERIZATION(write_2d_3d, 256, 1024, (ur_rect_offset_t{1, 2, 0}),
Contributor

Why is this test commented out? Is there an issue to track this?

Contributor Author

This test started to fail recently and is part of a larger problem (#17187).

@@ -912,6 +913,7 @@ ur_result_t ur_queue_immediate_in_order_t::enqueueCommandBufferExp(
      1, &commandBufferCommandList, phEvent, numEventsInWaitList,
      phEventWaitList, UR_COMMAND_ENQUEUE_COMMAND_BUFFER_EXP, executionEvent));
  UR_CALL(hCommandBuffer->registerExecutionEventUnlocked(*phEvent));
  hCommandBuffer->enableEvents();
Contributor

Having to iterate through all the events when enqueueing might have a performance cost the first time the command buffer is enqueued. My understanding is that this is supposed to be temporary and will be removed in the future? Can we add a TODO here that mentions that?

Contributor Author

Yes, this fix is supposed to be temporary, lasting until the driver team takes care of it. Sure, I will add that TODO.
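To make the cost being discussed concrete, here is a hedged sketch of an `enableEvents()`-style pass that walks every event only on the first enqueue. `MockCommandBuffer`, `MockEvent`, the `eventsEnabled_` flag, and the counters are all assumptions made for illustration; the thread does not show how the real method is implemented or whether it uses such a guard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an event owned by a command buffer.
struct MockEvent {
  bool inUse = false;
};

// Hypothetical command buffer; only the "pay the full pass once" idea matters.
class MockCommandBuffer {
public:
  explicit MockCommandBuffer(std::size_t numEvents) : events_(numEvents) {}

  // Marks every event as in use. The full iteration runs on the first call
  // only; later calls return immediately.
  // TODO (as in the review): temporary workaround, remove once the driver
  // handles this.
  void enableEvents() {
    if (eventsEnabled_) {
      return;
    }
    for (MockEvent &event : events_) {
      event.inUse = true;
    }
    eventsEnabled_ = true;
    ++fullPasses_;
  }

  // Number of events currently marked as in use.
  std::size_t enabledCount() const {
    std::size_t count = 0;
    for (const MockEvent &event : events_) {
      count += event.inUse ? 1 : 0;
    }
    return count;
  }

  // How many times the full iteration over all events has run.
  int fullPasses() const { return fullPasses_; }

private:
  std::vector<MockEvent> events_;
  bool eventsEnabled_ = false;
  int fullPasses_ = 0;
};
```

With a guard like this, the linear walk is paid on the first enqueue and repeated enqueues of the same command buffer cost a single branch.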


  uint32_t numWaitEventsEnabled = 0;
  if (isImmediateCommandList) {
    for (uint32_t i = 0; i < numWaitEvents; i++) {
Member

We should avoid doing that on each getWaitListView - this might impact performance. I'm wondering if it wouldn't be better to just always use regular events for command buffers... We wouldn't need that isInUse() workaround at all then.
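For readers following the hunk, a rough sketch of the per-call filtering being questioned: on an immediate command list, only events already in use are kept in the wait list. `MockEvent` and `filterWaitList` are hypothetical, and the behaviour for non-immediate lists (keep everything) is an assumption, since the hunk is cut off before that branch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the adapter's event type.
struct MockEvent {
  bool inUse = false;
  bool getIsEventInUse() const { return inUse; }
};

// Returns the events a command list should wait on. For immediate command
// lists, events that are not yet in use are dropped. For other lists this
// sketch keeps every event (assumption).
std::vector<const MockEvent *>
filterWaitList(const std::vector<const MockEvent *> &waitEvents,
               bool isImmediateCommandList) {
  std::vector<const MockEvent *> enabled;
  enabled.reserve(waitEvents.size());
  for (const MockEvent *event : waitEvents) {
    assert(event != nullptr);
    if (!isImmediateCommandList || event->getIsEventInUse()) {
      enabled.push_back(event);
    }
  }
  return enabled;
}
```

The loop is linear in the wait list and runs on every call, which is the overhead the comment above is concerned about.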

Contributor

@EwanC EwanC May 16, 2025

The feature this PR adds isn't exposed in SYCL-Graph right now, so always using regular events rather than counter-based events for command buffers would compromise performance for all SYCL-Graph usage today. Even once it is exposed, it will be a more niche use case, and limiting general performance for it seems a bad tradeoff.

5 participants