stop batching multiple CmdList execution when waiting for barriers#1555
Merged
kbenzie merged 1 commit intooneapi-src:mainfrom May 3, 2024
Merged
stop batching multiple CmdList execution when waiting for barriers#1555kbenzie merged 1 commit intooneapi-src:mainfrom
kbenzie merged 1 commit intooneapi-src:mainfrom
Conversation
Contributor
Author
Contributor
|
@oneapi-src/unified-runtime-level-zero-write would be great to get some reviews on this so I can go ahead with merging everything and adding it to #1547 |
igchor
approved these changes
May 2, 2024
The current implementation of queue command batching appears to assume that getAvailableCommandList is followed by a single executeCommandList, and it's not allowed to, for example, create a list of CmdLists for later execution. And there's an assert to check for this. This makes sense, since it doesn't really make sense to batch things across many different CmdLists. However, the implementation of urEnqueueEventsWaitWithBarrier was doing exactly that. This patch disables batching in executeCommandList, fixing a assert failure during urEnqueueEventsWaitWithBarrier when many potentially open command lists are in use.
6015a32 to
112089f
Compare
5 tasks
kbenzie
added a commit
to kbenzie/unified-runtime
that referenced
this pull request
May 3, 2024
stop batching multiple CmdList execution when waiting for barriers
Contributor
|
I've removed this from v0.9.1rc due to these issues |
pbalcer
added a commit
to pbalcer/unified-runtime
that referenced
this pull request
May 7, 2024
This is another attempt to fix an issue where the L0 Adapter crashes when urEnqueueEventsWaitWithBarrier in presence of multiple active command queues with batched cmdlists. The core issue is that the current queue implementation only allows for two command lists to be active open batches, one for copy and one for compute. If that assumption doesn't hold, the getAvailableCommandList function, when executed multiple times for a command queue of the same type, will override the active open command batch. So only the last retrieved command list can actually be batched. After this, when the code attempts to execute all the command lists it collected, with batching enabled. And this is where we hit an assert because the active open command list doesn't match what is being used. The proper fix here is to allow open command batches for each command queue. But that's a fairly risky change to do this late in the release cycle. My previous attempt at a fix simply disabled batching for queue-wide barriers (oneapi-src#1555). That introduced regressions in tests that assumed that batching happens. It might also been a performance regression. Instead, this patch fixes getAvailableCommandList when batching is enabled and specific command queue is required, and disables batching only for cases where the open batch cmdlist is different than the one we are executing.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The current implementation of queue command batching appears to assume that getAvailableCommandList is followed by a single executeCommandList, and it's not allowed to, for example, create a list of CmdLists for later execution. And there's an assert to check for this.
This makes sense, since it doesn't really make sense to batch things across many different CmdLists. However, the implementation of urEnqueueEventsWaitWithBarrier was doing exactly that.
This patch disables batching in executeCommandList, fixing a assert failure during urEnqueueEventsWaitWithBarrier when many potentially open command lists are in use.