Command-buffer queue compatibility #1142
Comments
Need to consider interactions with PR #850.
I've attempted to form my current thoughts into a new proposal based around idea 1. The Command-buffer AppendSemantics of command-queue properties set on the queue passed when appending commands.
Command-buffer EnqueueSemantics of command-queue properties affecting
Changes to what a compatible queue means
Edit: added strike-through formatting to the proposal component that is no longer required, based on #1142 (comment)
On the 11/04/24 teleconference it was discussed whether we need this part of the proposal, as it complicates the spec language by needing to say whether it's invalid or ignored if the profiling setting of a command-buffer does not match that of the queue, e.g. a command-buffer without profiling enqueued to a profiling queue, or a command-buffer with profiling enqueued to a non-profiling queue.

Thinking about this some more, I don't think we do need to specify upfront on creation if a command-buffer is profilable or not. It can be implemented by the runtime bookending the command-buffer submission with command submissions to the queue which will give the relevant

Background info why I initially suggested this:
Using L0 to illustrate (based loosely on what's described here):
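Separately from the L0 illustration referenced above, the bookending idea from the earlier paragraph can be sketched as a toy model in plain C. This is only an illustration under stated assumptions: `toy_queue`, `toy_enqueue_marker`, `toy_enqueue_command_buffer` and `toy_profile_command_buffer` are hypothetical names, the "queue" is a fake clock rather than an OpenCL object, and the comment does not say which real commands a runtime would use as bookends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy in-order queue: a fake clock that advances as commands "execute".
 * Hypothetical type for illustration; not an OpenCL object. */
typedef struct {
    uint64_t now_ns;
} toy_queue;

/* Submit a zero-cost marker and return the time at which it "executed". */
uint64_t toy_enqueue_marker(const toy_queue *q)
{
    return q->now_ns;
}

/* Replay a command-buffer: each entry is the simulated duration of one command. */
void toy_enqueue_command_buffer(toy_queue *q, const uint64_t *durations_ns, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        q->now_ns += durations_ns[i];
    }
}

/* Bookend the submission with two markers. The difference between their
 * timestamps spans the whole command-buffer, so the command-buffer itself
 * never needs to be created as "profilable" in this model. */
uint64_t toy_profile_command_buffer(toy_queue *q, const uint64_t *durations_ns, size_t count)
{
    uint64_t start = toy_enqueue_marker(q);
    toy_enqueue_command_buffer(q, durations_ns, count);
    uint64_t end = toy_enqueue_marker(q);
    return end - start;
}

/* Small self-check: three commands of 5, 7 and 11 ns take 23 ns in total. */
void toy_self_check(void)
{
    toy_queue q = { 1000 };
    const uint64_t durations[3] = { 5, 7, 11 };
    assert(toy_profile_command_buffer(&q, durations, 3) == 23);
}
```

The point of the model is that the timing comes from commands submitted around the command-buffer on the same queue, so whether profiling is available depends only on the queue used at enqueue time.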
As proposed in KhronosGroup#1142, the PR changes the semantics of the command-queue parameters used for command-buffer creation and enqueue. The queues used on command-buffer creation now only inform the device and dependencies of commands, rather than restricting the properties set on the queues used for command-buffer enqueue. This is based on top of the change in KhronosGroup#850 to add supported queue property semantics.
Draft opened here: #1292

Ideas for additional CTS tests:
The current clCreateCommandBufferKHR API takes a list of queues on creation, and enforces that the properties of these queues must match the properties of the corresponding queues passed in clEnqueueCommandBufferKHR. However, this API design does not work well for applications that do not know, at the time of command-buffer creation, the properties of the queue that the command-buffer will be executed on.

In particular this is an issue for SYCL-Graph layering, where the command-buffer is created during graph.finalize(), at which point the SYCL-RT does not yet know what SYCL queue a user will submit the graph to. The graph is treated as a black box by the SYCL queue, and is executed in-order or out-of-order with respect to the other SYCL queue commands depending on the type of queue.

There are two possible approaches to help with this in cl_khr_command_buffer that I've thought about so far:

1. Optional feature to relax queue compatibility constraint - Add an optional feature to relax the constraint that the queues passed to clEnqueueCommandBufferKHR must be compatible with those set on creation (or maybe a query to check what "compatible" means for that vendor). The SYCL-RT could already use the clRemapCommandBufferKHR() API with CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR for this, to create a remap of the command-buffer before each enqueue during graph submission, but we don't really want to incur the deep copy cost here, and the cl_khr_command_buffer_multi_device extension also doesn't have a large amount of vendor coverage. However, a drawback of relaxing the constraint is that the SYCL-RT would still need to create a placeholder OpenCL queue when the command-buffer is created and commands added, which has a construction overhead.

2. Remove the queue object from the command-buffer interface until enqueue time - We could replace the command-queue parameters with device parameters (and a context parameter on creation) on command-buffer creation/command adding/remap, and then only introduce the queues at clEnqueueCommandBufferKHR time. We'd probably still want more command-buffer properties so that a user can say up-front what type of queue they want the command-buffer to run on if they want a more optimized implementation. This is quite an invasive change.

I haven't really thought all this through yet but the design point has come up recently so I think it's worth creating an issue to track the discussion.
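To make the compatibility constraint concrete, here is a minimal toy model in plain C of the current rule next to one possible relaxed rule. Everything in it is an assumption made for illustration: `toy_queue`, the `TOY_QUEUE_*` flags, `compatible_strict` and `compatible_relaxed` are hypothetical and are not part of the OpenCL API, and what "compatible" would actually mean under a relaxed rule is still open in this discussion. The relaxed variant here simply assumes that only the device has to match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a command-queue; NOT the real cl_command_queue. */
typedef struct {
    int device_id;        /* hypothetical device handle */
    uint64_t properties;  /* bitfield, loosely modelled on queue properties */
} toy_queue;

/* Hypothetical property flags, used by this model only. */
enum {
    TOY_QUEUE_OUT_OF_ORDER = 1 << 0,
    TOY_QUEUE_PROFILING    = 1 << 1
};

/* Current rule as described above: the queue used at enqueue must target
 * the same device AND carry identical properties to the creation queue. */
bool compatible_strict(const toy_queue *created_with, const toy_queue *enqueued_on)
{
    return created_with->device_id == enqueued_on->device_id &&
           created_with->properties == enqueued_on->properties;
}

/* One possible relaxed rule (an assumption, not spec text): the creation
 * queue only pins down the device; properties may differ at enqueue. */
bool compatible_relaxed(const toy_queue *created_with, const toy_queue *enqueued_on)
{
    return created_with->device_id == enqueued_on->device_id;
}

/* SYCL-Graph style scenario: a placeholder in-order queue is used when the
 * command-buffer is created, and the user later submits to an out-of-order
 * queue on the same device. */
void toy_self_check(void)
{
    toy_queue placeholder = { 0, 0 };
    toy_queue user_queue  = { 0, TOY_QUEUE_OUT_OF_ORDER };
    assert(!compatible_strict(&placeholder, &user_queue));
    assert(compatible_relaxed(&placeholder, &user_queue));
}
```

In this model the placeholder queue is still required under the relaxed rule, which is the construction overhead noted for approach 1. Approach 2 would correspond to dropping the creation-time `toy_queue` entirely and passing only a device.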