Get coerced usm type #797

Merged
merged 3 commits into from
Mar 22, 2022

@oleksandr-pavlyk (Collaborator) commented Mar 22, 2022

Adds dpctl.utils.get_coerced_usm_type to deduce the usm type for the output array based on usm types of input arrays.
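As a rough illustration of what such a deduction might look like, here is a pure-Python sketch. It is not dpctl's implementation: the function name `coerce_usm_type_sketch` is hypothetical, and both the precedence `"device"` > `"shared"` > `"host"` and the `None` result for empty input are assumptions, since this PR text does not spell out the rule.

```python
# Hypothetical pure-Python model; this is not dpctl's implementation.
# Assumed precedence (not stated in this PR): "device" > "shared" > "host".
_PRECEDENCE = {"host": 0, "shared": 1, "device": 2}


def coerce_usm_type_sketch(usm_types):
    """Return the highest-precedence USM type among the inputs.

    Returns None for an empty input (an assumption of this sketch).
    """
    usm_types = list(usm_types)
    if not usm_types:
        return None
    for t in usm_types:
        if t not in _PRECEDENCE:
            raise ValueError(f"Unrecognized USM type: {t!r}")
    # Pick the input type that ranks highest in the assumed ordering.
    return max(usm_types, key=_PRECEDENCE.__getitem__)
```

Under these assumptions, combining a `"host"` input with a `"shared"` input yields `"shared"`, and any `"device"` input makes the output `"device"`.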

Also modifies the behavior of dpctl.utils.get_execution_queue: the notion of queue equivalence has changed. Two queues are now equivalent if they compare equal, i.e. q1 == q2. This holds when the underlying C++ class instances are copies of the same instance.

Note that dpctl.SyclQueue() == dpctl.SyclQueue() returns False, and so does dpctl.SyclQueue("cpu") == dpctl.SyclQueue("cpu").

If foo(X, Y) raises because the queues of X and Y are incompatible, bring the arrays to equivalent queues with Yc = Y.to_device(X.device), and then call foo(X, Yc).
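The equality rule described above can be modeled in plain Python. This is a sketch only: `QueueSketch` and `get_execution_queue_sketch` are hypothetical stand-ins rather than dpctl classes or functions, and returning `None` for non-equivalent or empty input is an assumption of the sketch.

```python
class QueueSketch:
    """Hypothetical stand-in for a Python wrapper around a SYCL queue."""

    def __init__(self, _underlying=None):
        # A new wrapper gets a new underlying queue, modeled as a unique object.
        self._underlying = object() if _underlying is None else _underlying

    def copy(self):
        # A copy shares the same underlying queue.
        return QueueSketch(self._underlying)

    def __eq__(self, other):
        if not isinstance(other, QueueSketch):
            return NotImplemented
        # Equal only when both wrappers refer to the same underlying queue.
        return self._underlying is other._underlying

    def __hash__(self):
        return id(self._underlying)


def get_execution_queue_sketch(queues):
    """Return the first queue if all queues compare equal, else None (assumed)."""
    queues = list(queues)
    if not queues:
        return None
    first = queues[0]
    if all(q == first for q in queues[1:]):
        return first
    return None
```

In this model, `QueueSketch() == QueueSketch()` is `False` because each call creates its own underlying object, while `q.copy() == q` is `True`, mirroring the behavior the description attributes to `dpctl.SyclQueue`.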

This function deduces the type of USM allocation for the output array
based on USM types of the input arrays.

Changed behavior of get_execution_queue(sequence_of_queues).

Queues are now equivalent when they compare equal, i.e. q1 == q2
is True. This means that they are Python wrappers around copies of
the same underlying SYCL queue.

Since dpctl.SyclQueue() and dpctl.SyclQueue() produce two distinct instances,
they are no longer equivalent, and will not be compatible in the
compute-follows-data paradigm.

arr.to_device can be used to reattach the data to the new queue without copying,
provided that the original queue and the new queue have the same underlying SYCL
context, which guarantees that the SYCL pointer can be dereferenced in the new queue.

@oleksandr-pavlyk (Collaborator, Author) commented:

A forthcoming PR will implement caching for the mapping from filter strings and unpartitioned devices to queues.

@oleksandr-pavlyk (Collaborator, Author) commented:

@PokhodenkoSA @Alexander-Makaryev @diptorupd The change to dpctl.utils.get_execution_queue is backward incompatible and may require changes in the dpnp and numba-dpex tests. Implementing the aforementioned caching should ease the burden.

I would therefore not merge this until such caching is implemented.

@coveralls (Collaborator) commented Mar 22, 2022

Coverage increased (+0.02%) to 81.907% when pulling 99f017d on get-coerced-usm-type into c274840 on master.

…vices to queues

As a result, `X = dpt.empty((10,), device="gpu")` and `Y = dpt.empty((10,), device="gpu")`
have the same associated queue, i.e. `X.sycl_queue is Y.sycl_queue` gives True.

@oleksandr-pavlyk (Collaborator, Author) commented:

> I would therefore not merge this until such caching is implemented.

The caching has been implemented now:

In [1]: import dpctl.tensor as dpt

In [3]: X = dpt.empty((10,))

In [4]: Y = dpt.empty((5,2), 'd', usm_type="shared")

In [5]: X.sycl_queue is Y.sycl_queue
Out[5]: True

In [6]: X = dpt.empty((10,), device="cpu")

In [7]: Y = dpt.empty((5,2), 'd', usm_type="shared", device="opencl:cpu:0")

In [8]: X.sycl_queue is Y.sycl_queue
Out[8]: True
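
The session above is consistent with a cache that hands out one queue object per resolved device. A minimal sketch of that idea follows, in plain Python rather than dpctl: `QueueCacheSketch`, `resolve_device`, `make_queue`, and the alias table are all hypothetical, and the assumption that `"cpu"` resolves to `"opencl:cpu:0"` only mirrors the machine shown in the session.

```python
class QueueCacheSketch:
    """Hypothetical cache: one queue object per resolved device."""

    def __init__(self, resolve_device, make_queue):
        self._resolve_device = resolve_device  # filter string -> hashable device key
        self._make_queue = make_queue          # device key -> new queue object
        self._queues = {}

    def get_queue(self, filter_string):
        device = self._resolve_device(filter_string)
        # Create the queue on first request, then reuse it for later requests.
        if device not in self._queues:
            self._queues[device] = self._make_queue(device)
        return self._queues[device]


# Made-up alias table standing in for device selection on one particular machine.
_ALIASES = {"cpu": "opencl:cpu:0"}

cache = QueueCacheSketch(
    resolve_device=lambda fs: _ALIASES.get(fs, fs),
    make_queue=lambda device: object(),  # placeholder for constructing a real queue
)
```

With this cache, two requests that resolve to the same device return the identical object, which is what makes the `is` comparisons in the session succeed; requests for different devices still get distinct queues.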

@oleksandr-pavlyk oleksandr-pavlyk merged commit e96cce3 into master Mar 22, 2022
@oleksandr-pavlyk oleksandr-pavlyk deleted the get-coerced-usm-type branch March 22, 2022 10:30

@github-actions commented:

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
