Implements accumulation functions in dpctl.tensor #1602
…t.cumulative_sum` The Python bindings for these functions are implemented in a new submodule `_tensor_accumulation_impl`
53e927d to
761ecd4
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_154 ran successfully.
This resolves hangs in unique functions
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_156 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_158 ran successfully.
… trailing axis. Fixes a bug where, in some cases, output axes were not being permuted.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_161 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_163 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_165 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_167 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_188 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_189 ran successfully.
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
cc3d88f to
b88f8ee
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
b88f8ee to
5fd506c
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_191 ran successfully.
Indexers are made const, and integral variables in kernels are made const too. Made two-offset instances const references to avoid copying. Got rid of the unused get_src_const_ptr methods in stack_t structs. Replaced auto with size_t as appropriate. Added const to make compiler analysis easier (and faster).
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_193 ran successfully.
…ting against closed form
b1219af to
4bd02b4
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
Returning data from `local_mem_acc` after the group barrier leads to a race condition if that memory is later overwritten, which was especially obvious on CPU. Now the value is stored in a variable before the barrier and then returned.
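The read-before-barrier pattern described in this commit can be sketched with Python threads as a loose analogy. This is not the SYCL kernel code from the PR: `shared` stands in for work-group local memory, `threading.Barrier` stands in for the group barrier, and the names `run_rounds`, `group_total`, `N_WORKERS`, and `N_ROUNDS` are hypothetical.

```python
import threading

N_WORKERS = 4   # hypothetical "work-group" size
N_ROUNDS = 50   # number of times the shared buffer is reused


def run_rounds():
    shared = [0] * N_WORKERS              # stands in for work-group local memory
    barrier = threading.Barrier(N_WORKERS)
    results = [[] for _ in range(N_WORKERS)]

    def group_total(worker_id, value):
        shared[worker_id] = value
        barrier.wait()                    # every worker has written its slot
        total = sum(shared)               # copy into a private variable first
        barrier.wait()                    # no worker is still reading `shared`
        return total                      # later overwrites cannot change `total`

    def worker(worker_id):
        for r in range(N_ROUNDS):
            # The next call overwrites `shared`, which is only safe because
            # all reads finished before the second barrier of the last call.
            results[worker_id].append(group_total(worker_id, worker_id + r))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

If `group_total` instead read from `shared` after the second barrier, a faster worker could already have overwritten its slot for the next round, which is the kind of race the commit message describes.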
…ase size of test_logcumsumexp_basic
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_192 ran successfully.
…ion (#1624) added comments explaining why barriers are needed
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_193 ran successfully.
Add empty new line after list item to make Sphinx happy.
Docstring edits for accumulation functions
oleksandr-pavlyk
left a comment
LGTM! Thank you @ndgrigorian
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_195 ran successfully.
This pull request proposes an implementation of `dpctl.tensor.cumulative_sum`, `dpctl.tensor.cumulative_prod`, and `dpctl.tensor.cumulative_logsumexp`. `cumulative_sum` is already part of the array API standard and is implemented as per the spec. `cumulative_prod` and `cumulative_logsumexp` are implemented with a similar API, including an `include_initial` keyword argument.
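To make the intended semantics concrete, here is a minimal pure-Python reference sketch for one-dimensional inputs. It does not call dpctl and is not the PR's implementation; the `*_ref` names and the `_scan` helper are hypothetical. It assumes `include_initial=True` prepends the identity of the operation (0 for sum, as the array API standard specifies; 1 for product and negative infinity for logsumexp are assumptions made by analogy).

```python
import math
from itertools import accumulate


def _scan(values, op, identity, include_initial=False):
    """Inclusive scan of `values`; optionally prepend the operation's identity."""
    out = list(accumulate(values, op))
    return [identity] + out if include_initial else out


def _logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b)) for finite or -inf inputs."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def cumulative_sum_ref(values, include_initial=False):
    return _scan(values, lambda a, b: a + b, 0, include_initial)


def cumulative_prod_ref(values, include_initial=False):
    return _scan(values, lambda a, b: a * b, 1, include_initial)


def cumulative_logsumexp_ref(values, include_initial=False):
    return _scan(values, _logaddexp, -math.inf, include_initial)


print(cumulative_sum_ref([1, 2, 3]))                         # [1, 3, 6]
print(cumulative_sum_ref([1, 2, 3], include_initial=True))   # [0, 1, 3, 6]
print(cumulative_prod_ref([1, 2, 3], include_initial=True))  # [1, 1, 2, 6]
```

With `include_initial=True` the output has one more element than the input along the accumulated axis, so callers that expect matching shapes need to account for the extra leading value.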