Implements accumulation functions in dpctl.tensor #1602
…t.cumulative_sum` The Python bindings for these functions are implemented in a new submodule `_tensor_accumulation_impl`
53e927d
to
761ecd4
Compare
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞 |
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_154 ran successfully. |
This resolves hangs in unique functions
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_156 ran successfully. |
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_158 ran successfully. |
… trailing axis Fixes a bug where in some cases output axes were not being permuted
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_161 ran successfully. |
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_163 ran successfully. |
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_165 ran successfully. |
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_167 ran successfully. |
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_188 ran successfully. |
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_189 ran successfully. |
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully. |
cc3d88f
to
b88f8ee
Compare
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully. |
b88f8ee
to
5fd506c
Compare
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_191 ran successfully. |
Indexers are made const, integral variables in kernels made const too. Made two-offset instances const references to avoid copying. Got rid of unused get_src_const_ptr methods in stack_t structs. Replaced auto with size_t as appropriate. Added const to make compiler analysis easier (and faster).
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_193 ran successfully. |
…ting against closed form
b1219af
to
4bd02b4
Compare
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully. |
Returning data from `local_mem_acc` after the group barrier led to a race condition if the memory was later overwritten, which was especially obvious on CPU. Now the value is stored in a variable before the barrier and then returned.
…ase size of test_logcumsumexp_basic
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_192 ran successfully. |
…ion (#1624) added comments explaining why barriers are needed
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_193 ran successfully. |
Add empty new line after list item to make Sphinx happy.
Docstring edits for accumulation functions
LGTM! Thank you @ndgrigorian
Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_195 ran successfully. |
This pull request proposes implementation of `dpctl.tensor.cumulative_sum`, `dpctl.tensor.cumulative_prod`, and `dpctl.tensor.cumulative_logsumexp`.
`cumulative_sum` is already part of the array API standard and is implemented as per the spec. `cumulative_prod` and `cumulative_logsumexp` are implemented with a similar API, including an `include_initial` keyword argument.
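The semantics of the three accumulation functions can be illustrated with a minimal pure-Python sketch over 1-D lists. This is not the dpctl implementation and does not call dpctl. The helper names `_scan` and `_logaddexp` are hypothetical. For `cumulative_sum`, the array API standard specifies that `include_initial=True` prepends the additive identity (zero). The sketch assumes the analogous identities for the other two functions (1 for the product, `-inf` for log-sum-exp), which this PR description does not state explicitly.

```python
import math
from itertools import accumulate


def _logaddexp(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    hi, lo = max(a, b), min(a, b)
    if math.isinf(hi):
        # Both are -inf, or at least one is +inf.
        return hi
    return hi + math.log1p(math.exp(lo - hi))


def _scan(values, op, identity, include_initial):
    # Inclusive scan; optionally prepend the identity of `op`.
    out = list(accumulate(values, op))
    return [identity] + out if include_initial else out


def cumulative_sum(values, include_initial=False):
    # Identity 0, as in the array API standard.
    return _scan(values, lambda a, b: a + b, 0, include_initial)


def cumulative_prod(values, include_initial=False):
    # Identity 1 is an assumption made for this sketch.
    return _scan(values, lambda a, b: a * b, 1, include_initial)


def cumulative_logsumexp(values, include_initial=False):
    # Identity -inf (log of 0) is an assumption made for this sketch.
    return _scan(values, _logaddexp, -math.inf, include_initial)


print(cumulative_sum([1, 2, 3]))                        # [1, 3, 6]
print(cumulative_sum([1, 2, 3], include_initial=True))  # [0, 1, 3, 6]
print(cumulative_prod([1, 2, 3, 4]))                    # [1, 2, 6, 24]
```

With `include_initial=True` the output is one element longer than the input along the accumulated axis, so callers that preallocate an output need to size it accordingly.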