Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements accumulation functions in dpctl.tensor #1602

Merged
merged 36 commits into from
Apr 1, 2024

Conversation

ndgrigorian
Copy link
Collaborator

@ndgrigorian ndgrigorian commented Mar 23, 2024

This pull request proposes implementation of dpctl.tensor.cumulative_sum, dpctl.tensor.cumulative_prod, and dpctl.tensor.cumulative_logsumexp.

cumulative_sum is already part of the array API standard and is implemented as per the spec. cumulative_prod and cumulative_logsumexp are implemented with a similar API, including an include_initial keyword argument.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you opening the PR as a draft?

…t.cumulative_sum`

The Python bindings for these functions are implemented in a new submodule `_tensor_accumulation_impl`
@ndgrigorian ndgrigorian force-pushed the feature/tensor-accumulation branch from 53e927d to 761ecd4 Compare March 23, 2024 21:37
Copy link

github-actions bot commented Mar 23, 2024

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_154 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

1 similar comment
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_154 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

This resolves hangs in unique functions
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_156 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

@coveralls
Copy link
Collaborator

coveralls commented Mar 25, 2024

Coverage Status

coverage: 87.967% (+0.08%) from 87.884%
when pulling 27eb063 on feature/tensor-accumulation
into 57495af on master.

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_158 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

@ndgrigorian ndgrigorian marked this pull request as ready for review March 26, 2024 18:18
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_161 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_163 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_165 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_167 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_188 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_189 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

@oleksandr-pavlyk oleksandr-pavlyk force-pushed the feature/tensor-accumulation branch from cc3d88f to b88f8ee Compare March 30, 2024 15:03
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

@oleksandr-pavlyk oleksandr-pavlyk force-pushed the feature/tensor-accumulation branch from b88f8ee to 5fd506c Compare March 30, 2024 16:11
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_191 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

Indexers are made const, integral variables in kernels made const too

Make two-offset instances const references to avoid copying.

Gor rid of get_src_const_ptr unused methods in stack_t structs.
Replaced auto with size_t as appropriate. Added const to make
compiler analysis easier (and faster).
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_193 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

@oleksandr-pavlyk oleksandr-pavlyk force-pushed the feature/tensor-accumulation branch from b1219af to 4bd02b4 Compare March 31, 2024 16:39
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_190 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

By returning data from `local_mem_acc` after the group barrier, if memory is later overwritten, a race condition follows, which was especially obvious on CPU

Now the value is stored in variable before the barrier and then returned
Copy link

github-actions bot commented Apr 1, 2024

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_192 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

…ion (#1624)

added comments explaining why barriers are needed
Copy link

github-actions bot commented Apr 1, 2024

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_193 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

Add empty new line after list item to make Sphinx happy.
Docstring edits for accumulation functions
Copy link
Collaborator

@oleksandr-pavlyk oleksandr-pavlyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT! Thank you @ndgrigorian

Copy link

github-actions bot commented Apr 1, 2024

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_195 ran successfully.
Passed: 839
Failed: 0
Skipped: 92

@ndgrigorian ndgrigorian enabled auto-merge (squash) April 1, 2024 16:48
@ndgrigorian ndgrigorian disabled auto-merge April 1, 2024 16:48
@ndgrigorian ndgrigorian merged commit 65bb9ef into master Apr 1, 2024
34 of 39 checks passed
@ndgrigorian ndgrigorian deleted the feature/tensor-accumulation branch April 1, 2024 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants