[L0][OpenCL] Emulate Fill with copy when patternSize is not a power of 2 #1412

keyradical · 2024-03-05T17:20:15Z

LevelZero changes:

Extend enqueueMemFillHelper so that it can be called with pattern sizes that are not powers of 2. In those cases the fill is emulated with copies.
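To illustrate what emulating a fill with a copy can look like, here is a minimal host-only sketch. It is not the adapter code: `makeRepeatedPattern` and `fillByCopy` are hypothetical names, and `std::memcpy` stands in for whatever copy command the backend would actually enqueue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of the adapter): build a staging buffer of
// `size` bytes that holds `pattern` repeated back to back.
std::vector<unsigned char> makeRepeatedPattern(const void *pattern,
                                               std::size_t patternSize,
                                               std::size_t size) {
  // Assumption: the fill size is a whole number of patterns.
  assert(patternSize != 0 && size % patternSize == 0);
  std::vector<unsigned char> staging(size);
  for (std::size_t offset = 0; offset < size; offset += patternSize) {
    std::memcpy(staging.data() + offset, pattern, patternSize);
  }
  return staging;
}

// Hypothetical stand-in for the emulated fill: std::memcpy takes the place of
// the backend's copy command.
void fillByCopy(void *dst, const void *pattern, std::size_t patternSize,
                std::size_t size) {
  if (size == 0) {
    return; // nothing to fill
  }
  const std::vector<unsigned char> staging =
      makeRepeatedPattern(pattern, patternSize, size);
  std::memcpy(dst, staging.data(), size);
}
```

With a 3-byte pattern and a 9-byte destination, the destination ends up holding the pattern three times in a row, which is the result a native fill would produce if it accepted that pattern size.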

OpenCL changes:

Add an isPowerOf2 check to the USM fill function.
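A power-of-2 check of this kind is usually a one-line bit test. The sketch below is an assumption about its shape, not the adapter's code: the `FillPath` enum and `chooseFillPath` are hypothetical names used only to show how the check could select between a native fill and the copy-based emulation.

```cpp
#include <cassert>
#include <cstddef>

// A power of 2 has exactly one bit set, so clearing the lowest set bit
// (value & (value - 1)) leaves zero. Zero itself is not a power of 2.
constexpr bool isPowerOf2(std::size_t value) {
  return value != 0 && (value & (value - 1)) == 0;
}

// Hypothetical names, for illustration only.
enum class FillPath { Native, EmulatedWithCopy };

inline FillPath chooseFillPath(std::size_t patternSize) {
  assert(patternSize != 0 && "a fill pattern cannot be empty");
  return isPowerOf2(patternSize) ? FillPath::Native
                                 : FillPath::EmulatedWithCopy;
}

static_assert(isPowerOf2(4), "4 is a power of 2");
static_assert(!isPowerOf2(12), "12 is not a power of 2");
```

Under these assumptions a 16-byte pattern would take the native path, while a 12-byte pattern would fall back to the emulation.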

These changes are required by intel/llvm#12702, which refactors queue.fill() to use urEnqueueUSMFill.

intel/llvm CI: intel/llvm#12912

codecov-commenter · 2024-03-05T18:12:22Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 12.51%. Comparing base (78ef1ca) to head (293b670).
Report is 109 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1412      +/-   ##
==========================================
- Coverage   14.82%   12.51%   -2.32%     
==========================================
  Files         250      239      -11     
  Lines       36220    35949     -271     
  Branches     4094     4076      -18     
==========================================
- Hits         5369     4498     -871     
- Misses      30800    31447     +647     
+ Partials       51        4      -47

keyradical · 2024-03-26T08:42:51Z

friendly ping @oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-opencl-write, could I get a review on this please?

nrspruit

LGTM

This PR changes the `queue.fill()` implementation to use the native functions of the specific backend. It also unifies that implementation with the one for memset, since memset is just the 8-bit special case of fill.

In the CUDA case, both memset and fill now call `urEnqueueUSMFill`, which, depending on the size of the fill pattern, calls either `cuMemsetD8Async`, `cuMemsetD16Async`, `cuMemsetD32Async` or `commonMemSetLargePattern`. Before this patch, memset already used the same path, but it always set patternSize to 1 byte beforehand, which resulted in a call to `cuMemsetD8Async`. In other backends, the behaviour is analogous.

The fill method previously just invoked a `parallel_for` to fill the memory with the pattern, which made this operation quite slow.

This PR depends on:
- oneapi-src/unified-runtime#1395
- oneapi-src/unified-runtime#1412
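The CUDA dispatch described above can be sketched as a simple selection on the pattern size. This is an illustration only: `selectCudaFillRoutine` and `selectCudaMemsetRoutine` are hypothetical names, the function returns the routine name as a string instead of calling the CUDA driver, and the 1/2/4-byte mapping is assumed from the 8/16/32-bit widths in the routine names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: report which routine a fill with the given pattern
// size would be routed to, following the description above.
std::string selectCudaFillRoutine(std::size_t patternSize) {
  switch (patternSize) {
  case 1:
    return "cuMemsetD8Async"; // 8-bit pattern
  case 2:
    return "cuMemsetD16Async"; // 16-bit pattern
  case 4:
    return "cuMemsetD32Async"; // 32-bit pattern
  default:
    return "commonMemSetLargePattern"; // any other pattern size
  }
}

// memset is treated as a fill whose pattern is always a single byte.
std::string selectCudaMemsetRoutine() { return selectCudaFillRoutine(1); }
```

Seen this way, memset needs no separate implementation: it is the one-byte case of the same dispatch.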

keyradical requested a review from a team as a code owner March 5, 2024 17:20

This was referenced Mar 5, 2024

[SYCL] Make queue fill use native functions intel/llvm#12702

Merged

[UR] CI for: Emulate Fill with copy when patternSize is not a power of 2 intel/llvm#12912

Merged

keyradical force-pushed the memsetLargePatternL0 branch from 59cdd8a to 293b670 Compare March 8, 2024 12:37

keyradical requested a review from a team as a code owner March 8, 2024 13:32

keyradical changed the title ~~[L0] Emulate Fill with copy when patternSize is not a power of 2~~ [L0][OpenCL] Emulate Fill with copy when patternSize is not a power of 2 Mar 8, 2024

keyradical force-pushed the memsetLargePatternL0 branch from 3887dfd to ac0274f Compare March 22, 2024 16:32

aarongreig approved these changes Mar 26, 2024

nrspruit approved these changes Mar 26, 2024

keyradical added the ready to merge label Mar 26, 2024

keyradical force-pushed the memsetLargePatternL0 branch 2 times, most recently from a6f5dfe to 617d8ad Compare March 27, 2024 14:05

kbenzie added the level-zero and opencl labels Apr 3, 2024

keyradical force-pushed the memsetLargePatternL0 branch from 617d8ad to 38037e9 Compare April 15, 2024 08:56

keyradical force-pushed the memsetLargePatternL0 branch 2 times, most recently from 9de2356 to e4a8d29 Compare April 29, 2024 15:05

Konrad Kusiak added 3 commits April 30, 2024 16:31

Emulated Fill with copy when patternSize is not a power of 2

3948742

Added condition with isPowerOf2 to opencl Fill

08f4c75

Adjusted urPrint to logger::debug according to newest changes

2727e8a

kbenzie force-pushed the memsetLargePatternL0 branch from 41ab8b4 to 2727e8a Compare April 30, 2024 15:31

kbenzie merged commit 633ec40 into oneapi-src:main Apr 30, 2024

sommerlukas pushed a commit to intel/llvm that referenced this pull request May 2, 2024

[UR] CI for: Emulate Fill with copy when patternSize is not a power o…

f34a650

…f 2 (#12912) oneapi-src/unified-runtime#1412 --------- Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>

keyradical mentioned this pull request May 14, 2024

[OpenCL] Modify fill emulation to work for patterns which are not powers of 2 #1603

Merged
