[SYCL][CUDA] add non-uniform groups and algorithms support for ext_oneapi_cuda #9182

Closed
23 commits
c0b57f9
[SYCL][NFC] Prepare algorithms for non-uniformity
Pennycook Mar 27, 2023
483ef0b
[SYCL][NFC] Adjust some detail namespaces
Pennycook Mar 27, 2023
2e15678
[SYCL] Add ballot_group support to algorithms
Pennycook Mar 27, 2023
64e3f9f
SYCL_EXTERNAL => __DPCPP_SYCL_EXTERNAL
Pennycook Mar 27, 2023
2ec7198
Add extra include for vec<>
Pennycook Mar 27, 2023
a25bdd2
Do not mix struct/class in definitions
Pennycook Mar 27, 2023
8dc124a
Fix a few missed renames
Pennycook Mar 27, 2023
af983f4
Ensure Scope and GroupOperation are constexpr
Pennycook Mar 27, 2023
dfe2e50
Merge branch 'sycl' into ballot_group_algorithms
Pennycook Mar 28, 2023
01ecf06
Fix nested detail:: namespace for group_ballot
Pennycook Mar 28, 2023
b2a4a11
Add basic tests for non-uniform groups
Pennycook Mar 29, 2023
68ab5bc
Add tests for ballot_group algorithms
Pennycook Mar 29, 2023
56e05ce
Clarify intent of ballot_group control flow branch
Pennycook Mar 29, 2023
c546762
Initial partially working nvptx ballot_group algs.
JackAKirk Apr 1, 2023
6b65429
Add skeleton for masked cuda reductions.
JackAKirk Apr 17, 2023
d3df184
Working redux impls for float/double/int for cluster_group.
JackAKirk Apr 17, 2023
1cba463
Merge branch 'sycl' into add-cuda-ballot
Apr 24, 2023
10e5f12
Remove cluster_group.cpp test.
Apr 24, 2023
8e8cb06
reduce_over_group for ballot/opportunistic group shfl impl.
Apr 27, 2023
c2fe96c
fix fixed_size_group
May 2, 2023
4bfdd38
Merge branch 'sycl' into add-cuda-ballot
May 2, 2023
8d13656
draft scans fixed_size_group
May 3, 2023
e062982
Finished draft impl for all algorithms fully working
May 5, 2023
2 changes: 1 addition & 1 deletion libclc/ptx-nvidiacl/libspirv/SOURCES
@@ -93,7 +93,7 @@ images/image_helpers.ll
images/image.cl
group/collectives_helpers.ll
group/collectives.cl
-group/group_ballot.cl
+group/group_non_uniform.cl
atomic/atomic_add.cl
atomic/atomic_and.cl
atomic/atomic_cmpxchg.cl
@@ -7,6 +7,7 @@
//===----------------------------------------------------------------------===//

#include "membermask.h"
+#include <integer/popcount.h>

#include <spirv/spirv.h>
#include <spirv/spirv_types.h>
@@ -34,3 +35,9 @@ _Z29__spirv_GroupNonUniformBallotjb(unsigned flag, bool predicate) {

return res;
}

+_CLC_DEF _CLC_CONVERGENT uint
+_Z37__spirv_GroupNonUniformBallotBitCountN5__spv5Scope4FlagEiDv4_j(
+    uint scope, uint flag, __clc_vec4_uint32_t mask) {
+  return __clc_native_popcount(__nvvm_read_ptx_sreg_lanemask_lt() & mask[0]);
+}
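
For readers checking the new function: as written, it ignores `scope` and `flag` and reads only `mask[0]`, so it counts the set bits of the ballot mask that sit below the calling lane. That appears to match an exclusive-scan bit count, i.e. the lane's rank within the group. The following is a host-side sketch in plain C, not code from this PR; `lanemask_lt`, `popcount32`, and `ballot_rank` are hypothetical stand-ins for `%lanemask_lt`, `__clc_native_popcount`, and the new builtin, and a 32-lane warp is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the PTX %lanemask_lt special register:
 * a 32-bit mask with a bit set for every lane strictly below `lane`. */
static uint32_t lanemask_lt(unsigned lane) {
  assert(lane < 32); /* assumption: 32-lane warp */
  return (lane == 0) ? 0u : (0xFFFFFFFFu >> (32u - lane));
}

/* Portable popcount, standing in for __clc_native_popcount. */
static unsigned popcount32(uint32_t x) {
  unsigned n = 0;
  while (x) {
    x &= x - 1; /* clear the lowest set bit */
    ++n;
  }
  return n;
}

/* Emulates the expression in the diff:
 * popcount(lanemask_lt & mask[0]) for a given lane. */
static unsigned ballot_rank(uint32_t mask, unsigned lane) {
  return popcount32(lanemask_lt(lane) & mask);
}
```

With a ballot mask of `0x2C` (lanes 2, 3, and 5 set), `ballot_rank` gives 0 for lane 2, 1 for lane 3, and 2 for lane 5, so each member lane gets a dense index within the group.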