[alpaka] Add support for the SYCL back-end #407

Open
wants to merge 13 commits into master

Conversation

AuroraPerego
Contributor

With #1845 and the contributions that followed, alpaka now also supports SYCL/oneAPI as a back-end to target CPUs, Intel GPUs and FPGAs.
This PR propagates the SYCL support to pixeltrack as well.
Some comments:

  • the new back-ends available are --syclcpu and --syclgpu
  • in the Makefile, the libraries and the final target must be linked with the Intel compiler (when available)
  • the develop branch of alpaka is cloned at a newer commit
  • there are some workarounds for bugs and unsupported features in SYCL (all_of_group / any_of_group on CPU, SYCL kernels do not support zero-length arrays, device global variables are not supported in the SYCL back-end yet)
  • a common interface for the math functions has been added (SYCL requires the math functions from the sycl namespace)
  • a trait for the warp size has been added to force it to 32 when needed

In addition, the vendor-specific RNG support in alpaka has been disabled.
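
As a rough illustration of why the existing pixeltrack kernels can be reused on the new back-ends, here is a minimal sketch (the kernel and its arguments are hypothetical, not code from this PR) of an alpaka kernel written against the generic accelerator interface; the same functor can be compiled for the SYCL CPU/GPU back-ends as well as for CUDA, HIP and the host back-ends:

```cpp
#include <alpaka/alpaka.hpp>
#include <cstdint>

// Hypothetical kernel, only meant to show the pattern: all indexing goes through
// the alpaka API, so nothing back-end specific appears in the kernel body.
struct ScaleKernel {
  template <typename TAcc>
  ALPAKA_FN_ACC void operator()(TAcc const& acc, float* data, float factor, uint32_t size) const {
    // global thread index and grid stride, both expressed in threads
    auto const first = alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc)[0u];
    auto const stride = alpaka::getWorkDiv<alpaka::Grid, alpaka::Threads>(acc)[0u];
    for (auto i = first; i < size; i += stride) {
      data[i] *= factor;
    }
  }
};
```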

fwyzard and others added 13 commits August 28, 2023 02:46
Additional sources are necessary to use gdb-oneapi.
The same is true for the other tools (Advisor, Inspector, VTune, ...), but since they are not useful here there is no point in sourcing them as well.
The application is compiled once for the CPU(s) and once for the Intel GPU(s).
The flags for AOT compilation have been added and icpx is used as the default compiler for the SYCL backend.
When the SYCL backend is enabled, the linking step is also performed with icpx.
commit 819974ddc5b2eb4b33e709bd317701793cdb7d15
Author: Jan Stephan <j.stephan@hzdr.de>
Date:   Thu Aug 3 14:20:59 2023 +0200

    Always use std::size_t for CUDA pitch calculations
Changed the order to make the oneAPI compiler happy
For the other backends it is mapped to ALPAKA_STATIC_ACC_MEM_CONSTANT, but device global variables are not supported yet in the SYCL backend.
However, in this case a `constexpr` is enough to obtain the same result.
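
A minimal sketch of the kind of replacement described here (the variable name and value are hypothetical, not the actual ones touched by the PR):

```cpp
#include <cstdint>

// Before: a device-side constant placed with ALPAKA_STATIC_ACC_MEM_CONSTANT,
// which is not available yet in the SYCL back-end:
//   ALPAKA_STATIC_ACC_MEM_CONSTANT uint32_t const maxHitsPerModule = 256u;

// After: since the value is known at compile time anyway, a constexpr gives the
// same result and is accepted by every back-end, including SYCL.
constexpr uint32_t maxHitsPerModule = 256u;
```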
- added the allocator policy: Caching or Synchronous
- added allocCachedBuf: TODO implement pitch in SYCL
- fixed cout because SYCL events in alpaka are not shared pointers
- do not cache SYCL events
- Changed the size of CountersOnly to 1 because SYCL kernels do not support zero-length arrays (see the sketch after this list)
- implemented HostOnlyTask using `alpaka/core/CallbackThread`
- adapt `prefixScan` and `radixSort` to SYCL: work around function pointers not being supported in SYCL kernels
- implemented the work division
- add the `--syclcpu` and `--syclgpu` options for the backends
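
A short sketch of the zero-length-array workaround mentioned in the list above; the struct is a simplified, hypothetical stand-in for the real `CountersOnly` type:

```cpp
#include <cstdint>

// Simplified, hypothetical stand-in for the real histogram container: the
// "counters only" variant carries no payload, but a zero-length array member
// (a GCC/Clang extension) is rejected inside SYCL kernels.
template <int N>
struct HistoContainer {
  uint32_t counters[8];  // always present
  uint16_t bins[N];      // payload: N == 0 would be a zero-length array
};

// Workaround: use 1 instead of 0, wasting one element but staying valid in SYCL kernels.
using CountersOnly = HistoContainer<1>;
```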
Math functions are defined in the `math` namespace and are taken from the `sycl` namespace for the SYCL backend, from the global namespace for the CUDA and HIP backends, and from the `std` namespace in every other case.
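
A minimal sketch of what such a dispatch can look like; the macro names used to detect the back-end are illustrative and may differ from the actual implementation:

```cpp
#include <alpaka/alpaka.hpp>
#include <cmath>
#if defined(ALPAKA_ACC_SYCL_ENABLED)
#include <sycl/sycl.hpp>
#endif

namespace math {

  template <typename T>
  ALPAKA_FN_HOST_ACC inline T sqrt(T x) {
#if defined(ALPAKA_ACC_SYCL_ENABLED)
    return sycl::sqrt(x);  // the SYCL back-end requires the sycl:: math functions
#elif defined(__CUDA_ARCH__) || defined(__HIP_DEVICE_COMPILE__)
    return ::sqrt(x);      // CUDA and HIP device code uses the global-namespace overloads
#else
    return std::sqrt(x);   // every other case falls back to the standard library
#endif
  }

}  // namespace math
```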
Add a trait to set the warp size to 32 for the kernels that require it.
At the moment it is implemented in alpaka only for the SYCL backend; it does nothing for the other backends.
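
A hypothetical sketch of the mechanism (the real trait lives in alpaka and its exact name and signature may differ): a per-kernel trait requests a fixed warp size, which the SYCL back-end honours and the other back-ends ignore:

```cpp
#include <cstdint>

// Hypothetical per-kernel trait: 0 means "no requirement", i.e. let the
// back-end use its native warp (sub-group) size.
template <typename TKernel>
struct RequiredWarpSize {
  static constexpr std::uint32_t value = 0u;
};

// A kernel that uses warp-level primitives (ballot, shuffle, ...) and therefore
// assumes exactly 32 lanes per warp.
struct RadixSortKernel;

// Force a 32-lane warp for this kernel; on the SYCL back-end this translates
// into requesting a sub-group size of 32, elsewhere it is a no-op.
template <>
struct RequiredWarpSize<RadixSortKernel> {
  static constexpr std::uint32_t value = 32u;
};
```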
There is a bug with `any_of_group` / `all_of_group` in the OpenCL runtime that can be worked around by setting the sub-group size equal to the block size.
The bug has been fixed in the latest runtimes, but the application hangs with those.
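
A sketch of the workaround in plain SYCL terms (sizes and the kernel body are illustrative): when the required sub-group size equals the work-group size, each block consists of a single sub-group, so `all_of_group` / `any_of_group` over the sub-group act as a block-wide vote and avoid the problematic code path:

```cpp
#include <sycl/sycl.hpp>

int main() {
  sycl::queue q;
  constexpr int blockSize = 32;            // equal to the required sub-group size below
  constexpr int gridSize = 4 * blockSize;
  constexpr int numBlocks = gridSize / blockSize;

  int* flags = sycl::malloc_shared<int>(gridSize, q);
  int* allSet = sycl::malloc_shared<int>(numBlocks, q);
  for (int i = 0; i < gridSize; ++i) flags[i] = 1;

  q.parallel_for(
       sycl::nd_range<1>{sycl::range<1>{gridSize}, sycl::range<1>{blockSize}},
       [=](sycl::nd_item<1> item) [[sycl::reqd_sub_group_size(32)]] {
         // sub-group size == work-group size: the sub-group spans the whole block
         auto sg = item.get_sub_group();
         bool pred = flags[item.get_global_id(0)] != 0;
         bool all = sycl::all_of_group(sg, pred);  // effectively a block-wide vote
         if (sg.leader())
           allSet[item.get_group(0)] = all ? 1 : 0;
       })
      .wait();

  sycl::free(allSet, q);
  sycl::free(flags, q);
  return 0;
}
```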