Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make alpaka work with SYCL #785

Merged
merged 9 commits into from
Dec 6, 2024
Merged

Conversation

StewMH
Copy link
Contributor

@StewMH StewMH commented Nov 26, 2024

I added a queue argument for vecmem::sycl::copy. Not in a very clean way, but this all needs cleaned up (as discussed with @CrossR) and hopefully coming soon in an MR from him or me.

With this I can get the alpaka seeding example to work on the CERN GPU Flex 170:

==> Statistics ...
- read    77602 spacepoints
- created  (cpu)  0 seeds
- created (alpaka)  66540 seeds
==>Elapsed times...
            Hit reading  (cpu)  1012 ms
              Seeding (alpaka)  6472 ms
         Track params (alpaka)  1847 ms
                     Wall time  9334 ms

It's very slow compared to the CUDA implementation, but I think this is largely because I had to enable FP64 emulation:

  • Some Intel GPUs do not support the double type for device code. alpaka will not check this.
    You can enable software emulation for double precision types with
export IGC_EnableDPEmulation=1
export OverrideDefaultFP64Settings=1

@StewMH StewMH marked this pull request as ready for review November 26, 2024 16:22
@StewMH StewMH force-pushed the sycl_queue_wrapper branch from 2d36a05 to ee005f9 Compare November 26, 2024 17:17
Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, very happy about the developments. But we'll indeed need to find some good way of passing in the common ::sycl::queue from the outside to all the SYCL objects.

I also wonder where we could pass a queue to Alpaka. Or how we could get the queue that Alpaka uses. Since all of these will have to be synced up. 🤔

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but we should try to find a way of using "factory functions" for instantiating the appropriate type of memory resource and copy objects. To eliminate all appearances of preprocessor statements from the "user code".

But for this PR, as long as you clean up the CMake code a little using a new macro, I'll be happy to let it through. (The change in the main CMakeLists.txt file affects others as well, that's why I'd like to get that fixed.)

Copy link

sonarqubecloud bot commented Dec 4, 2024

Please retry analysis of this Pull-Request directly on SonarQube Cloud

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this setup I can live with. 😄

@StewMH
Copy link
Contributor Author

StewMH commented Dec 5, 2024

Think this is good to go now

@krasznaa krasznaa merged commit c9d6828 into acts-project:main Dec 6, 2024
27 checks passed
@krasznaa
Copy link
Member

krasznaa commented Dec 6, 2024

Sorry, didn't realize that you would not be allowed to merge once I approved it. 😛

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants