
Add compatibility with CUDA 12.8 #919

Open · wants to merge 1 commit into base: main
Conversation

stephenswat (Member)

This commit makes traccc compatible with CUDA 12.8 through three changes:

  1. It removes the standalone translation units containing templated kernels, instead wrapping them in simple C++ wrapper functions which gives them more robust linkage.
  2. It sets the default CUDA architecture to CC7.5, although bugs in the build systems of dependencies are currently overriding this flag.
  3. It adds a note to the troubleshooting section about an incompatibility between traccc and CUDA 12.8 in debug mode.

@stephenswat stephenswat added build This relates to the build system cuda Changes related to CUDA labels Mar 19, 2025
@beomki-yeo (Contributor)

Was CKF the only algorithm incompatible with CUDA 12.8? For example, the fit function is also templated on the detector type.

@krasznaa (Member) left a comment

With the way that things are done now, wouldn't it be easier to:

  • Add one .hpp file per kernel, which lists all the overloaded versions of the functions, without any templating;
  • Add one .cu file per kernel, which has a templated __global__ function, plus the overloaded C++ functions calling that kernel appropriately?

I.e. similar to what I've done in traccc::core and traccc::sycl? I think some of the templating here is not really helping us anymore.

@@ -24,7 +24,7 @@ if( "${CMAKE_CUDA_COMPILER_ID}" MATCHES "NVIDIA" )
endif()

# Set the CUDA architecture to build code for.
-set( CMAKE_CUDA_ARCHITECTURES "52" CACHE STRING
+set( CMAKE_CUDA_ARCHITECTURES "75" CACHE STRING
Member

Yes. Unfortunately the include order in the project makes this indeed ineffective. 😦

In a separate PR we should change the setup a little, moving such global settings to the main CMakeLists.txt file, so that they are guaranteed to precede whatever some of the externals set up.

Note that I've been setting CMAKE_CUDA_ARCHITECTURES in CMakeUserPresets.json recently, to set for instance "native" in my local builds. (Just to give ideas.)
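For reference, the preset approach could look something like this (the preset names here are placeholders; only the CMAKE_CUDA_ARCHITECTURES cache variable is the point of the example):

```json
{
  "version": 3,
  "configurePresets": [
    {
      "name": "cuda-native",
      "displayName": "Build for the locally installed GPU",
      "cacheVariables": {
        "CMAKE_CUDA_ARCHITECTURES": "native"
      }
    }
  ]
}
```

Cache variables set through a preset land in the initial cache, so they take effect before any project or dependency CMake code runs.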

Member Author

I'd say the superior option would really be to have the dependent projects not set any global CMake variables!

@@ -17,12 +17,20 @@
namespace traccc::cuda::kernels {

template <typename detector_t>
-__global__ void apply_interaction(
+__global__ void _apply_interaction(
Member

I'm not super happy about this naming. 🤔

My first idea would've been to give some new name to the C++ functions, and keep the name of the kernels the same.

But another option could be to call the actual kernel something like traccc::cuda::kernels::impl::apply_interaction. 🤔

I'm just not a fan of using functions with underscore prefixes. 😦

Member Author

Sure, we can do this in a little impl namespace. 👍
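A minimal sketch of what that impl-namespace layout could look like. The signatures loosely follow the declarations visible in this PR's diff (finding_config, device::apply_interaction_payload), but default_detector_t is a placeholder and the exact parameter lists are an assumption, not the final code:

```cuda
// apply_interaction.hpp -- plain declarations, no templates.
#include <cuda_runtime.h>  // for dim3

namespace traccc::cuda::kernels {
void apply_interaction(
    const dim3& grid_size, const dim3& block_size,
    const finding_config& cfg,
    device::apply_interaction_payload<default_detector_t> payload);
}  // namespace traccc::cuda::kernels

// apply_interaction.cu -- the template lives in an impl namespace,
// private to this translation unit.
namespace traccc::cuda::kernels {
namespace impl {
template <typename detector_t>
__global__ void apply_interaction(
    const finding_config cfg,
    device::apply_interaction_payload<detector_t> payload) {
    // ... device code ...
}
}  // namespace impl

void apply_interaction(
    const dim3& grid_size, const dim3& block_size,
    const finding_config& cfg,
    device::apply_interaction_payload<default_detector_t> payload) {
    // One overload per supported detector type launches the kernel.
    impl::apply_interaction<default_detector_t>
        <<<grid_size, block_size>>>(cfg, payload);
}
}  // namespace traccc::cuda::kernels
```

This keeps the kernel's name unchanged (just nested under impl::), avoids the underscore prefix, and gives every kernel a plain C++ entry point with stable linkage.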

-        <<<nBlocks, nThreads, 0, stream>>>(
-            m_cfg, {det_view, n_in_params, in_params_buffer,
-                    param_liveness_buffer});
+    kernels::apply_interaction<std::decay_t<detector_type>>(
Member

Would it not be possible to come up with a design in which the called (templated) function could deduce its template parameters? 🤔 Unfortunately none of the provided parameters are detector_type directly. But maybe some formalism could still be found for such a setup.

It would just make the API of the function less error prone.

Member Author

Should be doable after #921.

-__global__ void apply_interaction(
-    const finding_config cfg,
-    device::apply_interaction_payload<detector_t> payload);
+void apply_interaction(const dim3& grid_size, const dim3& block_size,
Member

These are now C++ headers. Let's rename them to .hpp! Or do they still in turn include some .cuh files of their own? 😕

Member Author

Sure, we can do that. They do need dim3 from the CUDA runtime, but we can include that from a .hpp as well.
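Concretely, such a header could look like this sketch (file name and function signature are illustrative; dim3 is defined in the CUDA runtime's vector_types.h, which is pulled in by cuda_runtime.h and is itself valid C++):

```cuda
// apply_interaction.hpp -- a plain C++ header; no device code, no
// templating, so it does not need to be a .cuh file.
#pragma once

#include <cuda_runtime.h>  // provides dim3 via vector_types.h

namespace traccc::cuda::kernels {
// Host-side wrapper; the kernel launch happens in the matching .cu file.
void apply_interaction(const dim3& grid_size, const dim3& block_size /* , ... */);
}  // namespace traccc::cuda::kernels
```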
