team_dot_complex_double fails on clang+cuda due to misalignment #645

Closed
brian-kelley opened this issue Mar 5, 2020 · 0 comments

On kokkos-dev2, in a spot-check build with Clang 8 and CUDA 10.0, team_dot_complex_double fails with cudaErrorMisalignedAddress. Since the test passes in all other CUDA builds (even those where alignof(Kokkos::complex<double>) == 16), it seems likely that this is just a compiler bug.

I can tell the error comes from the KokkosBlas::Experimental::dot device function, because cudaGetLastError() reports success just before the functor launches and reports the misaligned access right after it. This is in impl_test_team_dot() in the test Test_Blas1_team_dot.hpp:

    Kokkos::fence();
    checkCuda();  // returns cudaSuccess
    Kokkos::parallel_for( "KokkosBlas::Test::TeamDot", policy, KOKKOS_LAMBDA ( const team_member &teamMember ) { 
       const int teamId = teamMember.league_rank();
       CHECKALIGN(&d_r(teamId));
       auto lhs = Kokkos::subview(a,Kokkos::make_pair(teamId*team_data_siz,(teamId < M-1)?(teamId+1)*team_data_siz:N));
       auto rhs = Kokkos::subview(b,Kokkos::make_pair(teamId*team_data_siz,(teamId < M-1)?(teamId+1)*team_data_siz:N));
       CHECKALIGN(lhs.data());
       CHECKALIGN(rhs.data());
       d_r(teamId) = KokkosBlas::Experimental::dot(teamMember, lhs, rhs);
    } );
    Kokkos::fence();
    checkCuda(); // returns cudaErrorMisalignedAddress
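
The snippet above does not show how CHECKALIGN is defined. As a rough illustration only, a pointer-alignment check of this kind could look like the host-side sketch below. The names `is_aligned` and `is_aligned_for` are hypothetical and are not taken from the test or from Kokkos; to call something like this inside the kernel it would also need the appropriate device annotation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the test): returns true if `ptr` is a
// multiple of `alignment` bytes. `alignment` must be a power of two.
inline bool is_aligned(const void* ptr, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical typed wrapper: checks the pointer against alignof(T),
// e.g. 16 for a complex type declared with 16-byte alignment.
template <class T>
inline bool is_aligned_for(const T* ptr) {
  return is_aligned(ptr, alignof(T));
}
```

With a check like this, a pointer that is 8-byte aligned but not 16-byte aligned would be flagged for a type whose alignof is 16, which is the condition the CHECKALIGN calls are presumably guarding against.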

If it's not a compiler bug, I have no idea where the actual bug could be. This dot implementation definitely calls TeamDot::team_dot in KokkosBlas1_team_dot_spec.hpp, which takes actual views as input, not raw pointers. I also checked that the X and Y passed to this functor are correctly aligned.

Disabling KOKKOS_ENABLE_COMPLEX_ALIGN does make the test pass.
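
For context, here is a minimal host-side sketch of the difference this option makes, assuming it raises the alignment of the complex type to twice the size of its real type (16 bytes for double). `ComplexPlain`, `ComplexAligned` and `address_mod16` are hypothetical stand-ins, not the Kokkos definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a complex<double> WITHOUT extra alignment:
// it only needs the alignment of its double members.
struct ComplexPlain {
  double re;
  double im;
};

// Hypothetical stand-in WITH the extra alignment (assumed to be
// 2 * sizeof(double), i.e. 16 bytes on common platforms).
struct alignas(2 * sizeof(double)) ComplexAligned {
  double re;
  double im;
};

// Same size either way; only the alignment requirement differs.
static_assert(sizeof(ComplexPlain) == sizeof(ComplexAligned), "same size");
static_assert(alignof(ComplexPlain) == alignof(double), "member alignment");
static_assert(alignof(ComplexAligned) == 2 * sizeof(double), "raised alignment");

// Address of `p` modulo 16, for inspecting where an element actually lives.
inline std::size_t address_mod16(const void* p) {
  return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(p) % 16);
}
```

With the stronger alignment, a compiler is allowed to assume every element sits on a 16-byte boundary and may load both halves in a single wide access. If generated code then touches an address that is only 8-byte aligned, a misaligned-address error is the expected symptom, which would be consistent with the test passing once the option is disabled.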
